chore: Rename the library "pentium" 🎉

Clément Renault 2018-10-21 16:40:41 +02:00
parent 13cf084751
commit cf41b20fbb
6 changed files with 21 additions and 17 deletions

@@ -1,6 +1,6 @@
[package]
edition = "2018"
name = "raptor"
name = "pentium"
version = "0.1.0"
authors = ["Kerollmops <renault.cle@gmail.com>"]

@@ -1,27 +1,31 @@
# raptor-rs
Raptor, the new RISE
# pentium
A search engine based on the [blog post series](https://blog.algolia.com/inside-the-algolia-engine-part-1-indexing-vs-search/) of the great Algolia company.
This is a library, which means that binaries are not part of this repository,
but since I'm still nice I have made some examples for you in the `examples/` folder.
## Usage
First you need to generate the index files.
Pentium works with an index, like most search engines.
To test the library, you can create one by indexing a simple CSV file.
```bash
$ cargo build --release
$ time ./target/release/raptor-cli index csv --stop-words stop-words.txt the-csv-file.csv
cargo build --release --example csv-indexer
time ./target/release/examples/csv-indexer --stop-words misc/en.stopwords.txt misc/kaggle.csv
```
The `stop-words.txt` file here is a simple file that contains one stop word by line.
The `en.stopwords.txt` file here is a simple file that contains one stop word per line (e.g. `or`, `and`).
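To make the one-word-per-line format concrete, here is a minimal sketch of how such a file could be read into a set. The function name `parse_stop_words` is hypothetical and is not part of the pentium API; trimming whitespace and skipping blank lines are assumptions about how the file should be interpreted.

```rust
use std::collections::HashSet;
use std::io::{self, BufRead};

/// Hypothetical helper (not part of pentium): reads a stop-word list
/// that contains one word per line.
/// Assumption: surrounding whitespace is trimmed and blank lines are skipped.
pub fn parse_stop_words<R: BufRead>(reader: R) -> io::Result<HashSet<String>> {
    let mut words = HashSet::new();
    for line in reader.lines() {
        // Propagate I/O or UTF-8 errors to the caller.
        let line = line?;
        let word = line.trim();
        if !word.is_empty() {
            words.insert(word.to_string());
        }
    }
    Ok(words)
}
```

To read an actual file, the function could be called with `std::io::BufReader::new(std::fs::File::open("misc/en.stopwords.txt")?)`.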
Once the command has finished indexing, you will have three files that make up the index:
- The `xxx.map` represents the fst map.
- The `xxx.idx` represents the doc indexes matching the words in the map.
- The `xxx.sst` is a file that contains all the fields and the values associated with them; it is passed to the internal RocksDB.
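As an illustration of the naming scheme above, the three file names can be derived from the dump name. This is only a sketch: `index_file_paths` is a hypothetical helper, not a function exposed by pentium, and it assumes the files sit next to each other and share the dump name as a prefix.

```rust
use std::path::PathBuf;

/// Hypothetical helper (not part of pentium): given a dump name such as
/// "relaxed-colden", returns the paths of the `.map`, `.idx` and `.sst`
/// files, in that order.
pub fn index_file_paths(dump_name: &str) -> [PathBuf; 3] {
    [
        // The fst map.
        PathBuf::from(format!("{}.map", dump_name)),
        // The doc indexes matching the words in the map.
        PathBuf::from(format!("{}.idx", dump_name)),
        // The fields and values ingested by RocksDB.
        PathBuf::from(format!("{}.sst", dump_name)),
    ]
}
```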
Now you can easily use `raptor server console` or `raptor serve http` with the name of the dump. (e.g. relaxed-colden).
Now you can easily run the `serve-console` or `serve-http` examples with the name of the dump (e.g. relaxed-colden).
```bash
$ cargo build --release --default-features --features serve-console
$ ./target/release/raptor-cli serve console --stop-words stop-words.txt relaxed-colden
cargo build --release --example serve-console
./target/release/examples/serve-console relaxed-colden
```
Note: If you have performance issues, run the searcher in release mode (i.e. `--release`).

@@ -6,7 +6,7 @@ use std::fs::File;
use std::io;
use csv::ReaderBuilder;
use raptor::{MetadataBuilder, DocIndex, Tokenizer, CommonWords};
use pentium::{MetadataBuilder, DocIndex, Tokenizer, CommonWords};
use rocksdb::{SstFileWriter, EnvOptions, ColumnFamilyOptions};
use structopt::StructOpt;

@@ -7,7 +7,7 @@ use std::path::PathBuf;
use serde_json::from_str;
use rocksdb::{SstFileWriter, EnvOptions, ColumnFamilyOptions};
use raptor::{MetadataBuilder, DocIndex, Tokenizer, CommonWords};
use pentium::{MetadataBuilder, DocIndex, Tokenizer, CommonWords};
use structopt::StructOpt;
#[derive(Debug, StructOpt)]

@@ -5,8 +5,8 @@ use std::path::PathBuf;
use elapsed::measure_time;
use rocksdb::{DB, DBOptions, IngestExternalFileOptions};
use raptor::rank::{criterion, Config, RankedStream, Document};
use raptor::{automaton, DocumentId, Metadata, CommonWords};
use pentium::rank::{criterion, Config, RankedStream};
use pentium::{automaton, DocumentId, Metadata};
#[derive(Debug, StructOpt)]
pub struct CommandConsole {

@@ -7,8 +7,8 @@ use std::path::PathBuf;
use std::error::Error;
use std::sync::Arc;
use raptor::rank::{criterion, Config, RankedStream};
use raptor::{automaton, Metadata, CommonWords};
use pentium::rank::{criterion, Config, RankedStream};
use pentium::{automaton, Metadata};
use rocksdb::{DB, DBOptions, IngestExternalFileOptions};
use warp::Filter;
use structopt::StructOpt;