pentium

A search engine based on the blog post series from the great Algolia company.

This is a library, which means that binaries are not part of this repository. But since I'm still nice, I have made some examples for you in the examples/ folder.

Usage

Pentium works with an index, like most search engines. To test the library, you can create one by indexing a simple CSV file.

cargo build --release --example csv-indexer
time ./target/release/examples/csv-indexer --stop-words misc/en.stopwords.txt misc/kaggle.csv

The en.stopwords.txt file here is a simple file that contains one stop word per line (e.g. or, and...).

Once the command has finished indexing, you will have 3 files that make up the index:
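
To make the stop word format concrete, here is a minimal sketch of how such a file could be parsed and applied. It is not the code used by csv-indexer: the function names parse_stop_words and remove_stop_words are hypothetical, and lowercasing plus whitespace splitting are assumptions made for the illustration.

```rust
use std::collections::HashSet;

/// Parses stop words from text that holds one word per line.
/// Blank lines are skipped and words are lowercased (an assumption).
fn parse_stop_words(text: &str) -> HashSet<String> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(str::to_lowercase)
        .collect()
}

/// Splits a text on whitespace and drops every word found in `stop_words`.
fn remove_stop_words<'a>(text: &'a str, stop_words: &HashSet<String>) -> Vec<&'a str> {
    text.split_whitespace()
        .filter(|word| !stop_words.contains(&word.to_lowercase()))
        .collect()
}

fn main() {
    // In practice the text would come from a file such as misc/en.stopwords.txt,
    // for example through std::fs::read_to_string.
    let stop_words = parse_stop_words("or\nand\nthe\n");
    let kept = remove_stop_words("the cat and the dog", &stop_words);
    println!("{:?}", kept); // prints ["cat", "dog"]
}
```

A HashSet keeps the membership test cheap, which matters because every word of every document is checked against it.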

  • The xxx.map file represents the fst map.
  • The xxx.idx file represents the doc indexes matching the words in the map.
  • The xxx.sst file contains all the fields and the values associated with them; it is passed to the internal RocksDB.
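
To give an idea of how the map and the doc indexes fit together, here is a toy in-memory sketch. It is only an illustration under assumptions: a BTreeMap stands in for the fst map, the doc indexes are plain document ids stored in one flat vector, and the map stores a range into that vector. The real file layouts are not described here, and ToyIndex, build and lookup are hypothetical names.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Toy index: `map` stands in for the fst map (.map file) and
/// `doc_indexes` for the flat content of the .idx file (assumed layout).
struct ToyIndex {
    map: BTreeMap<String, Range<usize>>,
    doc_indexes: Vec<u64>,
}

impl ToyIndex {
    /// Builds the index from (document id, text) pairs.
    fn build(docs: &[(u64, &str)]) -> ToyIndex {
        // Group document ids by word; the BTreeMap keeps words sorted.
        let mut postings: BTreeMap<String, Vec<u64>> = BTreeMap::new();
        for (id, text) in docs {
            for word in text.split_whitespace() {
                let ids = postings.entry(word.to_lowercase()).or_default();
                // Skip the push when the word repeats inside one document.
                if ids.last() != Some(id) {
                    ids.push(*id);
                }
            }
        }

        // Flatten the lists into one vector and remember where each one lives.
        let mut map = BTreeMap::new();
        let mut doc_indexes = Vec::new();
        for (word, ids) in postings {
            let start = doc_indexes.len();
            doc_indexes.extend(ids);
            map.insert(word, start..doc_indexes.len());
        }
        ToyIndex { map, doc_indexes }
    }

    /// Returns the ids of the documents that contain `word`.
    fn lookup(&self, word: &str) -> &[u64] {
        match self.map.get(&word.to_lowercase()) {
            Some(range) => &self.doc_indexes[range.clone()],
            None => &[],
        }
    }
}

fn main() {
    let docs: [(u64, &str); 3] = [(0, "red apple"), (1, "green apple"), (2, "red car")];
    let index = ToyIndex::build(&docs);
    println!("{:?}", index.lookup("apple")); // prints [0, 1]
}
```

The point of the split is that the word lookup structure stays small, while the potentially large lists of matching documents sit in a separate contiguous block.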

Now you can run the serve-console or serve-http examples with the name of the dump (e.g. relaxed-colden).

cargo build --release --example serve-console
./target/release/examples/serve-console relaxed-colden