MeiliDB
A full-text search database using a key-value store internally.
It uses RocksDB like a classic database, to store documents and internal data. The key-value store allows us to handle updates and queries with small memory and CPU overheads.
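To give a rough idea of how a full-text index can sit on top of a key-value store, here is a minimal sketch. It is not MeiliDB's actual layout: a `BTreeMap` stands in for RocksDB, and the `KvStore` type, the `doc-` and `word-` key prefixes and the postings encoding are all hypothetical names chosen for illustration.

```rust
use std::collections::BTreeMap;

/// Stand-in for a key-value store such as RocksDB: byte keys mapped to byte values.
/// Everything here is an assumption for illustration, not MeiliDB's real schema.
struct KvStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

/// Hypothetical key under which a document body is stored.
fn doc_key(id: u32) -> Vec<u8> {
    let mut key = b"doc-".to_vec();
    key.extend_from_slice(&id.to_be_bytes());
    key
}

/// Hypothetical key under which the postings list of a word is stored.
fn word_key(word: &str) -> Vec<u8> {
    let mut key = b"word-".to_vec();
    key.extend_from_slice(word.as_bytes());
    key
}

impl KvStore {
    fn new() -> Self {
        KvStore { map: BTreeMap::new() }
    }

    /// Store the document, then append its id to the postings of each word.
    fn index_document(&mut self, id: u32, text: &str) {
        self.map.insert(doc_key(id), text.as_bytes().to_vec());
        let encoded = id.to_le_bytes();
        for word in text.split_whitespace() {
            let postings = self.map.entry(word_key(&word.to_lowercase())).or_default();
            // Postings are a flat list of little-endian u32 ids, one per document.
            if !postings.chunks_exact(4).any(|chunk| chunk == &encoded[..]) {
                postings.extend_from_slice(&encoded);
            }
        }
    }

    /// Return the ids of the documents containing the given word.
    fn search(&self, word: &str) -> Vec<u32> {
        match self.map.get(&word_key(&word.to_lowercase())) {
            Some(bytes) => bytes
                .chunks_exact(4)
                .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
                .collect(),
            None => Vec::new(),
        }
    }

    /// Fetch a stored document by id.
    fn document(&self, id: u32) -> Option<String> {
        self.map
            .get(&doc_key(id))
            .map(|bytes| String::from_utf8_lossy(bytes).into_owned())
    }
}

fn main() {
    let mut store = KvStore::new();
    store.index_document(1, "red running shoes");
    store.index_document(2, "blue running jacket");
    for id in store.search("running") {
        println!("{} -> {:?}", id, store.document(id));
    }
}
```

Both documents and index data live in the same store and are told apart only by their key prefix, which is the general idea behind keeping "documents and internal data" in one key-value database.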
You can read the deep dive if you want more information on the engine; it describes the whole process of generating updates and handling queries.
We will be proud if you send pull requests to help us grow this project; you can start with issues tagged "good-first-issue".
At the moment this is a library only, which means that binaries are not part of this repository. But since I'm still nice, I have made some examples for you in the examples/ folder that work with the data located in the misc/ folder.
In the near future MeiliDB will be a binary like any database: updated and queried using some kind of protocol. That is the final goal, see the milestones. MeiliDB will just be a bunch of network and protocol functions wrapping the library, which itself will be published to https://crates.io, following the same update cycle.
Performance
These figures were measured with a version dated October 2018; we must update them.
We ran some tests on remote machines with a dataset of nearly 280k products, on a server that costs $5/month with 1 vCPU and 1 GB of RAM. On the same index and with a simple query, we found that we can handle:
- nearly 190 users with an average response time of 90ms
- 150 users with an average response time of 70ms
- 100 users with an average response time of 45ms
Network time is included in these measurements; servers are located in Amsterdam and tests are made between two different datacenters.
Notes
The default Rust allocator has recently been changed to use the system allocator. We have seen much better performance when using jemalloc as the global allocator.
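Rust selects its global allocator through the `#[global_allocator]` attribute on a static. The sketch below shows the mechanism using only the standard library: `CountingAlloc` is a hypothetical wrapper that forwards to the system allocator and counts calls. To use jemalloc instead, and assuming the `jemallocator` crate is added as a dependency, the static would be declared as `static GLOBAL: jemallocator::Jemalloc = jemallocator::Jemalloc;`.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator for illustration: forwards to the system allocator
/// and counts how many allocations were requested.
struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the program now goes through `GLOBAL`.
// With the `jemallocator` crate (an assumption, not part of this sketch),
// this line would name `jemallocator::Jemalloc` instead.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Number of allocations requested so far.
fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}

fn main() {
    let before = allocation_count();
    let buffer = vec![0u8; 4096];
    // Keep the buffer observable so the allocation is not optimized away.
    std::hint::black_box(&buffer);
    println!("allocations during the call: {}", allocation_count() - before);
}
```

The allocator is chosen by the final binary, so a library like this one cannot force it; each example or application has to declare the static itself.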
Usage and examples
MeiliDB works with an index like most search engines. So to test the library you can create one by indexing a simple CSV file.
cargo run --release --example create-database -- test.mdb misc/kaggle.csv
Once the command has finished indexing, the database should have been saved under the test.mdb folder.
Now you can easily run the query-database example to check what is stored in it.
cargo run --release --example query-database -- test.mdb