Update the indexing timings in the README

Kerollmops 2021-09-13 16:06:45 +02:00
parent a43f99c600
commit 2741aa8589


@@ -32,10 +32,10 @@ cargo run --release -- --db my-database.mdb -vvv --indexing-jobs 8
 ### Index your documents
 
 It can index a massive amount of documents in not much time, I already achieved to index:
-- 115m songs (song and artist name) in ~1h and take 107GB on disk.
-- 12m cities (name, timezone and country ID) in 15min and take 10GB on disk.
+- 115m songs (song and artist name) in \~48min and take 81GiB on disk.
+- 12m cities (name, timezone and country ID) in \~4min and take 6GiB on disk.
 
-All of that on a 39$/month machine with 4cores.
+These metrics are done on a MacBook Pro with the M1 processor.
 
 You can feed the engine with your CSV (comma-seperated, yes) data like this:
 
@@ -43,9 +43,9 @@ You can feed the engine with your CSV (comma-seperated, yes) data like this:
 printf "id,name,age\n1,hello,32\n2,kiki,24\n" | http POST 127.0.0.1:9700/documents content-type:text/csv
 ```
 
-Don't forget to specify the `id` of the documents. Also Note that it also support JSON and
-JSON streaming, you can send them to the engine by using the `content-type:application/json`
-and `content-type:application/x-ndjson` headers respectively.
+Don't forget to specify the `id` of the documents. Also, note that it supports JSON and JSON
+streaming: you can send them to the engine by using the `content-type:application/json` and
+`content-type:application/x-ndjson` headers respectively.
 
 ### Querying the engine via the website
 
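
For reference, a minimal sketch of the NDJSON variant mentioned in the updated paragraph, assuming the engine is still listening on 127.0.0.1:9700 as in the README's CSV example; the document fields are illustrative, only the `content-type:application/x-ndjson` header comes from the diff:

```sh
# Illustrative only: send the same two documents as newline-delimited JSON
# (NDJSON) instead of CSV, assuming the engine runs on 127.0.0.1:9700 as above.
printf '{"id":1,"name":"hello","age":32}\n{"id":2,"name":"kiki","age":24}\n' | \
  http POST 127.0.0.1:9700/documents content-type:application/x-ndjson
```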