From 45d0d7c3d42794f532b530df40949ae05d2ac195 Mon Sep 17 00:00:00 2001
From: Kerollmops
Date: Mon, 6 Jul 2020 17:38:22 +0200
Subject: [PATCH] Clean up the README

---
 README.md | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index 65e299cb7..a12f1bc7f 100644
--- a/README.md
+++ b/README.md
@@ -23,19 +23,10 @@ All of that on a 39$/month machine with 4cores.
 
 ### Index your documents
 
-You first need to split your csv yourself, the engine is currently not able to split it itself.
-The bigger the split size is the faster the engine will index your documents but the higher the RAM usage will be too.
-
-Here we use [the awesome xsv tool](https://github.com/BurntSushi/xsv) to split our big dataset.
+You can feed the engine with your CSV data:
 
 ```bash
-cat my-data.csv | xsv split -s 2000000 my-data-split/
-```
-
-Once your data is ready you can feed the engine with it, it will spawn one thread by CSV part up to one by number of core.
-
-```bash
-./target/release/indexer --db my-data.mmdb ../my-data-split/*
+./target/release/indexer --db my-data.mmdb ../my-data.csv
 ```
 
 ## Querying