406: return document count from builder r=MarinPostma a=MarinPostma
`DocumentBatchBuilder::finish` now returns the number of documents in the batch. This is more compact than calling `len()` just before calling `finish`.
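
A minimal sketch of the call-site difference, using a toy stand-in rather than the real `DocumentBatchBuilder`. The type `ToyBatchBuilder`, its `add` method, and the plain `usize` return type are assumptions made for illustration; the actual builder writes to an underlying writer and may return a `Result`.

```rust
/// Hypothetical stand-in for a batch builder; not the milli type.
pub struct ToyBatchBuilder {
    docs: Vec<String>,
}

impl ToyBatchBuilder {
    pub fn new() -> Self {
        ToyBatchBuilder { docs: Vec::new() }
    }

    /// Buffers one document (here simply a string).
    pub fn add(&mut self, doc: &str) {
        self.docs.push(doc.to_string());
    }

    /// Number of documents added so far.
    pub fn len(&self) -> usize {
        self.docs.len()
    }

    /// Consumes the builder and reports how many documents the batch holds,
    /// so callers do not need to call `len()` right before finishing.
    pub fn finish(self) -> usize {
        self.docs.len()
    }
}
```

Before this change a caller would have had to write `let count = builder.len(); builder.finish();`. With the count returned, `let count = builder.finish();` is enough.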
Co-authored-by: marin postma <postma.marin@protonmail.com>
402: Optimize document transform r=MarinPostma a=MarinPostma
This PR optimizes the transformation of document additions into the obkv format. Instead of accepting any serializable object, we now handle JSON and CSV specifically:
- For JSON, we build a serde `Visitor` that transforms the JSON straight into obkv without an intermediate representation.
- For CSV, we write the lines directly into the obkv, applying other optimizations as well.
Co-authored-by: marin postma <postma.marin@protonmail.com>
404: remove search crate r=Kerollmops a=MarinPostma
The functionality of the search crate has been moved to the cli crate. This PR removes the remaining files.
Co-authored-by: marin postma <postma.marin@protonmail.com>
394: Added search_geo benchmark in cron job r=irevoire a=fumblehool
fixes: #392
The `search_geo` cron job will run every Friday at 18:30.
Co-authored-by: Damanpreet Singh <daman.4880@gmail.com>
397: Fix typo in repo r=curquiza a=saintmalik
Fix the single typo found in this repo
Co-authored-by: SaintMalik <37118134+saintmalik@users.noreply.github.com>
398: Update version for the next release (v0.18.2) r=irevoire a=curquiza
Breaking because of https://github.com/meilisearch/milli/pull/358
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
390: Add helper methods on the settings r=Kerollmops a=irevoire
This would be a good addition to look at the content of a setting without consuming it.
It’s useful for analytics.
Co-authored-by: Irevoire <tamo@meilisearch.com>
384: Replace memmap with memmap2 r=Kerollmops a=palfrey
[memmap is unmaintained](https://rustsec.org/advisories/RUSTSEC-2020-0077.html) and needs replacing. memmap2 is a drop-in replacement fork that's well maintained. Note that the version numbers got reset on fork, hence the lower values.
Co-authored-by: Tom Parker-Shemilt <palfrey@tevp.net>
388: fix primary key inference r=MarinPostma a=MarinPostma
The primary key was inferred from a hashtable index of the fields. For this reason, the order in which the fields were iterated upon was not deterministic, and the primary key was chosen from the first field containing "id".
This fix sorts the index by field_id when inferring the primary key.
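
The idea behind the fix can be sketched as follows. This is an illustration under assumptions, not the code from the PR: the function name `infer_primary_key`, the `HashMap<String, u16>` shape of the index, and the case-insensitive match on "id" are all hypothetical.

```rust
use std::collections::HashMap;

/// Picks the first field whose name contains "id", visiting fields in
/// ascending field id order so the result does not depend on the
/// HashMap's iteration order.
pub fn infer_primary_key(fields: &HashMap<String, u16>) -> Option<&str> {
    // Copy the (name, field_id) pairs out of the map.
    let mut by_id: Vec<(&str, u16)> = fields
        .iter()
        .map(|(name, id)| (name.as_str(), *id))
        .collect();

    // Sorting by field id is what makes the choice deterministic.
    by_id.sort_by_key(|&(_, id)| id);

    by_id
        .into_iter()
        .find(|(name, _)| name.to_lowercase().contains("id"))
        .map(|(name, _)| name)
}
```

Without the sort, a document with both `id` and `user_id` could yield either field as the primary key from one run to the next, because `HashMap` iteration order is unspecified.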
Co-authored-by: mpostma <postma.marin@protonmail.com>
368: Remove limit of 1000 position per attribute r=irevoire a=ManyTheFish
Instead of using an arbitrary limit, we encode the absolute position in a u32,
using the strong (most significant) u16 for the field id and the weak (least significant) u16 for the relative position in the attribute.
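
A minimal sketch of this packing, assuming the field id sits in the upper 16 bits and the relative position in the lower 16 bits. The function names `pack_position` and `unpack_position` are hypothetical and chosen for illustration only.

```rust
/// Packs a field id and a position relative to the start of that field
/// into a single absolute position. The field id takes the high 16 bits,
/// the relative position the low 16 bits.
pub fn pack_position(field_id: u16, relative: u16) -> u32 {
    ((field_id as u32) << 16) | (relative as u32)
}

/// Splits an absolute position back into (field id, relative position).
pub fn unpack_position(absolute: u32) -> (u16, u16) {
    ((absolute >> 16) as u16, (absolute & 0xffff) as u16)
}
```

With this layout each attribute can hold up to 65536 distinct relative positions, and absolute positions from different fields never collide because the field id occupies its own bits.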
- [x] check database size difference
Below is the database size difference for each dataset:
![Capture d’écran 2021-09-27 à 18 01 44](https://user-images.githubusercontent.com/6482087/134944199-bd25fed0-6c34-475c-9afc-197871e06553.png)
- [ ] check search time on big dataset
Related to [product#202](https://github.com/meilisearch/product/issues/202)
Co-authored-by: many <maxime@meilisearch.com>