2172 Commits

Author SHA1 Message Date
Loïc Lecrenier
b7f2428961 Fix formatting and warning after rebasing from main 2022-10-26 13:49:33 +02:00
Loïc Lecrenier
3b1f908e5e Revert behaviour of facet distribution to what it was before
Where the docid that is used to get the original facet string value
definitely belongs to the candidates
2022-10-26 13:48:01 +02:00
Loïc Lecrenier
14ca8048a8 Add some documentation on how to run the facet db fuzzer 2022-10-26 13:48:01 +02:00
Loïc Lecrenier
206a3e00e5 cargo fmt 2022-10-26 13:48:01 +02:00
Loïc Lecrenier
f198b20c42 Add facet deletion tests that use both the incremental and bulk methods
+ update deletion snapshots to the new database format
2022-10-26 13:47:46 +02:00
Loïc Lecrenier
e3ba1fc883 Make deletion tests for both soft-deletion and hard-deletion 2022-10-26 13:47:46 +02:00
Loïc Lecrenier
ab5e56fd16 Add document deletion snapshot tests and tests for hard-deletion 2022-10-26 13:47:46 +02:00
Loïc Lecrenier
d885de1600 Add option to avoid soft deletion of documents 2022-10-26 13:47:46 +02:00
Loïc Lecrenier
ee1abfd1c1 Ignore files generated by fuzzcheck 2022-10-26 13:47:46 +02:00
Loïc Lecrenier
2295e0e3ce Use real delete function in facet indexing fuzz tests
By deleting multiple docids at once instead of one-by-one
2022-10-26 13:47:46 +02:00
Loïc Lecrenier
acc8caebe6 Add link to GitHub PR to document of update/facet module 2022-10-26 13:47:46 +02:00
Loïc Lecrenier
a034a1e628 Move StrRefCodec and ByteSliceRefCodec to their own files 2022-10-26 13:47:46 +02:00
Loïc Lecrenier
1165ba2171 Make facet deletion incremental 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
0ade699873 Don't crash when failing to decode using StrRef codec 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
d0109627b9 Fix a bug in facet_range_search and add documentation 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
a2270b7432 Change fuzzcheck dependency to point to git repository 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
1ecd3bb822 Fix bug in FieldDocIdFacetCodec 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
51961e1064 Polish some details 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
cb8442a119 Further unify facet databases of f64s and strings 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
3baa34d842 Fix compiler errors/warnings 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
86d9f50b9c Fix bugs in incremental facet indexing with variable parameters
e.g. add one facet value incrementally with a group_size = X and then
add another one with group_size = Y

It is not actually possible to do so with the public API of milli,
but I wanted to make sure the algorithm worked well in those cases
anyway.

The bugs were found by fuzzing the code with fuzzcheck, which I've added
to milli as a conditional dev-dependency. But it can be removed later.
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
de52a9bf75 Improve documentation of some facet-related algorithms 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
985a94adfc cargo fmt 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
b1ab09196c Remove outdated TODOs 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
3d7ed3263f Fix bug in string facet distribution with few candidates 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
fca4577e23 Return original string in facet distributions, work on facet tests 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
27454e9828 Document and refine facet indexing algorithms 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
bee3c23b45 Add comparison benchmark between bulk and incremental facet indexing 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
b2f01ad204 Refactor facet database tests 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
9026867d17 Give same interface to bulk and incremental facet indexing types
+ cargo fmt, oops, sorry for the bad history :(
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
330c9eb1b2 Rename facet codecs and refine FacetsUpdate API 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
485a72306d Refactor facet-related codecs 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
9b55e582cd Add FacetsUpdate type that wraps incremental and bulk indexing methods 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
3d145d7f48 Merge the two <facetttype>_faceted_documents_ids methods into one 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
982efab88f Fix encoding bugs in facet databases 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
079ed4a992 Add more snapshots 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
afdf87f6f7 Fix bugs in asc/desc criterion and facet indexing 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
a7201ece04 cargo fmt 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
36296bbb20 Add facet incremental indexing snapshot tests + fix bug 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
07ff92c663 Add more snapshots from facet tests 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
61252248fb Fix some facet indexing bugs 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
68cbcdf08b Fix compile errors/warnings in http-ui and infos 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
85824ee203 Try to make facet indexing incremental 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
d30c89e345 Fix compile error+warnings in new tests 2022-10-26 13:46:46 +02:00
Loïc Lecrenier
e8a156d682 Reorganise facets database indexing code 2022-10-26 13:46:46 +02:00
Loïc Lecrenier
fb8d23deb3 Reintroduce db_snap! for facet databases 2022-10-26 13:46:14 +02:00
Loïc Lecrenier
e570c23153 Reintroduce asc/desc functionality 2022-10-26 13:46:14 +02:00
Loïc Lecrenier
bd2c0e1ab6 Remove unused code 2022-10-26 13:46:14 +02:00
Loïc Lecrenier
39a4a0a362 Reintroduce filter range search and facet extractors 2022-10-26 13:46:14 +02:00
Loïc Lecrenier
22d80eeaf9 Reintroduce facet deletion functionality 2022-10-26 13:46:14 +02:00