Loïc Lecrenier
ab5e56fd16
Add document deletion snapshot tests and tests for hard-deletion
2022-10-26 13:47:46 +02:00
Loïc Lecrenier
d885de1600
Add option to avoid soft deletion of documents
2022-10-26 13:47:46 +02:00
Loïc Lecrenier
2295e0e3ce
Use real delete function in facet indexing fuzz tests
...
By deleting multiple docids at once instead of one-by-one
2022-10-26 13:47:46 +02:00
Loïc Lecrenier
acc8caebe6
Add link to GitHub PR to documentation of update/facet module
2022-10-26 13:47:46 +02:00
Loïc Lecrenier
a034a1e628
Move StrRefCodec and ByteSliceRefCodec to their own files
2022-10-26 13:47:46 +02:00
Loïc Lecrenier
1165ba2171
Make facet deletion incremental
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
51961e1064
Polish some details
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
cb8442a119
Further unify facet databases of f64s and strings
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
3baa34d842
Fix compiler errors/warnings
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
86d9f50b9c
Fix bugs in incremental facet indexing with variable parameters
...
For example, add one facet value incrementally with group_size = X and
then add another one with group_size = Y.
It is not actually possible to do so with the public API of milli,
but I wanted to make sure the algorithm worked well in those cases
anyway.
The bugs were found by fuzzing the code with fuzzcheck, which I've added
to milli as a conditional dev-dependency; it can be removed later.
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
985a94adfc
cargo fmt
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
b1ab09196c
Remove outdated TODOs
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
27454e9828
Document and refine facet indexing algorithms
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
bee3c23b45
Add comparison benchmark between bulk and incremental facet indexing
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
b2f01ad204
Refactor facet database tests
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
9026867d17
Give same interface to bulk and incremental facet indexing types
...
+ cargo fmt, oops, sorry for the bad history :(
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
330c9eb1b2
Rename facet codecs and refine FacetsUpdate API
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
485a72306d
Refactor facet-related codecs
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
9b55e582cd
Add FacetsUpdate type that wraps incremental and bulk indexing methods
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
3d145d7f48
Merge the two <facettype>_faceted_documents_ids methods into one
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
079ed4a992
Add more snapshots
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
afdf87f6f7
Fix bugs in asc/desc criterion and facet indexing
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
a7201ece04
cargo fmt
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
36296bbb20
Add facet incremental indexing snapshot tests + fix bug
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
07ff92c663
Add more snapshots from facet tests
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
61252248fb
Fix some facet indexing bugs
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
68cbcdf08b
Fix compile errors/warnings in http-ui and infos
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
85824ee203
Try to make facet indexing incremental
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
d30c89e345
Fix compile error+warnings in new tests
2022-10-26 13:46:46 +02:00
Loïc Lecrenier
e8a156d682
Reorganise facets database indexing code
2022-10-26 13:46:46 +02:00
Loïc Lecrenier
bd2c0e1ab6
Remove unused code
2022-10-26 13:46:14 +02:00
Loïc Lecrenier
39a4a0a362
Reintroduce filter range search and facet extractors
2022-10-26 13:46:14 +02:00
Loïc Lecrenier
22d80eeaf9
Reintroduce facet deletion functionality
2022-10-26 13:46:14 +02:00
Loïc Lecrenier
6cc91824c1
Remove unused heed codec files
2022-10-26 13:46:14 +02:00
Loïc Lecrenier
63ef0aba18
Start porting facet distribution and sort to new database structure
2022-10-26 13:46:14 +02:00
Loïc Lecrenier
7913d6365c
Update Facets indexing to be compatible with new database structure
2022-10-26 13:46:14 +02:00
Loïc Lecrenier
c3f49f766d
Prepare refactor of facets database
2022-10-26 13:46:14 +02:00
Loïc Lecrenier
9a569d73d1
Minor code style change
2022-10-24 15:30:43 +02:00
Loïc Lecrenier
d76d0cb1bf
Merge branch 'main' into word-pair-proximity-docids-refactor
2022-10-24 15:23:00 +02:00
Loïc Lecrenier
a983129613
Apply suggestions from code review
2022-10-20 09:49:37 +02:00
Loïc Lecrenier
ab2f6f3aa4
Refine some details in word_prefix_pair_proximity indexing code
2022-10-18 10:37:34 +02:00
Loïc Lecrenier
178d00f93a
Cargo fmt
2022-10-18 10:37:34 +02:00
Loïc Lecrenier
072b576514
Fix proximity value in keys of prefix_word_pair_proximity_docids
2022-10-18 10:37:34 +02:00
Loïc Lecrenier
6c3a5d69e1
Update snapshots
2022-10-18 10:37:34 +02:00
Loïc Lecrenier
a7de4f5b85
Don't add swapped word pairs to the word_pair_proximity_docids db
2022-10-18 10:37:34 +02:00
Loïc Lecrenier
264a04922d
Add prefix_word_pair_proximity database
...
Similar to the word_prefix_pair_proximity one but instead the keys are:
(proximity, prefix, word2)
2022-10-18 10:37:34 +02:00
Loïc Lecrenier
1dbbd8694f
Rename StrStrU8Codec to U8StrStrCodec and reorder its fields
2022-10-18 10:37:34 +02:00
Loïc Lecrenier
bdeb47305e
Change encoding of word_pair_proximity DB to (proximity, word1, word2)
...
Same for word_prefix_pair_proximity
2022-10-18 10:37:34 +02:00
Ewan Higgs
beb987d3d1
Fixing piles of clippy errors.
...
Most of these call clone when the struct supports Copy.
Many use & and &mut on `self` when the calling function already holds
an immutable or mutable borrow, so the extra borrow isn't needed.
I tried to stay away from actual changes or places where I'd have to
name fresh variables.
2022-10-13 22:02:54 +02:00
msvaljek
762e320c35
Add proximity calculation for the same word
2022-10-07 12:59:12 +02:00