bors[bot]
509a56a43d
Merge #158
...
158: Implements the dumps r=irevoire a=irevoire
closes #20
divergence from legacy meilisearch:
- dump v2 added, with support for loading pending updates (only works with dumps created from v2)
- added time stamps to the dump info
- Dump info is only persisted in an internal data structure and is no longer fetched from the file system on demand, which was a potential security flaw. As a consequence, dump info is cleared on every restart.
Co-authored-by: tamo <tamo@meilisearch.com>
Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2021-06-02 12:06:47 +00:00
Clémentine Urquizar
ef1ac8a0cb
Update README
2021-06-02 11:13:22 +02:00
Clémentine Urquizar
edfcdb171c
Update benchmarks/scripts/list.sh
...
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-06-02 11:13:22 +02:00
Clémentine Urquizar
3c91a9a551
Update following reviews
2021-06-02 11:13:22 +02:00
Tamo
bc4f4ee829
remove s3cmd as a dependency and provide a script to list all the available benchmarks
2021-06-02 11:13:22 +02:00
Clémentine Urquizar
61fe422a88
Update benchmarks/scripts/compare.sh
...
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-06-02 11:13:22 +02:00
Clémentine Urquizar
57ed96622b
Update benchmarks/scripts/compare.sh
...
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-06-02 11:13:22 +02:00
Clémentine Urquizar
b3c0d43890
Update benchmarks/scripts/compare.sh
...
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-06-02 11:13:22 +02:00
Clémentine Urquizar
0d0e900158
Add CI for benchmarks
2021-06-02 11:13:22 +02:00
tamo
4536dfccd0
add a way to provide primary_key or autogenerate document ids
2021-06-02 11:13:20 +02:00
tamo
06c414a753
move the benchmarks to another crate so we can download the datasets automatically without adding overhead to the build of milli
2021-06-02 11:11:50 +02:00
tamo
3c84075d2d
uses an env variable to find the datasets
2021-06-02 11:05:07 +02:00
tamo
4969abeaab
update the facets for the benchmarks
2021-06-02 11:05:07 +02:00
tamo
e5dfde88fd
fix the facets conditions
2021-06-02 11:05:07 +02:00
tamo
7c7fba4e57
remove the time limitation to let criterion do what it wants
2021-06-02 11:05:07 +02:00
tamo
5d5d115608
reformat all the files
2021-06-02 11:05:07 +02:00
tamo
7086009f93
improve the base search
2021-06-02 11:05:07 +02:00
tamo
d0b44c380f
add benchmarks on a wiki dataset
2021-06-02 11:05:07 +02:00
tamo
beae843766
add a missing space
2021-06-02 11:05:07 +02:00
tamo
5132a106a1
refactor everything related to the songs dataset into a songs benchmark file
2021-06-02 11:05:07 +02:00
tamo
136efd6b53
fix the benches
2021-06-02 11:05:07 +02:00
tamo
4b78ef31b6
add the configuration of the searchable fields and displayed fields and a default configuration for the songs
2021-06-02 11:05:07 +02:00
tamo
ea0c6d8c40
add a bunch of queries and start the introduction of the filters and the new dataset
2021-06-02 11:05:07 +02:00
tamo
3def42abd8
merge all the criterion only benchmarks in one file
2021-06-02 11:05:07 +02:00
tamo
a2bff68c1a
remove the optional words for the typo criterion
2021-06-02 11:05:07 +02:00
tamo
aee49bb3cd
add the proximity criterion
2021-06-02 11:05:07 +02:00
tamo
49e4cc3daf
add the words criterion to the bench
2021-06-02 11:05:07 +02:00
tamo
15cce89a45
update the README with instructions to download the dataset
2021-06-02 11:05:07 +02:00
tamo
e425f70ef9
let criterion decide how many iterations it wants to do in 10s
2021-06-02 11:05:07 +02:00
tamo
4fdbfd6048
push a first version of the benchmark for the typo
2021-06-02 11:05:07 +02:00
Tamo
2d7785ae0c
remove the dump_batch_size option from the CLI
2021-06-01 20:42:06 +02:00
Tamo
d0552e765e
forbid deserialization of Setting<Checked>
2021-06-01 20:41:45 +02:00
bors[bot]
270da98c46
Merge #202
...
202: Add field id word count docids database r=Kerollmops a=LegendreM
This PR introduces a new database, `field_id_word_count_docids`, that maps the number of words in an attribute to a list of document ids. This relation is limited to attributes that contain fewer than 11 words.
This database is used by the exactness criterion to know if a document has an attribute that contains exactly the query without any additional word.
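As a rough illustration of the mapping described above, here is a minimal in-memory sketch. It is not the code from this PR: the struct and method names, the `u8` field id type, whitespace-based word counting, and the use of `BTreeMap`/`BTreeSet` in place of the on-disk database and its document id lists are all assumptions made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical type aliases for this sketch only.
type FieldId = u8;
type DocumentId = u32;

/// Only attributes with fewer than 11 words (at most 10) are recorded.
const MAX_WORD_COUNT: usize = 10;

/// Hypothetical in-memory stand-in for the `field_id_word_count_docids` database:
/// (field id, word count) -> set of document ids.
#[derive(Default)]
struct FieldIdWordCountDocids {
    map: BTreeMap<(FieldId, u8), BTreeSet<DocumentId>>,
}

impl FieldIdWordCountDocids {
    /// Records that `docid` has `text` as the value of attribute `field_id`.
    /// Words are counted by splitting on whitespace (an assumption of this sketch).
    fn insert(&mut self, field_id: FieldId, docid: DocumentId, text: &str) {
        let count = text.split_whitespace().count();
        if count <= MAX_WORD_COUNT {
            self.map
                .entry((field_id, count as u8))
                .or_default()
                .insert(docid);
        }
    }

    /// Returns the documents whose attribute `field_id` contains exactly
    /// `word_count` words, or an empty set if nothing was recorded.
    fn docids(&self, field_id: FieldId, word_count: usize) -> BTreeSet<DocumentId> {
        if word_count > MAX_WORD_COUNT {
            return BTreeSet::new();
        }
        self.map
            .get(&(field_id, word_count as u8))
            .cloned()
            .unwrap_or_default()
    }
}
```

With a structure like this, an exactness check for a two-word query could look up `docids(field_id, 2)` and intersect the result with the documents matching both words, which keeps only attributes that contain the query and nothing else.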
Fix #165
Fix #196
Related to [specifications:#36](https://github.com/meilisearch/specifications/pull/36)
Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2021-06-01 16:09:48 +00:00
many
e857ca4d7d
Fix PR comments
2021-06-01 18:06:46 +02:00
Many
ab2cf69e8d
Update milli/src/update/delete_documents.rs
...
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-01 17:04:10 +02:00
Many
8e6d1ff0dc
Update milli/src/update/index_documents/store.rs
...
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-01 17:04:02 +02:00
bors[bot]
168fe0aa28
Merge #206
...
206: Fix http-ui r=Kerollmops a=irevoire
I just noticed that `http-ui` was not compiling on `main`.
I'm not sure this is the best fix, but it works 👀
Co-authored-by: Tamo <irevoire@hotmail.fr>
2021-06-01 14:31:32 +00:00
Tamo
608c5bad24
fix http-ui
2021-06-01 16:24:46 +02:00
bors[bot]
7d36d664a7
Merge #203
...
203: Make the MatchingWords return the number of matching bytes r=Kerollmops a=LegendreM
Make the MatchingWords return the number of matching bytes using a custom Levenshtein algorithm.
Fix #138
Co-authored-by: many <maxime@meilisearch.com>
2021-06-01 12:00:33 +00:00
many
225ae6fd25
Resolve PR comments
2021-06-01 11:53:09 +02:00
bors[bot]
3a7c1f2469
Merge #191
...
191: dumps v2 r=irevoire a=MarinPostma
Co-authored-by: Marin Postma <postma.marin@protonmail.com>
Co-authored-by: marin <postma.marin@protonmail.com>
2021-06-01 09:46:31 +00:00
marin
df6ba0e824
Apply suggestions from code review
...
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-06-01 11:18:37 +02:00
bors[bot]
2f9f6a1f21
Merge #169
...
169: Optimize roaring codec r=Kerollmops a=MarinPostma
Optimize the `BoRoaringBitmapCodec` by preventing it from emitting useless errors that caused allocations. On my flamegraph, the byte_decode function went from 4.13% to 1.70% (of the transplant graph).
This may not be the greatest optimization ever, but hey, this was low-hanging fruit.
before:
![image](https://user-images.githubusercontent.com/28804882/116241125-17018880-a754-11eb-9f9d-a67418d100e1.png)
after:
![image](https://user-images.githubusercontent.com/28804882/116241167-21bc1d80-a754-11eb-9afc-d9d72727477c.png)
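The description above does not show the codec itself, so the following is only a sketch of the general idea, under stated assumptions: the bytes are taken to be a sequence of native-endian `u32` values, and the hypothetical functions `decode_u32_ne` and `encode_u32_ne` use only the standard library. Failure is reported with `Option`, so no error value has to be built or allocated on the decoding path.

```rust
use std::mem::size_of;

/// Hypothetical decoder: reads a byte slice as native-endian `u32` values.
/// Returns `None` when the length is not a multiple of 4, instead of
/// constructing an error value.
fn decode_u32_ne(bytes: &[u8]) -> Option<Vec<u32>> {
    if bytes.len() % size_of::<u32>() != 0 {
        return None;
    }
    let mut values = Vec::with_capacity(bytes.len() / size_of::<u32>());
    // `chunks_exact` yields slices of exactly 4 bytes, so indexing is in bounds.
    for chunk in bytes.chunks_exact(size_of::<u32>()) {
        values.push(u32::from_ne_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]));
    }
    Some(values)
}

/// Hypothetical encoder matching `decode_u32_ne`.
fn encode_u32_ne(values: &[u32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(values.len() * size_of::<u32>());
    for value in values {
        bytes.extend_from_slice(&value.to_ne_bytes());
    }
    bytes
}
```

In a real codec the decoded values would presumably be pushed into a bitmap rather than a `Vec`; the `Vec` here only keeps the example free of external crates.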
Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2021-06-01 06:30:25 +00:00
Marin Postma
984dc7c1ed
rewrite roaring codec without byteorder.
2021-05-31 22:15:39 +02:00
Marin Postma
1373637da1
optimize roaring codec
2021-05-31 22:15:35 +02:00
Marin Postma
6609f9e3be
review edits
2021-05-31 18:41:37 +02:00
many
1df68d342a
Make the MatchingWords return the number of matching bytes
2021-05-31 18:22:29 +02:00
many
b8e6db0feb
Add database in infos crate
2021-05-31 16:29:27 +02:00
many
c701f8bf36
Use field id word count database in exactness criterion
2021-05-31 16:27:28 +02:00
many
4ddf008be2
add field id word count database
2021-05-31 16:27:28 +02:00