MeiliSearch/crates
meili-bors[bot] cac355bfa7
Merge #5124
5124: Optimize Prefixes and Merges r=ManyTheFish a=Kerollmops

In this PR, we optimize reads from LMDB by reading the entries in lexicographic order, which makes better use of the OS cache for the memory-mapped file:

 - Optimize the prefix generation for word position docids (`@manythefish`)
 - Optimize the parallel merging of the caches to sort entries before merging the caches (`@kerollmops`)
 
## Benchmarks on 1cpu 2gb gpo3 (5k IOps)
 
Before, on the tag `meilisearch-v1.12.0-rc.3`:

```
word_position_docids:merge_and_send_docids: 988s
compute_word_fst: 23.3s
word_pair_proximity_docids:merge_and_send_docids: 428s
compute_word_prefix_fid_docids:recompute_modified_prefixes: 76.3s
compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 429s
```

After, on this branch, which sorts the entries of each whole `HashMap` into a `Vec` before merging:

```
word_position_docids:merge_and_send_docids: 202s
compute_word_fst: 20.4s
word_pair_proximity_docids:merge_and_send_docids: 427s
compute_word_prefix_fid_docids:recompute_modified_prefixes: 65.5s
compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 62.5s
```
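
The idea of sorting cache entries before merging can be illustrated with a minimal sketch. This is not the code from this PR: the `Cache` type alias and the `merge_sorted` function are hypothetical names, and plain `Vec<u32>` document ID lists stand in for whatever bitmap type the real caches use. It only shows why draining the `HashMap`s into a sorted `Vec` yields keys in lexicographic order, which is the order a B-tree store such as LMDB can be read and written most cheaply.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a per-thread cache: key bytes -> document IDs.
type Cache = HashMap<Vec<u8>, Vec<u32>>;

/// Drains every cache into a single `Vec`, sorts it by key, and merges the
/// document IDs of identical keys. The result is in lexicographic key order.
fn merge_sorted(caches: Vec<Cache>) -> Vec<(Vec<u8>, Vec<u32>)> {
    // Flatten all the maps into one list of (key, ids) entries.
    let mut entries: Vec<(Vec<u8>, Vec<u32>)> =
        caches.into_iter().flatten().collect();

    // Byte-wise comparison of `Vec<u8>` is lexicographic.
    entries.sort_unstable_by(|a, b| a.0.cmp(&b.0));

    // Equal keys are now adjacent, so one linear pass merges them.
    let mut merged: Vec<(Vec<u8>, Vec<u32>)> = Vec::with_capacity(entries.len());
    for (key, mut ids) in entries {
        if let Some((last_key, last_ids)) = merged.last_mut() {
            if *last_key == key {
                last_ids.append(&mut ids);
                continue;
            }
        }
        merged.push((key, ids));
    }

    // Normalize each ID list: sorted and without duplicates.
    for (_, ids) in merged.iter_mut() {
        ids.sort_unstable();
        ids.dedup();
    }
    merged
}
```

A caller would then iterate over the returned `Vec` and write each entry in order, so consecutive accesses touch neighboring pages instead of jumping across the memory-mapped file in hash order.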

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2024-12-05 09:35:52 +00:00
..
benchmarks remove mimalloc on Windows 2024-12-02 18:13:56 +01:00
build-info Move crates under a sub folder to clean up the code 2024-10-21 08:18:43 +02:00
dump Clean up dependencies 2024-11-27 14:30:34 +01:00
file-store Move crates under a sub folder to clean up the code 2024-10-21 08:18:43 +02:00
filter-parser Move crates under a sub folder to clean up the code 2024-10-21 08:18:43 +02:00
flatten-serde-json Move crates under a sub folder to clean up the code 2024-10-21 08:18:43 +02:00
fuzzers Plug the NoPanicThreadPool in the tests and benchmarks 2024-11-27 17:04:49 +01:00
index-scheduler increase the margin allowed to delete task 2024-12-03 11:07:03 +01:00
json-depth-checker Move crates under a sub folder to clean up the code 2024-10-21 08:18:43 +02:00
meili-snap Add comment explaining why we fixed the version of insta 2024-11-21 16:56:56 +01:00
meilisearch Merge #5094 2024-12-03 08:00:55 +00:00
meilisearch-auth Clean up dependencies 2024-11-27 14:30:34 +01:00
meilisearch-types Stop allocating 1GiB for documents 2024-12-02 16:30:14 +01:00
meilitool Merge branch 'main' into indexer-edition-2024 2024-11-20 16:59:58 +01:00
milli Merge #5124 2024-12-05 09:35:52 +00:00
permissive-json-pointer Add indices field to _matchesPosition to specify where in an array a match comes from (#5005) 2024-11-20 01:00:43 +01:00
tracing-trace Remove orphan span 2024-11-21 12:12:07 +01:00
xtask Make clippy happy 2024-12-04 17:39:10 +01:00