MeiliSearch/crates
meili-bors[bot] 81a38099ec
Merge #5336
5336: Meilitool Hair Dryer r=dureuill a=Kerollmops

This pull request introduces a new subcommand to hair dry a specific part of specific indexes. It is useful when [the memory-mapped pages are not hot in the cache](https://arc.net/l/quote/ixhcdwcq) and must be. Hair drying those interesting pages makes the search requests using the vector store much faster.

The previous technique used the "cat method," which consists of reading the whole LMDB data file and piping it into the null device (`/dev/null`). By doing that, the whole LMDB data file becomes hot in the cache. However, when the database is large, at least 30% of it consists of free, unused pages, and many other pages don't need to be hot, e.g., raw JSON documents or uninteresting parts of the inverted index.
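
For illustration, here is a minimal Rust sketch of what the "cat method" amounts to: read the file from start to end and throw the bytes away. The function name `warm_whole_file` is hypothetical and is not part of meilitool; `io::sink()` stands in for the null device.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical helper: reads `path` from start to end and discards the bytes,
/// so that every page of the file passes through the OS page cache.
/// Returns the number of bytes read.
fn warm_whole_file(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    // `io::sink()` discards everything written to it, like /dev/null.
    io::copy(&mut file, &mut io::sink())
}
```

Calling `warm_whole_file(Path::new("data.mdb"))` would touch every page of the file, including the free and uninteresting ones described above.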

This new subcommand reads all the Arroy pages of a given index to make them hot, and only those. More coming...
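
The idea of warming only part of a file can be sketched as follows. This is not the subcommand's code: the description does not say how the Arroy pages are located, so the `(offset, length)` byte ranges and the name `warm_ranges` are hypothetical stand-ins for "the pages of interest."

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Hypothetical helper: reads only the given `(offset, length)` byte ranges
/// of `path` and discards the bytes. Returns the number of bytes read.
fn warm_ranges(path: &Path, ranges: &[(u64, u64)]) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let mut total = 0;
    for &(offset, length) in ranges {
        file.seek(SeekFrom::Start(offset))?;
        // `take` caps the read at `length` bytes; fewer are read at end of file.
        total += io::copy(&mut file.by_ref().take(length), &mut io::sink())?;
    }
    Ok(total)
}
```

Compared with reading the whole file, only the selected ranges are pulled into the cache, so the rest of the file does not evict other useful pages.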

The current algorithm is single-threaded and takes a lot of time. I am in the process of multithreading it. Here is the time it takes to hair dry a 305 GiB database with a single thread.

```
real    21m51.054s
user    0m3.155s
sys     0m19.393s
```

## To Do
- [ ] (optional) Do the reads in parallel.
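
One possible shape for parallel reads, sketched under assumptions: the to-do item does not say how the work would be split, so the chunking by file offset below, and the names `warm_in_parallel` and `read_chunk`, are hypothetical. Each thread opens its own file handle and reads one contiguous chunk.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;
use std::thread;

/// Hypothetical helper: reads `size` bytes of `path` starting at `start`
/// and discards them. Returns the number of bytes read.
fn read_chunk(path: &Path, start: u64, size: u64) -> io::Result<u64> {
    let mut file = File::open(path)?;
    file.seek(SeekFrom::Start(start))?;
    io::copy(&mut file.take(size), &mut io::sink())
}

/// Hypothetical helper: splits the file into at most `threads` contiguous
/// chunks and reads each chunk on its own scoped thread.
/// Returns the total number of bytes read.
fn warm_in_parallel(path: &Path, threads: u64) -> io::Result<u64> {
    let len = std::fs::metadata(path)?.len();
    let threads = threads.max(1);
    // Ceiling division, so that the chunks cover the whole file.
    let chunk = ((len + threads - 1) / threads).max(1);

    let results: Vec<io::Result<u64>> = thread::scope(|scope| {
        let mut handles = Vec::new();
        let mut start = 0;
        while start < len {
            let size = chunk.min(len - start);
            handles.push(scope.spawn(move || read_chunk(path, start, size)));
            start += size;
        }
        handles
            .into_iter()
            .map(|handle| handle.join().expect("reader thread panicked"))
            .collect::<Vec<_>>()
    });
    // Sums the byte counts, or returns the first I/O error encountered.
    results.into_iter().sum()
}
```

Whether this helps in practice depends on the storage device; the timing above shows very little CPU time, which suggests the single-threaded run is mostly waiting on I/O.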

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-12 10:45:16 +00:00
| Name | Last commit | Date |
| --- | --- | --- |
| benchmarks | fix the bad index version on opening | 2025-01-23 16:51:24 +01:00 |
| build-info | Upgrade compatible dependencies | 2025-01-08 13:52:14 +01:00 |
| dump | use serde_json::to_writer instead of serializing + writing | 2025-02-11 11:14:49 +01:00 |
| file-store | Upgrade incompatible dependencies | 2025-01-08 15:58:03 +01:00 |
| filter-parser | Fix insta to 1.39 | 2025-01-08 15:18:08 +01:00 |
| flatten-serde-json | Move crates under a sub folder to clean up the code | 2024-10-21 08:18:43 +02:00 |
| fuzzers | fix the bad index version on opening | 2025-01-23 16:51:24 +01:00 |
| index-scheduler | rename the atomic to something more meaningful | 2025-02-11 11:14:49 +01:00 |
| json-depth-checker | Move crates under a sub folder to clean up the code | 2024-10-21 08:18:43 +02:00 |
| meili-snap | Upgrade compatible dependencies | 2025-01-08 13:52:14 +01:00 |
| meilisearch | Merge #5332 | 2025-02-11 18:51:33 +00:00 |
| meilisearch-auth | Upgrade incompatible dependencies | 2025-01-08 15:58:03 +01:00 |
| meilisearch-types | fix the missing batch in the dumps in meilisearch and meilitools | 2025-02-11 11:14:49 +01:00 |
| meilitool | Remove unsafes | 2025-02-12 10:46:45 +01:00 |
| milli | Change the updated* functions to only_new functions, hopefully better communicating what they do | 2025-02-11 15:27:10 +01:00 |
| permissive-json-pointer | Add indices field to _matchesPosition to specify where in an array a match comes from (#5005) | 2024-11-20 01:00:43 +01:00 |
| tracing-trace | Upgrade compatible dependencies | 2025-01-08 13:52:14 +01:00 |
| xtask | Fix after upgrading sysinfo | 2025-01-08 15:59:30 +01:00 |