199 Commits

Author SHA1 Message Date
ManyTheFish
689e69d6d2 Take into account PR messages 2025-03-10 13:46:33 +01:00
ManyTheFish
8ec0c322ea Apply PR requests related to Refactor the FieldIdMapWithMetadata 2025-03-06 11:42:53 +01:00
ManyTheFish
ae8d453868 Refactor Document indexing process (searchables)
**Changes:**
The searchable database extraction now relies on the AttributePatterns and FieldIdMapWithMetadata to match the fields to extract.
Remove the SearchableExtractor trait to make the code less complex.

**Impact:**
- Document Addition/modification searchable indexing
- Document deletion searchable indexing
2025-03-03 10:32:42 +01:00
ManyTheFish
95bccaf5f5 Refactor Document indexing process (Facets)
**Changes:**
The Documents changes now take a selector closure instead of a list of fields to match the fields to extract.
The seek_leaf_values_in_object function now uses a selector closure instead of a list of fields to match the fields to extract.
The facet database extraction now relies on the FilterableAttributesRule to match the fields to extract.
The facet-search database extraction now relies on the FieldIdMapWithMetadata to select the fields to index.
The facet level database extraction now relies on the FieldIdMapWithMetadata to select the fields to index.

**Important:**
Because the filterable attributes are now patterns,
the fieldIdMap will only register the fields that exist in at least one document.
If a field doesn't exist in any document, it will not be registered even if it has been specified in the filterable fields.

**Impact:**
- Document Addition/modification facet indexing
- Document deletion facet indexing
2025-03-03 10:32:03 +01:00
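
The selector-closure approach and the registration note in the commit above can be sketched as follows. This is a minimal illustration only: `pattern_matches`, `extract_selected`, and `register_fields` are hypothetical names, documents are assumed to be already flattened into (field, value) pairs, and the pattern syntax (exact name or trailing `*`) is an assumption, not the actual milli types or functions.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified pattern: an exact field name, or a prefix
/// wildcard such as "doggo.*". This is not the real AttributePatterns type.
fn pattern_matches(pattern: &str, field: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => field.starts_with(prefix),
        None => pattern == field,
    }
}

/// Walks the flattened (field, value) pairs of one document and calls
/// `on_value` for every field accepted by the `selector` closure.
fn extract_selected<'a>(
    document: &[(&'a str, &'a str)],
    selector: impl Fn(&str) -> bool,
    mut on_value: impl FnMut(&'a str, &'a str),
) {
    for &(field, value) in document {
        if selector(field) {
            on_value(field, value);
        }
    }
}

/// Assigns an id to each field that matches a pattern AND appears in at
/// least one document. A pattern that matches no document field leaves
/// no entry in the returned map.
fn register_fields(
    documents: &[Vec<(&str, &str)>],
    patterns: &[&str],
) -> BTreeMap<String, u16> {
    let mut field_ids = BTreeMap::new();
    for document in documents {
        extract_selected(
            document,
            |field| patterns.iter().any(|p| pattern_matches(p, field)),
            |field, _value| {
                // Ids are handed out in order of first appearance.
                let next_id = field_ids.len() as u16;
                field_ids.entry(field.to_string()).or_insert(next_id);
            },
        );
    }
    field_ids
}
```

With the patterns `doggo.*` and `never_seen`, only fields such as `doggo.name` that occur in some document receive an id; `never_seen` is configured but never registered.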
ManyTheFish
d25953f322
fix clippy 2025-02-26 17:02:43 +01:00
ManyTheFish
9f3663e768
Implement Incremental document database stats computing 2025-02-26 17:01:35 +01:00
Kerollmops
76fd5d92d7
Clarify the tail writing to database 2025-02-20 17:35:23 +01:00
Kerollmops
245a55722a
Remove commented code 2025-02-20 16:48:18 +01:00
Kerollmops
05cc8c650c
Expose the write channel congestion in the batches 2025-02-19 15:47:54 +01:00
meili-bors[bot]
0f1aeb8eaa
Merge #5351
5351: Bring back v1.13.0 changes into main r=irevoire a=Kerollmops

This PR brings back the changes made in v1.13 into the main branch.

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Clémentine <clementine@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2025-02-18 08:05:02 +00:00
Louis Dureuil
b83275c9c5
Change the updated* functions to only_new functions, hopefully better communicating what they do 2025-02-11 15:27:10 +01:00
Louis Dureuil
d7f35ee3ba
Use merged document instead of updated 2025-02-11 15:27:10 +01:00
meili-bors[bot]
0c3e7fe963
Merge #5316
5316: Fix the dumpless upgrade corruption r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5280

## What does this PR do?
- Add a test that ensures we write the version in the index-scheduler even if we hit a bug while writing the VERSION file
- Do what was described in the issue


Co-authored-by: Tamo <tamo@meilisearch.com>
2025-02-10 09:53:57 +00:00
Tamo
45f843ccb9 fmt 2025-02-10 10:46:42 +01:00
Kerollmops
2b0e17ede0
Make sure arroy is using the rayon thread-pool 2025-02-06 15:28:10 +01:00
meili-bors[bot]
796acd1aee
Merge #5288
5288: Improve AI logging r=dureuill a=Kerollmops

This PR fixes #5285 and brings the changes from #5233 to simplify debugging indexation and search performance issues related to AI. The following texts can be found in the logs to debug and understand performance issues:

 - `embed_one: search` represents the time we spent waiting for the embedding generation, i.e., OpenAI, local HuggingFace, Ollama.
 - `filtered_universe: search::universe` is the time spent filtering the documents.
 - ~`next_bucket: search::vector_sort` is the time spent finding the nearest neighbors (ANNs) in the vector store (arroy), locally~ was being triggered too many times.
 - `indexing::vectors` is the time arroy spends indexing the new vectors for a batch.
 - `documents::extract vectors` and `documents::merge vectors` to see the time spent generating and writing the embeddings.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-04 10:20:45 +00:00
Tamo
d34f0b606c
Update crates/milli/src/update/new/document_change.rs 2025-02-03 12:08:52 +01:00
Kerollmops
acc400face
Support merging update and replacement operations 2025-02-03 11:47:17 +01:00
Kerollmops
8e6893ddbe
Make sure we correctly mix different document operations 2025-02-03 10:34:06 +01:00
Kerollmops
424c5bde40
Move the embedding computation and extraction log to debug 2025-01-29 16:40:36 +01:00
Kerollmops
cb1b7513af
Log the memory metrics only once 2025-01-29 15:21:52 +01:00
Clément Renault
a9d0f4a002
Improve english comments
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-01-29 15:16:40 +01:00
Kerollmops
db032079d8
Show indexation allocated memory 2025-01-29 14:21:02 +01:00
Clément Renault
a00796c46a
Improve the naming in the log message 2025-01-29 14:21:02 +01:00
Kerollmops
6112bd8caa
Display the channel congestion 2025-01-29 14:21:02 +01:00
Kerollmops
cec88cfc29
Measure the bbqueue congestion 2025-01-29 14:21:02 +01:00
Kerollmops
4a5923a55e
log the time arroy took to insert embeddings 2025-01-27 14:22:17 +01:00
Clément Renault
9b579069df
Comment the max grant of the bbqueue
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-01-24 12:18:32 +01:00
Louis Dureuil
f5a4a1c8b2
Give more RAM to bbqueue.
- bbqueue buffers used to have (5% * 2%) / num_threads
- they now have 5% / num_threads
2025-01-24 12:18:32 +01:00
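
The arithmetic in the commit above can be made concrete with a small sketch. The commit does not say what the percentages apply to, so `budget_bytes` (an overall memory budget) and both function names are assumptions made for illustration.

```rust
/// Per-thread buffer size before the change: (5% * 2%) of the budget,
/// split across the threads. `budget_bytes` is an assumed overall budget.
fn old_buffer_size(budget_bytes: u64, num_threads: u64) -> u64 {
    budget_bytes * 5 / 100 * 2 / 100 / num_threads.max(1)
}

/// Per-thread buffer size after the change: 5% of the budget,
/// split across the threads.
fn new_buffer_size(budget_bytes: u64, num_threads: u64) -> u64 {
    budget_bytes * 5 / 100 / num_threads.max(1)
}
```

Under these assumptions, dropping the 2% factor makes each buffer 50 times larger for the same budget and thread count.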
Kerollmops
5ab4cdb1f3
Reduce the maximum grant possible we can store in the BBQueue 2025-01-24 12:18:32 +01:00
Louis Dureuil
d6063079af
Unify facet strings by their normalized value 2025-01-22 15:50:42 +01:00
Louis Dureuil
a6470a0c37
Improve error log 2025-01-22 15:50:41 +01:00
Louis Dureuil
8a54f14b8e
Demote panic to error log 2025-01-22 15:49:24 +01:00
Kerollmops
63c8cbae5b
Improve the panic message when deleting an unknown entry 2025-01-14 10:31:44 +01:00
Louis Dureuil
72ded27e98
Update after review 2025-01-14 10:24:50 +01:00
Louis Dureuil
4070895a21
Add support to upgrade to v1.12.3 in meilitool 2025-01-14 10:24:27 +01:00
Louis Dureuil
a21711f473
Fix test 2025-01-14 10:23:59 +01:00
meili-bors[bot]
247eaed872
Merge #5221
5221: Merge bitmaps by using `Extend::extend` r=Kerollmops a=Kerollmops

This PR tries to speed up the merging of bitmaps by using [the new `Extend::extend` implementation](https://github.com/RoaringBitmap/roaring-rs/pull/306).

Co-authored-by: Clément Renault <clement@meilisearch.com>
2025-01-13 13:43:28 +00:00
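
The difference between a union-based merge and an `Extend::extend`-based merge can be sketched with the standard library alone. `BTreeSet<u32>` is used here as a stand-in for a roaring bitmap, and both function names are hypothetical; the sketch shows only the general shape of the two approaches, not the roaring-rs implementation or its performance characteristics.

```rust
use std::collections::BTreeSet;

/// Union-based merge: each step collects the union into a brand new set,
/// which then replaces the accumulator.
fn merge_with_union(sets: &[BTreeSet<u32>]) -> BTreeSet<u32> {
    let mut merged: BTreeSet<u32> = BTreeSet::new();
    for set in sets {
        merged = merged.union(set).copied().collect();
    }
    merged
}

/// Extend-based merge: values are inserted into the single accumulator
/// in place through the `Extend` trait.
fn merge_with_extend(sets: &[BTreeSet<u32>]) -> BTreeSet<u32> {
    let mut merged: BTreeSet<u32> = BTreeSet::new();
    for set in sets {
        merged.extend(set.iter().copied());
    }
    merged
}
```

Both functions return the same set; the second avoids building an intermediate collection on every iteration.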
meili-bors[bot]
cc4aca78c4
Merge #5220
5220: Merge back changes of v1.12.2 in main r=dureuill a=dureuill



Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: dureuill <dureuill@users.noreply.github.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-01-13 10:54:36 +00:00
Clément Renault
00a03742ff
Prefer using extend rather than unions when merging bitmaps (fewer allocations) 2025-01-09 10:42:38 +01:00
Louis Dureuil
4aa7c8f7b1
Remove unused FacetFieldIdOperation 2025-01-09 10:36:37 +01:00
Louis Dureuil
c14967eeac
Use new incremental facet indexing and enable sanity checks in debug 2025-01-09 10:36:35 +01:00
Tamo
908adee6fc
Fix the addition of empty payload 2025-01-09 10:24:36 +01:00
Clément Renault
71e5605daa
Make clippy happy 2025-01-08 18:24:39 +01:00
Louis Dureuil
4275833bab
Rename compute.rs to post_process.rs 2025-01-07 15:31:20 +01:00
Louis Dureuil
de7f8c4406
refactor indexer mod 2025-01-07 15:29:02 +01:00
Gnosnay
44eb153619 Replace hardcoded string with constants 2024-12-28 20:35:55 +08:00
meili-bors[bot]
ba11121cfc
Merge #5159
5159: Fix the New Indexer Spilling r=irevoire a=Kerollmops

Fix two bugs in the merging of the spilled caches. Thanks to `@ManyTheFish` and `@irevoire` 👏

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-12-12 17:16:53 +00:00
ManyTheFish
acdd5aa6ea
Use the thread source id instead of the destination id
when filtering on the cache to merge
2024-12-12 18:12:00 +01:00
Kerollmops
2f3cc8cdd2
Fix the merge_caches_sorted function 2024-12-12 16:15:37 +01:00