386 Commits

Author SHA1 Message Date
ManyTheFish
6d52c6e711 Merge branch 'main' into granular-filterable-attributes 2025-03-11 10:05:58 +01:00
ManyTheFish
dfb8411647 Revert "Remove filter pre-check"
This reverts commit b12ffd13569e1c90f7ae1b3a45211eec4594b0e2.
2025-03-11 09:48:30 +01:00
ManyTheFish
40c5f911fd Revert metadata creation when computing the facet-distribution 2025-03-10 17:05:41 +01:00
ManyTheFish
abef655849 Revert metadata creation when computing facet search and distinct 2025-03-10 15:45:59 +01:00
ManyTheFish
b12ffd1356 Remove filter pre-check 2025-03-10 14:29:45 +01:00
ManyTheFish
c9a4c6ed96 REvert metadata creation when computing filters at search time 2025-03-10 14:29:44 +01:00
ManyTheFish
689e69d6d2 Take into account PR messages 2025-03-10 13:46:33 +01:00
ManyTheFish
ed1dcbe0f7 Fix behavior change in the Attributes criterion 2025-03-06 14:18:25 +01:00
ManyTheFish
5ceddbda84 Add the max_weight of the weight map if it's lacking 2025-03-06 13:58:28 +01:00
ManyTheFish
ca41ce3bbd Old indexer document addition now check if facet search is globally activated 2025-03-06 11:43:42 +01:00
ManyTheFish
8ec0c322ea Apply PR requests related to Refactor the FieldIdMapWithMetadata 2025-03-06 11:42:53 +01:00
ManyTheFish
b88aa9cc76 Rely on FieldIdMapWithMetadata in facet search and filters 2025-03-05 18:22:12 +01:00
meili-bors[bot]
3fd86e8d76
Merge #5371
5371: Composite embedders r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #5343 

## What does this PR do?
- Implement [public usage](https://www.notion.so/meilisearch/Composite-embedder-usage-14a4b06b651f81859dc3df21e8cd02a0)
- Refactor the way we check if a parameter is mandatory/allowed/disallowed for a given source
- Take the "nesting context" into account for computer if a parameter is mandatory/allowed/disallowed
- Add tests checking all parameters with all sources, and made sure the results didn't change compared with v1.13

## Dumpless Upgrade

- This adds a new value for an existing parameter => compatible without change
- This adds new optional parameters => compatible without change

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-03-05 17:18:11 +00:00
ManyTheFish
67f7470c83 Apply PR requests related to Refactor search and facet-search 2025-03-05 18:17:42 +01:00
Louis Dureuil
4fab72cbea
Rename SettingsDiff::diff to SettingsDiff::apply_and_diff 2025-03-05 18:16:57 +01:00
Louis Dureuil
afb4b9677f
Remove Embedder:embed 2025-03-05 18:16:57 +01:00
Louis Dureuil
73d2dbd60f
Error handling 2025-03-05 18:16:57 +01:00
ManyTheFish
63e753bde0 Apply PR requests related to settings API 2025-03-05 12:05:40 +01:00
ManyTheFish
5fa4b5c50a Add a test on filterable attributes rules priority
**Changes:**
- Add a new test playing with filterable attributes rules priority
- Optimize the faceted field selector avoiding to match false positives
2025-03-05 09:44:52 +01:00
ManyTheFish
a7a62e5e4c Add some documentation in modules 2025-03-05 08:49:18 +01:00
meili-bors[bot]
683a2ac685
Merge #5379
5379: Bring back the changes from v1.13.2 into main r=dureuill a=Kerollmops



Co-authored-by: Kerollmops <Kerollmops@users.noreply.github.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-03-04 13:24:25 +00:00
ManyTheFish
23e07f1a93 Attribute positions changed in snapshots
**Reason:**
Only the existing field are registered in the fieldid_map
2025-03-03 10:33:39 +01:00
ManyTheFish
6dbec91d2b Index document in filterable attributes tests
**Reason:**
Because the filterable attributes are patterns now,
the fieldIdMap will only register the fields that exists in at least one document.
if a field doesn't exist in any document, it will not be registered even if it has been specified in the filterable fields.
2025-03-03 10:33:39 +01:00
ManyTheFish
9a75dc6ab3 Update tests using filterable attributes rules
**Changes:**
Replace the BTreeSet<String> by Vec<FilterableAttributesRule> without changing the test results

**Impact:**
- None
2025-03-03 10:33:34 +01:00
ManyTheFish
ae8d453868 Refactor Document indexing process (searchables)
**Changes:**
The searchable database extraction is now relying on the AttributePatterns and FieldIdMapWithMetadata to match the field to extract.
Remove the SearchableExtractor trait to make the code less complex.

**Impact:**
- Document Addition/modification searchable indexing
- Document deletion searchable indexing
2025-03-03 10:32:42 +01:00
ManyTheFish
95bccaf5f5 Refactor Document indexing process (Facets)
**Changes:**
The Documents changes now take a selector closure instead of a list of field to match the field to extract.
The seek_leaf_values_in_object function now uses a selector closure of a list of field to match the field to extract
The facet database extraction is now relying on the FilterableAttributesRule to match the field to extract.
The facet-search database extraction is now relying on the FieldIdMapWithMetadata to select the field to index.
The facet level database extraction is now relying on the FieldIdMapWithMetadata to select the field to index.

**Important:**
Because the filterable attributes are patterns now,
the fieldIdMap will only register the fields that exists in at least one document.
if a field doesn't exist in any document, it will not be registered even if it has been specified in the filterable fields.

**Impact:**
- Document Addition/modification facet indexing
- Document deletion facet indexing
2025-03-03 10:32:03 +01:00
ManyTheFish
659855c88e Refactor Settings Indexing process
**Changes:**
The transform structure is now relying on FieldIdMapWithMetadata and AttributePatterns to prepare
the obkv documents during a settings reindexing.
The InnerIndexSettingsDiff and InnerIndexSettings structs are now relying on FieldIdMapWithMetadata, FilterableAttributesRule and AttributePatterns to define the field and the databases that should be reindexed.
The faceted_fields_ids, localized_searchable_fields_ids and localized_faceted_fields_ids have been removed in favor of the FieldIdMapWithMetadata.
We are now relying on the FieldIdMapWithMetadata to retain vectors_fids from the facets and the searchables.

The searchable database computing is now relying on the FieldIdMapWithMetadata to know if a field is searchable and retrieve the locales.

The facet database computing is now relying on the FieldIdMapWithMetadata to compute the facet databases, the facet-search and retrieve the locales.

The facet level database computing is now relying on the FieldIdMapWithMetadata and the facet level database are cleared depending on the settings differences (clear_facet_levels_based_on_settings_diff).

The vector point extraction uses the FieldIdMapWithMetadata instead of FieldsIdsMapWithMetadata.

**Impact:**
- Dump import
- Settings update
2025-03-03 10:32:02 +01:00
ManyTheFish
286d310287 Fix inconsistency in attribute ranking rule computation
**Changes:**
The building of the Attributes ranking rule graph was comparing fieldids with weights
which doesn't make sense and may be bug prone, we are now comparing fieldids with fieldids.

**Impact:**
- search: Attribute ranking rule
2025-03-03 10:29:34 +01:00
ManyTheFish
4f7ece2411 Refactor the FieldIdMapWithMetadata
**Changes:**
The FieldIdMapWithMetadata structure now stores more information about fields.
The metadata_for_field function computes all the needed information relying on the user provided data instead of the enriched data (searchable/sortable)
which may solve an indexing bug on sortable attributes that was not matching the nested fields.

The FieldIdMapWithMetadata structure was duplicated in the embeddings as FieldsIdsMapWithMetadata,
so the FieldsIdsMapWithMetadata has been removed in favor of FieldIdMapWithMetadata.

The Facet distribution is now relying on the FieldIdMapWithMetadata with metadata to match is a field can be faceted.

**Impact:**
- searchable attributes matching
- searchable attributes weight computation
- sortable attributes matching
- faceted fields matching
- prompt computing
- facet distribution
2025-03-03 10:29:33 +01:00
ManyTheFish
967033579d Refactor search and facet-search
**Changes:**
The search filters are now using the FilterableAttributesFeatures from the FilterableAttributesRules to know if a field is filterable.
Moreover, the FilterableAttributesFeatures is more precise and an error will be returned if an operator is used on a field that doesn't have the related feature.
The facet-search is now checking if the feature is allowed in the FilterableAttributesFeatures and an error will be returned if the field doesn't have the related feature.

**Impact:**
- facet-search is now relying on AttributePatterns to match the locales
- search using filters is now relying on FilterableAttributesFeatures
- distinct attribute is now relying on FilterableAttributesRules
2025-03-03 10:25:32 +01:00
ManyTheFish
0200c65ebf Change the filterableAttributes setting API
**Changes:**
The filterableAttributes type has been changed from a `BTreeSet<String>` to a `Vec<FilterableAttributesRule>`,
Which is a list of rules defining patterns to match the documents' fields and a set of feature to apply on the matching fields.
The rule order given by the user is now an important information, the features applied on a filterable field will be chosen based on the rule order as we do for the LocalizedAttributesRules.
This means that the list will not be reordered anymore and will keep the user defined order,
moreover, if there are any duplicates, they will not be de-duplicated anymore.

**Impact:**
- Settings API
- the database format of the filterable attributes changed
- may impact the LocalizedAttributesRules due to the AttributePatterns factorization
- OpenAPI generator
2025-03-03 10:22:02 +01:00
meili-bors[bot]
c63c25a9a2
Merge #5355
5355: Support fetching the pooling method from the model configuration r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes #5354 

## What does this PR do?
- Fetches the pooling configuration from the model repository
- Use a pooling method that depends on the pooling configuration of that model.
- Allow overriding the pooling method with a new huggingFace embedder parameter `pooling`
  - for backward-compatibility with Meilisearch v1.13
  - for compatibility with embedders that exhibit the same behavior as Meilisearch v1.13
- Handle the default value of that new parameter
   - for compatibility, when importing a db/a dump, it should be set to `forceMean`
   - when (re)set from the settings for an embedder, it should be set to `useModel`


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-02-27 14:55:13 +00:00
Louis Dureuil
5e7f226ac9
Support dumpless upgrade for all v1.13 patches 2025-02-27 15:17:23 +01:00
ManyTheFish
d4063c9dcd
Fix fmt 2025-02-26 17:02:45 +01:00
Many the fish
abebc574f6
Update crates/milli/src/index.rs
Co-authored-by: Tamo <tamo@meilisearch.com>
2025-02-26 17:02:45 +01:00
Many the fish
f32ab67819
Update crates/milli/src/index.rs
Co-authored-by: Tamo <tamo@meilisearch.com>
2025-02-26 17:02:44 +01:00
ManyTheFish
d25953f322
fix clippy 2025-02-26 17:02:43 +01:00
ManyTheFish
405bbd04c1
Dumpless upgrade 2025-02-26 17:01:38 +01:00
ManyTheFish
9f3663e768
Implement Incremental document database stats computing 2025-02-26 17:01:35 +01:00
ManyTheFish
d9642ec916
Use checked_div in average computation 2025-02-26 17:01:34 +01:00
ManyTheFish
818e8b0237
Fix zero division 2025-02-26 17:01:31 +01:00
ManyTheFish
4f77a7fba5
fix clippy 2025-02-26 17:01:29 +01:00
ManyTheFish
9a6c1730aa
Add document database stats 2025-02-26 17:01:25 +01:00
ManyTheFish
15788773af
Check the exact_word database when computing zero typo query 2025-02-26 17:01:22 +01:00
Louis Dureuil
24fe6cd205
Fix multiple embeddings in hf 2025-02-24 16:24:04 +01:00
Louis Dureuil
e374b095a2
Fix tests 2025-02-24 14:11:26 +01:00
Louis Dureuil
9f3e4801b1
Refactor settings validation and introduce SubEmbedderSettings 2025-02-24 13:58:26 +01:00
Louis Dureuil
b85180fedb
Error types 2025-02-24 13:58:26 +01:00
Louis Dureuil
294cf39cad
Integrate composite embedder 2025-02-24 13:58:26 +01:00
Louis Dureuil
4a2643daa2
Rename embed_one to embed_search and embed_chunks* to embed_index* 2025-02-24 13:58:26 +01:00