300 Commits

Author SHA1 Message Date
Tamo
b025f1bcf1
Merge branch 'main' into release-v1.14.0-tmp 2025-04-14 12:35:47 +02:00
Clément Renault
a0bfcf8872
Make cargo fmt happy 2025-04-01 11:27:41 +02:00
Clément Renault
64477aac60
Box the large GeoError error variant 2025-04-01 11:26:34 +02:00
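A minimal sketch of the general technique named in this commit, boxing a large enum variant so the whole error type stays small. The types below are hypothetical stand-ins, not Meilisearch's actual `GeoError` or error enum.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large error payload (not the real GeoError).
#[derive(Debug)]
pub struct BigGeoError {
    pub document_id: [u8; 64],
    pub field: [u8; 64],
}

// Unboxed: the enum is at least as large as its largest variant,
// so every Result<T, UnboxedError> pays for the 128-byte payload.
#[derive(Debug)]
pub enum UnboxedError {
    Geo(BigGeoError),
    Other(u8),
}

// Boxed: the large payload lives on the heap and the enum only
// stores a pointer to it.
#[derive(Debug)]
pub enum BoxedError {
    Geo(Box<BigGeoError>),
    Other(u8),
}

// Size in bytes of the unboxed error enum.
pub fn unboxed_size() -> usize {
    size_of::<UnboxedError>()
}

// Size in bytes of the boxed error enum.
pub fn boxed_size() -> usize {
    size_of::<BoxedError>()
}
```

Comparing `boxed_size()` with `unboxed_size()` shows the trade-off: one heap allocation on the error path in exchange for a smaller `Result` on the common path.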
Clément Renault
4d90e3d2ec
Make Cargo and Clippy happy 2025-04-01 11:26:34 +02:00
Louis Dureuil
f729864466
Check dimension mismatch at insertion time 2025-03-31 15:27:49 +02:00
Clément Renault
bb2e9419d3
Merge pull request #5468 from meilisearch/more-precise-post-processing
More Precise Post Processing
2025-03-27 10:07:09 +00:00
Kerollmops
811143cbe9
Add more progress precision when doing post processing 2025-03-27 10:17:28 +01:00
Kerollmops
db7ce03763
Improve the performance of computing the size of the documents database 2025-03-26 17:40:12 +01:00
vuthanhtung2412
bf3a29b60d Document problematic case in test and acknowledge PR comment 2025-03-26 12:57:25 +01:00
vuthanhtung2412
43c8a206b4 detail comments 2025-03-25 13:07:17 +01:00
vuthanhtung2412
6b1c262b74 fix all tests 2025-03-25 12:43:15 +01:00
vuthanhtung2412
d71c6f3483 allow multiple embeddings per document per embedder to pass 2025-03-25 12:04:25 +01:00
Many the fish
a09d08c7b6 Avoid reindexing searchable order changes
Update settings.rs
2025-03-24 16:26:52 +01:00
vuthanhtung2412
e019ad7692 Display more detailed error message instead of panic 2025-03-21 15:41:31 +01:00
meili-bors[bot]
cbdf80893d
Merge #5422
5422: Add more progress levels to measure merging r=Kerollmops a=Kerollmops

I found out that Meilisearch was not correctly reporting the long indexing times in the progress and that a lot of time was spent on extracting words with all documents already extracted. The reason was that there was no step to report merging the cache and sending the entries to write to the writer thread. This PR adds these entries to the progress.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-03-17 12:02:46 +00:00
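The idea described in #5422, giving previously unreported work its own progress step, can be sketched as follows. The step names and the `Progress` type are hypothetical and chosen for illustration; they are not the actual Meilisearch progress API.

```rust
// Hypothetical step names; the merging step stands for the work
// that was previously invisible in the progress report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    ExtractingWords,
    MergingWordCaches,
    WritingToDatabase,
}

impl Step {
    // Human-readable name used in progress reports.
    pub fn name(self) -> &'static str {
        match self {
            Step::ExtractingWords => "extracting words",
            Step::MergingWordCaches => "merging word caches",
            Step::WritingToDatabase => "writing to database",
        }
    }
}

// Tracks how many units of the current step are finished.
pub struct Progress {
    step: Step,
    done: u32,
    total: u32,
}

impl Progress {
    // Starts tracking a step made of `total` units of work.
    pub fn start(step: Step, total: u32) -> Self {
        Progress { step, done: 0, total }
    }

    // Marks one more unit as finished, never exceeding `total`.
    pub fn advance(&mut self) {
        if self.done < self.total {
            self.done += 1;
        }
    }

    // Formats the current state as "<step name>: <done>/<total>".
    pub fn report(&self) -> String {
        format!("{}: {}/{}", self.step.name(), self.done, self.total)
    }
}
```

With a dedicated step, time spent merging is attributed to merging instead of appearing as a word extraction step that has already processed every document.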
Kerollmops
e2156ddfc7
Simplify the IndexingStep progress enum 2025-03-17 11:40:50 +01:00
meili-bors[bot]
13a88d6131
Merge #5407
5407: Geo update bug r=irevoire a=ManyTheFish

# Pull Request

## Related issue
Fixes #5380
Fixes #5399



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2025-03-17 10:24:33 +00:00
Kerollmops
cb16baab18
Add more progress levels to measure merging 2025-03-17 10:13:29 +01:00
Tamo
009c36a4d0 Add support for the progress API of arroy 2025-03-13 19:00:43 +01:00
Louis Dureuil
e2d372823a
Disable the cache by default and make it experimental 2025-03-13 17:22:51 +01:00
Tamo
5ef7767429 Let arroy use all the memory available instead of 50% of the 70% 2025-03-13 15:06:03 +01:00
Clément Renault
a92a48b9b9
Do not recompute stats on dumpless upgrade
Co-authored-by: Tamo <tamo@meilisearch.com>
2025-03-13 13:58:58 +01:00
Tamo
d53225bf64 Use a random seed instead of 42 2025-03-13 12:43:31 +01:00
Tamo
ef9d9f8481
set the memory in arroy 2025-03-13 11:29:00 +01:00
Kerollmops
fedb444e66
Fix the upgrade arroy calls 2025-03-13 11:07:49 +01:00
Kerollmops
566b4efb06
Dumpless upgrade from v1.13 to v1.14 2025-03-13 11:07:44 +01:00
Kerollmops
21bbbdec76
Specify WithoutTls everywhere 2025-03-13 11:07:38 +01:00
Kerollmops
34df44a002
Open Env without TLS 2025-03-13 11:07:38 +01:00
Kerollmops
0197dc87e0
Make sure to delete useless prefixes 2025-03-12 11:24:13 +01:00
ManyTheFish
d3cd5ea689 Check whether the geo fields changed, in addition to the other faceted fields, when reindexing facets 2025-03-12 11:20:10 +01:00
ManyTheFish
ea7e299663 Update has_changed_for_fields documentation 2025-03-11 16:48:55 +01:00
ManyTheFish
8790880589 Fix clippy 2025-03-11 15:22:39 +01:00
ManyTheFish
6d52c6e711 Merge branch 'main' into granular-filterable-attributes 2025-03-11 10:05:58 +01:00
ManyTheFish
abef655849 Revert metadata creation when computing facet search and distinct 2025-03-10 15:45:59 +01:00
ManyTheFish
689e69d6d2 Take into account PR messages 2025-03-10 13:46:33 +01:00
ManyTheFish
ca41ce3bbd Old indexer document addition now checks whether facet search is globally activated 2025-03-06 11:43:42 +01:00
ManyTheFish
8ec0c322ea Apply PR requests related to Refactor the FieldIdMapWithMetadata 2025-03-06 11:42:53 +01:00
ManyTheFish
b88aa9cc76 Rely on FieldIdMapWithMetadata in facet search and filters 2025-03-05 18:22:12 +01:00
meili-bors[bot]
3fd86e8d76
Merge #5371
5371: Composite embedders r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #5343 

## What does this PR do?
- Implement [public usage](https://www.notion.so/meilisearch/Composite-embedder-usage-14a4b06b651f81859dc3df21e8cd02a0)
- Refactor the way we check if a parameter is mandatory/allowed/disallowed for a given source
- Take the "nesting context" into account when computing whether a parameter is mandatory/allowed/disallowed
- Add tests checking all parameters with all sources, and made sure the results didn't change compared with v1.13

## Dumpless Upgrade

- This adds a new value for an existing parameter => compatible without change
- This adds new optional parameters => compatible without change

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-03-05 17:18:11 +00:00
meili-bors[bot]
683a2ac685
Merge #5379
5379: Bring back the changes from v1.13.2 into main r=dureuill a=Kerollmops



Co-authored-by: Kerollmops <Kerollmops@users.noreply.github.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-03-04 13:24:25 +00:00
ManyTheFish
9a75dc6ab3 Update tests using filterable attributes rules
**Changes:**
Replace the BTreeSet<String> by Vec<FilterableAttributesRule> without changing the test results

**Impact:**
- None
2025-03-03 10:33:34 +01:00
ManyTheFish
ae8d453868 Refactor Document indexing process (searchables)
**Changes:**
The searchable database extraction is now relying on the AttributePatterns and FieldIdMapWithMetadata to match the field to extract.
Remove the SearchableExtractor trait to make the code less complex.

**Impact:**
- Document Addition/modification searchable indexing
- Document deletion searchable indexing
2025-03-03 10:32:42 +01:00
ManyTheFish
95bccaf5f5 Refactor Document indexing process (Facets)
**Changes:**
The Documents changes now take a selector closure instead of a list of fields to match the field to extract.
The seek_leaf_values_in_object function now uses a selector closure instead of a list of fields to match the field to extract.
The facet database extraction is now relying on the FilterableAttributesRule to match the field to extract.
The facet-search database extraction is now relying on the FieldIdMapWithMetadata to select the field to index.
The facet level database extraction is now relying on the FieldIdMapWithMetadata to select the field to index.

**Important:**
Because the filterable attributes are now patterns,
the fieldIdMap will only register the fields that exist in at least one document.
If a field doesn't exist in any document, it will not be registered even if it has been specified in the filterable fields.

**Impact:**
- Document Addition/modification facet indexing
- Document deletion facet indexing
2025-03-03 10:32:03 +01:00
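The "Important" note in this commit can be illustrated with a small sketch: fields are registered only when they are seen in a document, and filterable patterns are then matched against the registered fields. The pattern syntax here is a simplified assumption (an exact name, or a prefix followed by `*`), and `FieldMap` is a hypothetical stand-in for the real field id map.

```rust
use std::collections::BTreeMap;

// Assumed, simplified pattern syntax: an exact field name,
// or a prefix followed by a trailing `*`.
pub fn matches(pattern: &str, field: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => field.starts_with(prefix),
        None => pattern == field,
    }
}

// Hypothetical field id map: only fields observed in documents get an id.
#[derive(Default)]
pub struct FieldMap {
    ids: BTreeMap<String, u16>,
}

impl FieldMap {
    // Registers a field the first time it is seen and returns its id.
    pub fn observe(&mut self, field: &str) -> u16 {
        let next = self.ids.len() as u16;
        *self.ids.entry(field.to_string()).or_insert(next)
    }

    // Returns the id of a field, if it was ever observed.
    pub fn id(&self, field: &str) -> Option<u16> {
        self.ids.get(field).copied()
    }

    // Registered fields matching at least one filterable pattern,
    // in alphabetical order.
    pub fn filterable<'a>(&'a self, patterns: &[&str]) -> Vec<&'a str> {
        self.ids
            .keys()
            .map(String::as_str)
            .filter(|f| patterns.iter().any(|p| matches(p, f)))
            .collect()
    }
}
```

In this sketch, a pattern such as `color` that matches no observed field never produces an entry in the map, which mirrors the behavior described above.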
ManyTheFish
659855c88e Refactor Settings Indexing process
**Changes:**
The transform structure is now relying on FieldIdMapWithMetadata and AttributePatterns to prepare
the obkv documents during a settings reindexing.
The InnerIndexSettingsDiff and InnerIndexSettings structs are now relying on FieldIdMapWithMetadata, FilterableAttributesRule and AttributePatterns to define the field and the databases that should be reindexed.
The faceted_fields_ids, localized_searchable_fields_ids and localized_faceted_fields_ids have been removed in favor of the FieldIdMapWithMetadata.
We are now relying on the FieldIdMapWithMetadata to retain vectors_fids from the facets and the searchables.

The searchable database computing is now relying on the FieldIdMapWithMetadata to know if a field is searchable and retrieve the locales.

The facet database computing is now relying on the FieldIdMapWithMetadata to compute the facet databases, the facet-search and retrieve the locales.

The facet level database computing is now relying on the FieldIdMapWithMetadata and the facet level databases are cleared depending on the settings differences (clear_facet_levels_based_on_settings_diff).

The vector point extraction uses the FieldIdMapWithMetadata instead of FieldsIdsMapWithMetadata.

**Impact:**
- Dump import
- Settings update
2025-03-03 10:32:02 +01:00
meili-bors[bot]
c63c25a9a2
Merge #5355
5355: Support fetching the pooling method from the model configuration r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes #5354 

## What does this PR do?
- Fetches the pooling configuration from the model repository
- Use a pooling method that depends on the pooling configuration of that model.
- Allow overriding the pooling method with a new huggingFace embedder parameter `pooling`
  - for backward-compatibility with Meilisearch v1.13
  - for compatibility with embedders that exhibit the same behavior as Meilisearch v1.13
- Handle the default value of that new parameter
   - for compatibility, when importing a db/a dump, it should be set to `forceMean`
   - when (re)set from the settings for an embedder, it should be set to `useModel`


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-02-27 14:55:13 +00:00
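The default-value rules listed in #5355 can be sketched as a small resolution function. Only the `forceMean` and `useModel` values come from the description above; the `Origin` enum and the function name are hypothetical and do not reflect the actual implementation.

```rust
// Pooling values named in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pooling {
    UseModel,
    ForceMean,
}

impl Pooling {
    // Spelling of the value as it appears in the embedder settings.
    pub fn as_setting(self) -> &'static str {
        match self {
            Pooling::UseModel => "useModel",
            Pooling::ForceMean => "forceMean",
        }
    }
}

// Hypothetical: where the embedder configuration comes from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Origin {
    ImportedDbOrDump,
    Settings,
}

// An explicit value always wins. Otherwise the default depends on the
// origin: imported data keeps the v1.13 behavior (`forceMean`), while an
// embedder (re)set from the settings follows the model (`useModel`).
pub fn resolve_pooling(explicit: Option<Pooling>, origin: Origin) -> Pooling {
    match (explicit, origin) {
        (Some(pooling), _) => pooling,
        (None, Origin::ImportedDbOrDump) => Pooling::ForceMean,
        (None, Origin::Settings) => Pooling::UseModel,
    }
}
```

Keeping the two defaults separate means that existing embeddings are not silently invalidated by an upgrade, while new embedders pick up the pooling method declared by the model.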
Louis Dureuil
5e7f226ac9
Support dumpless upgrade for all v1.13 patches 2025-02-27 15:17:23 +01:00
ManyTheFish
d25953f322
fix clippy 2025-02-26 17:02:43 +01:00
ManyTheFish
405bbd04c1
Dumpless upgrade 2025-02-26 17:01:38 +01:00
ManyTheFish
9f3663e768
Implement Incremental document database stats computing 2025-02-26 17:01:35 +01:00
Louis Dureuil
e374b095a2
Fix tests 2025-02-24 14:11:26 +01:00