Commit Graph

10481 Commits

Author SHA1 Message Date
meili-bors[bot]
462a2329f1
Merge #4941
4941: Implement the binary quantization in meilisearch r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4873

## What does this PR do?
- Add a settings for the binary quantization
- Once enabled, the bq cannot be disabled

TODO:
- [ ] Missing a bunch of tests

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-09-19 15:50:24 +00:00
Tamo
afa3ae0cbd WIP 2024-09-19 17:42:52 +02:00
Tamo
f6483cf15d apply review comment 2024-09-19 16:47:06 +02:00
meili-bors[bot]
bd34ed01d9
Merge #4945
4945: Add swedish in default pipelines r=dureuill a=ManyTheFish

# Summary
## Fix Swedish support

In Swedish the characters `å`/`ä`/`ö` are completely different than `a` or `o`  and should not be normalized as the same character.
because the Swedish specialized pipeline was not activated by default, these characters were normalized even with the settings:
```json
{
  "localizedAttributes": [ { "locales": ["swe"], "attributePatterns": ["*"] } ]
}
```

## Update Charabia adding German support

German segmentation will now be activated using the setting:
```json
{
  "localizedAttributes": [ { "locales": ["deu"], "attributePatterns": ["*"] } ]
}
```

# TODO

- [x] Activate Swedish Pipeline
- [x] Add a test to avoid future regressions
- [x] Update Charabia


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-19 14:42:03 +00:00
Tamo
74199f328d Make clippy happy 2024-09-19 16:27:34 +02:00
Tamo
1113c42de0 fix broken comments 2024-09-19 16:18:36 +02:00
ManyTheFish
465afe01b2 Add test for German 2024-09-19 16:09:01 +02:00
ManyTheFish
7d6768e4c4 Add german tokenization pipeline 2024-09-19 16:09:01 +02:00
ManyTheFish
f77661ec44 Update Charabia v0.9.1 2024-09-19 16:08:59 +02:00
Tamo
b8fd85a46d Get rids of useless collect before an iteration on the readers 2024-09-19 15:57:38 +02:00
Tamo
fd43c6c404 Improve the error message explaining you can't un-bq an embedder 2024-09-19 15:51:29 +02:00
Tamo
2564ec1496
Update milli/src/index.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-19 15:41:44 +02:00
Tamo
b6b73fe41c
Update milli/src/update/settings.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-19 15:41:14 +02:00
Tamo
6dde41cc46 stop using a local version of arroy and instead point to the git repo with the rev 2024-09-19 15:25:38 +02:00
Tamo
163f8023a1 remove debug println 2024-09-19 12:13:25 +02:00
Tamo
2b120b89e4 update the test now that the embedder must be specified 2024-09-19 12:08:59 +02:00
Tamo
84f842233d snapshots the embedder settings in the dump import with vector test 2024-09-19 12:00:58 +02:00
Tamo
633537ccd7 fix updating documents without updating the settings 2024-09-19 12:00:58 +02:00
Tamo
e8d7c00d30 add a test on the settings value 2024-09-19 12:00:58 +02:00
Tamo
3f6301dbc9 fix the missing embedder name in the error message when trying to disable the binary quantization 2024-09-19 12:00:58 +02:00
Tamo
ca71b63ed1 adds integration tests 2024-09-19 12:00:58 +02:00
Tamo
2b6952eda1 rename the ArroyReader to an ArroyWrapper since it can read and write 2024-09-19 12:00:58 +02:00
Tamo
79f29eed3c fix the tests and the arroy_readers method 2024-09-19 12:00:58 +02:00
Tamo
cc45e264ca implement the binary quantization in meilisearch 2024-09-19 12:00:56 +02:00
meili-bors[bot]
5f474a640d
Merge #4938
4938: Remove default embedder r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #4738 

## What does this PR do?

[See public usage](https://meilisearch.notion.site/v1-11-AI-search-changes-0e37727193884a70999f254fa953ce6e#1044b06b651f80edb9d4ef6dc367bad0)

- Remove `hybrid.embedder` boolean from analytics because embedder is now mandatory and so the boolean would always be `true`
- Rework search kind so that a search without query but with vector is a vector search regardless of (non-zero) semantic ratio


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-19 09:17:14 +00:00
ManyTheFish
bbaee3dbc6 Add Swedish pipeline in all-tokenization feature 2024-09-19 08:34:51 +02:00
ManyTheFish
877717cb26 Add a test using Swedish documents 2024-09-19 08:34:04 +02:00
F. Levi
0ffeea5a52 Remove wrong comments 2024-09-19 09:06:40 +03:00
Ian Ornstein
716817122a Correct broken links in README 2024-09-18 16:30:29 -05:00
meili-bors[bot]
ff523a2357
Merge #4939
4939: Introduce the `STARTS WITH` filter operator r=irevoire a=Kerollmops

This PR fixes #4872 by introducing the `STARTS WITH` filter operator and gating it under the _contains filter_ experimental feature along with the `CONTAINS` one. I also updated [the experimental feature discussion page](https://github.com/orgs/meilisearch/discussions/763).

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-09-18 10:19:48 +00:00
meili-bors[bot]
29c3aca72a
Merge #4929
4929: Add facets support to federated r=Kerollmops a=dureuill

# Pull Request

## Related issue 

- Fixes #4932 (sprint issue)
- Fixes  #4913 (user-opened issue)

## What does this PR do?

See [public usage](https://meilisearch.notion.site/v1-11-Federated-search-59b30e03383c40729d7541a3dffb0069)

> [!CAUTION]
> This PR introduces a 🚨**breaking change**🚨: `queries.facets` when `federation` is present and non-`null` is now **an error**

### Implementation standpoint:

- Facet distribution: fix issue where truncated facet distribution would have a wrong order
- facet distribution: implement Display for OrderBy


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-18 09:47:20 +00:00
Louis Dureuil
00f8d03f43
Use f32::min and f32::max 2024-09-18 11:46:10 +02:00
Clément Renault
50981ea778
Update the error messages 2024-09-18 11:44:29 +02:00
Louis Dureuil
c2caff1716
Remove obsolete enum 2024-09-18 11:26:43 +02:00
F. Levi
30aa1f6dea Merge with main 2024-09-18 11:03:33 +03:00
F. Levi
83113998f9 Add more test assertions 2024-09-18 10:35:23 +03:00
meili-bors[bot]
4c355bede7
Merge #4937
4937: Support iso 639 1 r=ManyTheFish a=ManyTheFish

# Pull Request

## Related issue
Fixes #4827

## What does this PR do?
- Add iso-639-1 variants to the Locales enum
- Convert iso-639-1 into iso-639-3


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-18 05:29:32 +00:00
Louis Dureuil
174d69ff72
Don't override max value in indexes 2024-09-17 18:16:14 +02:00
Louis Dureuil
52a52f97cf
Update tests 2024-09-17 17:49:12 +02:00
Louis Dureuil
5de4b48552
Fixup error messages 2024-09-17 17:49:00 +02:00
Louis Dureuil
df648ce7a6
Update tests 2024-09-17 17:40:14 +02:00
Louis Dureuil
af8edab21d
Remove mention of sort order and recommend changing index settings on inconsistent order error 2024-09-17 17:39:51 +02:00
Louis Dureuil
c42746c4cd
Update tests 2024-09-17 17:22:14 +02:00
Louis Dureuil
98b77aec66
Remove runtime sortFacetValuesBy 2024-09-17 17:22:03 +02:00
Clément Renault
54d3ba3357
Fix tests that check error message content 2024-09-17 17:14:39 +02:00
ManyTheFish
6e058709f2 Rustfmt 2024-09-17 17:02:06 +02:00
ManyTheFish
0fbf9ea5b1 Factorize using macro 2024-09-17 17:00:03 +02:00
Clément Renault
9f1fb4b425
Introduce the STARTS WITH filter operator gated under an experimental feature 2024-09-17 16:44:11 +02:00
F. Levi
f7337affd6 Adjust tests to changes 2024-09-17 17:31:09 +03:00
Louis Dureuil
1120a5296c
Update tests 2024-09-17 16:30:43 +02:00