Commit Graph

1043 Commits

Author SHA1 Message Date
meili-bors[bot]
3753f87fd8
Merge #5011
5011: Revamp analytics r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5009

## What does this PR do?
- Force every analytics event to go through a trait that forces you to handle aggregation correctly
- Put the code that retrieves the `user-agent`, `timestamp` and `requests.total_received` in common between all aggregates, so there is no mistake
- Get rid of the separate channel for each kind of event in favor of an any map
- Ensure that we never [send an empty event ever again](https://github.com/meilisearch/meilisearch/pull/5001)
- Merge all the sub-settings routes into a global « Settings Updated » event.
- Fix: When using one of the three following features, we were not sending any analytics IF they were set from the global route
  - /non-separator-tokens
  - /separator-tokens
  - /dictionary
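
The trait-plus-any-map idea described above can be sketched as follows. This is a simplified illustration and not the code from the PR: the `Aggregate` trait shape, the `Analytics` struct, `SearchAggregate` and the event name are all hypothetical, and the real implementation may differ (the commits mention `mopa`, which is not used here).

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Hypothetical trait: every analytics event must state how two batches merge.
trait Aggregate: Any + Sized {
    /// Name under which the event would be reported (illustrative only).
    fn event_name(&self) -> &'static str;
    /// Merge `new` (the most recent event) into `self` (the stored one).
    fn aggregate(self, new: Self) -> Self;
}

/// One map for every kind of event, keyed by the concrete type of the aggregate.
#[derive(Default)]
struct Analytics {
    events: HashMap<TypeId, Box<dyn Any>>,
}

impl Analytics {
    /// Store `event`, merging it with any previously stored event of the same type.
    fn publish<T: Aggregate>(&mut self, event: T) {
        let key = TypeId::of::<T>();
        let merged = match self.events.remove(&key) {
            Some(previous) => {
                // The entry was inserted under TypeId::of::<T>(), so the downcast succeeds.
                let previous = *previous.downcast::<T>().expect("entry keyed by its own TypeId");
                previous.aggregate(event)
            }
            None => event,
        };
        self.events.insert(key, Box::new(merged));
    }

    /// Read back the current aggregate for a given event type, if any.
    fn get<T: Aggregate>(&self) -> Option<&T> {
        self.events.get(&TypeId::of::<T>())?.downcast_ref::<T>()
    }
}

/// Hypothetical aggregate counting received search requests.
#[derive(Debug, PartialEq)]
struct SearchAggregate {
    total_received: u64,
}

impl Aggregate for SearchAggregate {
    fn event_name(&self) -> &'static str {
        "Documents Searched" // assumed name, for illustration
    }

    fn aggregate(self, new: Self) -> Self {
        SearchAggregate { total_received: self.total_received + new.total_received }
    }
}
```

Because a type cannot implement the trait without providing `aggregate`, a new event kind cannot be added without deciding how it merges, which is the property the first bullet asks for.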

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-10-21 15:08:49 +00:00
Tamo
5675585fe8 move all the searches structures to new modules 2024-10-20 17:54:43 +02:00
Tamo
af589c85ec reverse all the settings to keep the last one received instead of the first one received in case we receive the same setting multiple times 2024-10-20 17:40:31 +02:00
Tamo
ac919df37d simplify the trait a bit more by getting rid of the downcast_aggregate method 2024-10-20 17:36:29 +02:00
Tamo
73b5722896 rename the other parameter of the aggregate method to new to avoid confusion 2024-10-20 17:31:35 +02:00
Tamo
c94679bde6 apply review comments 2024-10-20 17:24:12 +02:00
Tamo
89e2d2b2b9 fix the doctest 2024-10-17 13:55:49 +02:00
Tamo
3a7a20c716 remove the segment feature and always import segment 2024-10-17 11:21:14 +02:00
Tamo
fa1db6b721 fix the tests 2024-10-17 09:55:30 +02:00
Tamo
1ab6fec903 send all experimental features in the info event including the runtime one 2024-10-17 09:49:21 +02:00
Tamo
18ac4032aa Remove the experimental feature seen 2024-10-17 09:35:11 +02:00
Tamo
d9115b74f0 move the analytics settings code to a dedicated file 2024-10-17 09:32:54 +02:00
Tamo
0fde49640a make clippy happy 2024-10-17 09:18:25 +02:00
Tamo
4ee65d870e remove a lot of unused code 2024-10-17 09:14:34 +02:00
Tamo
ef77c7699b add the required shared values between all the events and fix the timestamp 2024-10-17 09:06:23 +02:00
Tamo
7382fb21e4 fix the main 2024-10-17 08:38:11 +02:00
Tamo
e4ace98004 fix all the routes + move to a better version of mopa 2024-10-17 01:04:25 +02:00
Tamo
aa7a34ffe8 make the aggregate method send 2024-10-17 00:43:34 +02:00
Tamo
6728cfbfac fix the analytics 2024-10-17 00:38:18 +02:00
Tamo
ea6883189e finish the analytics in all the routes 2024-10-16 21:17:06 +02:00
Tamo
fdeb47fb54 implements all routes 2024-10-16 17:16:33 +02:00
Tamo
e66fccc3f2 get rid of the analytics closure 2024-10-16 15:51:48 +02:00
Tamo
73e87c152a rewrite most of the analytics especially the settings 2024-10-16 15:43:27 +02:00
meili-bors[bot]
75b2f22add
Merge #5008
5008: Display vectors when no custom vectors were ever provided r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes the issue reported on [Discord](https://discord.com/channels/1006923006964154428/1294653031958446080/1295336784896589967).

## What does this PR do?
- Meilisearch normally hides `_vectors`, even with `retrieveVectors: true`, when there is an explicit list of displayed attributes that does not contain `_vectors`
- However, this relied on the field id for the `_vectors` field to exist, which wasn't the case when no `_vectors` was manually provided to documents. This would often be the case for people using autoembedders such as the OpenAI integration.
- This PR fixes the behavior by looking for the `_vectors` string in the `displayedAttributes` when there is no `_vectors` fid.
- This PR also adds a test for this specific situation, that would fail before the PR, and pass after the PR
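
The fallback described in the bullets above can be illustrated with a small sketch. It is not the PR's code: the function name, the `u16` field-id type and the slice-based arguments are assumptions made for the example, and wildcard handling in `displayedAttributes` is deliberately left out.

```rust
/// Hypothetical check: is `_vectors` part of the explicitly displayed attributes?
///
/// `vectors_fid` is the field id of `_vectors`, or `None` when no document ever
/// provided `_vectors` manually (for example when only an autoembedder is used).
fn vectors_are_displayed(
    vectors_fid: Option<u16>,
    displayed_fids: &[u16],
    displayed_names: &[&str],
) -> bool {
    match vectors_fid {
        // Behavior before the fix: compare by field id when one exists.
        Some(fid) => displayed_fids.contains(&fid),
        // Fallback added by the fix: no field id, so look for the attribute name.
        None => displayed_names.contains(&"_vectors"),
    }
}
```

With only the field-id comparison, the `None` case could never report `_vectors` as displayed, which matches the symptom reported in the related issue.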


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-10-15 13:08:47 +00:00
Louis Dureuil
5a74d4729c
Add test failing before this PR, OK now 2024-10-14 16:23:28 +02:00
Louis Dureuil
e44e7b5e81
Fix retrieveVectors when explicitly passed in displayed attributes without any document containing _vectors 2024-10-14 16:17:19 +02:00
Tamo
4b4a6c7863 Update meilisearch/src/option.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-10-14 14:39:34 +02:00
Tamo
3085092e04 Update meilisearch/src/option.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-10-14 14:39:34 +02:00
Tamo
c4efd1df4e Update meilisearch/src/option.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-10-14 14:39:34 +02:00
Tamo
c32282acb1 improve doc 2024-10-14 14:39:34 +02:00
Tamo
92070a3578 Implement the experimental drop search after and nb search per core 2024-10-14 14:39:33 +02:00
Tamo
466604725e Do not send empty edit document by function 2024-10-10 23:47:15 +02:00
curquiza
6e37ae8619 Update mini-dashboard 2024-10-09 19:13:14 +02:00
Tamo
7f5d0837c3 fix the bad experimental search queue size 2024-10-09 11:46:57 +02:00
Louis Dureuil
0c2661ea90
Fix tests 2024-10-02 11:20:29 +02:00
ManyTheFish
e9580fe619 Add turkish normalization 2024-09-25 11:03:17 +02:00
meili-bors[bot]
462a2329f1
Merge #4941
4941: Implement the binary quantization in meilisearch r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4873

## What does this PR do?
- Add a setting for binary quantization
- Once enabled, binary quantization (bq) cannot be disabled
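
The one-way nature of the setting can be sketched as a small validation function. This is an illustration under assumptions, not the PR's implementation: the function name, its signature and the error text are hypothetical.

```rust
/// Hypothetical validation of a binary quantization update for one embedder.
///
/// `current` is the stored value; `requested` is `None` when the settings update
/// does not mention binary quantization. Returns the value to store.
fn apply_binary_quantized(
    embedder: &str,
    current: bool,
    requested: Option<bool>,
) -> Result<bool, String> {
    match (current, requested) {
        // Going from enabled to disabled is rejected.
        (true, Some(false)) => Err(format!(
            "embedder `{embedder}`: binary quantization cannot be disabled once enabled"
        )),
        // Any other explicit value is accepted as-is.
        (_, Some(value)) => Ok(value),
        // Setting not mentioned: keep the stored value.
        (current, None) => Ok(current),
    }
}
```

Including the embedder name in the error follows the later commit in this log that adds the missing embedder name to the message.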

TODO:
- [ ] Missing a bunch of tests

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-09-19 15:50:24 +00:00
meili-bors[bot]
bd34ed01d9
Merge #4945
4945: Add swedish in default pipelines r=dureuill a=ManyTheFish

# Summary
## Fix Swedish support

In Swedish, the characters `å`/`ä`/`ö` are completely different from `a` or `o` and should not be normalized to the same character.
Because the Swedish specialized pipeline was not activated by default, these characters were normalized even with the settings:
```json
{
  "localizedAttributes": [ { "locales": ["swe"], "attributePatterns": ["*"] } ]
}
```

## Update Charabia adding German support

German segmentation will now be activated using the setting:
```json
{
  "localizedAttributes": [ { "locales": ["deu"], "attributePatterns": ["*"] } ]
}
```

# TODO

- [x] Activate Swedish Pipeline
- [x] Add a test to avoid future regressions
- [x] Update Charabia


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-19 14:42:03 +00:00
ManyTheFish
465afe01b2 Add test for German 2024-09-19 16:09:01 +02:00
ManyTheFish
7d6768e4c4 Add german tokenization pipeline 2024-09-19 16:09:01 +02:00
ManyTheFish
f77661ec44 Update Charabia v0.9.1 2024-09-19 16:08:59 +02:00
Tamo
fd43c6c404 Improve the error message explaining you can't un-bq an embedder 2024-09-19 15:51:29 +02:00
Tamo
2b120b89e4 update the test now that the embedder must be specified 2024-09-19 12:08:59 +02:00
Tamo
633537ccd7 fix updating documents without updating the settings 2024-09-19 12:00:58 +02:00
Tamo
e8d7c00d30 add a test on the settings value 2024-09-19 12:00:58 +02:00
Tamo
3f6301dbc9 fix the missing embedder name in the error message when trying to disable the binary quantization 2024-09-19 12:00:58 +02:00
Tamo
ca71b63ed1 adds integration tests 2024-09-19 12:00:58 +02:00
Tamo
cc45e264ca implement the binary quantization in meilisearch 2024-09-19 12:00:56 +02:00
meili-bors[bot]
5f474a640d
Merge #4938
4938: Remove default embedder r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #4738

## What does this PR do?

[See public usage](https://meilisearch.notion.site/v1-11-AI-search-changes-0e37727193884a70999f254fa953ce6e#1044b06b651f80edb9d4ef6dc367bad0)

- Remove `hybrid.embedder` boolean from analytics because embedder is now mandatory and so the boolean would always be `true`
- Rework search kind so that a search without query but with vector is a vector search regardless of (non-zero) semantic ratio
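
The reworked search-kind rule can be sketched as a small decision function. This is a hedged illustration only: the `SearchKind` enum, the boolean parameters and the handling of the `0.0` and `1.0` ratios are assumptions, and only the "vector without query" rule is taken from the description above.

```rust
/// Hypothetical classification of a search request.
#[derive(Debug, PartialEq)]
enum SearchKind {
    Keyword,
    Semantic,
    Hybrid { semantic_ratio: f32 },
}

/// Decide the kind of search from what the request contains.
fn search_kind(has_query: bool, has_vector: bool, semantic_ratio: f32) -> SearchKind {
    // Assumption: a ratio of zero means the semantic part is not wanted at all.
    if semantic_ratio == 0.0 {
        return SearchKind::Keyword;
    }
    // Rule from the description: a vector without a query is a vector search,
    // whatever the (non-zero) semantic ratio is.
    if !has_query && has_vector {
        return SearchKind::Semantic;
    }
    // Assumption: a ratio of one means keyword results are not wanted.
    if semantic_ratio == 1.0 {
        return SearchKind::Semantic;
    }
    SearchKind::Hybrid { semantic_ratio }
}
```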


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-19 09:17:14 +00:00
ManyTheFish
877717cb26 Add a test using Swedish documents 2024-09-19 08:34:04 +02:00