Commit Graph

991 Commits

Author SHA1 Message Date
meili-bors[bot]
9a756cf2c5
Merge #4888
4888: bring back v1.10.0 into main r=Kerollmops a=ManyTheFish



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-08-27 14:02:08 +00:00
ManyTheFish
b12e997c8a Add pinyin flag 2024-08-21 14:38:04 +02:00
ManyTheFish
8bf89ec394 Infer locales from index settings 2024-08-21 10:47:40 +02:00
meili-bors[bot]
ee62d9ce30
Merge #4845
4845: Fix perf regression facet strings r=ManyTheFish a=dureuill

Benchmarks between v1.9 and v1.10 show a performance regression of about x2 (+3dB regression) for most indexing workloads (+44s for hackernews).

[Benchmark interpretation in the engine weekly meeting](https://www.notion.so/meilisearch/Engine-weekly-4d49560d374c4a87b4e3d126a261d4a0?pvs=4#98a709683276450295fcfe1f8ea5cef3).

- Initial investigation pointed to #4819 as the origin of the regression.
- Further investigation points towards the hypernormalization of each facet value in `extract_facet_string_docids`
- Most of the slowdown is in `normalize_facet_strings`, and precisely in `detection.language()`.

This PR improves the situation (-10s compared with `main` for hackernews, so only +34s regression compared with `v1.9`) by skipping normalization when it can be skipped.

I'm not sure how to fix the root cause though. Should we skip facet locale normalization for now? Cc `@ManyTheFish` 

---

Tentative resolution options:

1. remove locale normalization from facet. I'm not sure why this is required, I believe we weren't doing this before, so maybe we can stop doing that again.
2. don't do language detection when it can be helped: won't help with the regressions in benchmark, but maybe we can skip language detection when the locales contain only one language?
3. use a faster language detection library: `@Kerollmops` told me about https://github.com/quickwit-oss/whichlang which bolsters x10 to x100 throughput compared with whatlang. Should we consider replacing whatlang with whichlang? Now I understand whichlang supports fewer languages than whatlang, so I also suggest:
4. use whichlang when the list of locales is empty (autodetection), or when it only contains locales that whichlang can detect. If the list of locales contains locales that whichlang *cannot* detect, **then** use whatlang instead.

---

> [!CAUTION]
> this PR contains a commit that adds detailed spans, that were used to detect which part of `extract_facet_string_docids` was taking too much time. As this commit adds spans that are called too often and adds 7s overhead, it should be removed before landing.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-08-19 06:29:48 +00:00
ManyTheFish
ade54493ab Only detect language for a facet if several locales have been specified by the user in the settings 2024-08-14 12:03:52 +02:00
meili-bors[bot]
2d16d0aea1
Merge #4839
4839: In prometheus metrics return the route pattern instead of the real route when returning the HTTP requests total r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4825

## What does this PR do?
- return the route pattern instead of the real route when returning the HTTP requests total


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-08-05 10:14:51 +00:00
Louis Dureuil
21aa430b5e
Fix openai tests 2024-07-31 17:57:55 +02:00
Louis Dureuil
8535dc0be2
Fix existing tests 2024-07-31 17:57:32 +02:00
Louis Dureuil
72b9005344
Redact uid for Value 2024-07-31 17:57:13 +02:00
Louis Dureuil
ab1ec9ca21
Add tokenized test 2024-07-31 15:03:45 +02:00
Louis Dureuil
9d6efd92d2
new assets for tokenized test 2024-07-31 15:03:45 +02:00
Louis Dureuil
abdb337fd6
Add openai tests 2024-07-31 15:03:45 +02:00
Louis Dureuil
1c755c8899
Add openai responses 2024-07-31 15:03:45 +02:00
Louis Dureuil
3a42c3134e
update tests after changing authorized error message 2024-07-31 15:03:45 +02:00
meili-bors[bot]
25791e3f46
Merge #4836
4836: Attach declared localized-attributes subroutes r=dureuill a=dureuill

RC.0 unexpectedly doesn't contain the `GET /indexes/{indexUid}/localized-attributes` and `PUT /indexes/{indexUid}/localized-attributes` subroute.

This PR makes them available.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-30 19:01:54 +00:00
Tamo
f05ea04879 In prometheus metrics return the route pattern instead of the real route when returning the HTTP requests total 2024-07-30 16:24:49 +02:00
Tamo
b1b3a1a98b add a get, set and put test for the localized attributes setting 2024-07-30 15:51:02 +02:00
Tamo
c457069367 ensure a test is 100% not flaky 2024-07-30 15:41:51 +02:00
Tamo
bb1283222e make clippy happy 2024-07-30 15:10:56 +02:00
Tamo
7a5a38f870 fix a sync issue on empty indexes 2024-07-30 15:09:12 +02:00
Tamo
ded3cd0dd6 an additionnal 30% of perf for the tests 2024-07-30 15:03:54 +02:00
Tamo
68f885f1c4 fix two snapshots 2024-07-30 14:45:59 +02:00
Tamo
9372c34dab prepare the tests to share indexes with api key 2024-07-30 14:34:11 +02:00
Tamo
6666c57880 reduce the number of thread spawned by milli 2024-07-30 14:34:10 +02:00
Tamo
b53a019b07 fix the initialization problem over the shared indexes with documents 2024-07-30 14:24:57 +02:00
Tamo
d262b1df32 craft an API over the Shared Server and Shared index to avoid hard to debug mistakes 2024-07-30 14:24:57 +02:00
Tamo
ed795bc837 fmt 2024-07-30 14:24:57 +02:00
Tamo
993264227d reuse an index with already indexed documents instead of reindexing from scratch 2024-07-30 14:24:57 +02:00
Tamo
953d3a44bd make the new_shared function synchronous and stop indexing documents when it's not required 2024-07-30 14:24:57 +02:00
Tamo
e5345fb0eb shave off 15s by providing a shared instance to the integration tests 2024-07-30 14:24:55 +02:00
Louis Dureuil
9719dec443
Attach declared attributes-localized subroutes 2024-07-29 16:19:35 +02:00
Louis Dureuil
fa77a949aa
Log error from main using tracing 2024-07-29 14:58:39 +02:00
Louis Dureuil
8532fe8afc
Fix tests 2024-07-25 12:10:32 +02:00
Louis Dureuil
6c598fa06d
test custom headers 2024-07-25 12:01:51 +02:00
Louis Dureuil
8338df0dbe
Fix tests 2024-07-25 12:01:51 +02:00
Louis Dureuil
22ef2d877f
Ensure test server has a single indexing thread 2024-07-25 12:01:51 +02:00
Louis Dureuil
59115fd058
Fix tests 2024-07-25 10:52:57 +02:00
ManyTheFish
a918561ac1
Fix PR comments 2024-07-25 10:52:56 +02:00
ManyTheFish
70d71581ee
fix clippy 2024-07-25 10:52:56 +02:00
ManyTheFish
04fa44e7eb
Implement localized attributes settings 2024-07-25 10:51:27 +02:00
ManyTheFish
90c0a6db7d
Implement localized search 2024-07-25 10:51:27 +02:00
ManyTheFish
d82f8fd904
Add tests 2024-07-25 10:51:27 +02:00
Tamo
988552e178
add tests on the rest embedder 2024-07-24 14:34:17 +02:00
meili-bors[bot]
6e9d0de8b7
Merge #4806
4806: Update rustls as much as possible r=Kerollmops a=irevoire

# Pull Request

## Related issue
Part of https://github.com/meilisearch/meilisearch/issues/4753

## What does this PR do?
- Update rustls as much as possible

## What is missing

In rustls-0.22.0 two structures we were using have been removed with no explanation or workaround
<img width="518" alt="image" src="https://github.com/user-attachments/assets/fa112db1-3400-4163-8819-7913f22d6b87">



Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-17 17:00:01 +00:00
Tamo
1bfb16386c Update rustls as much as possible 2024-07-17 18:21:26 +02:00
meili-bors[bot]
ea73615abf
Merge #4804
4804: Implements the experimental contains filter operator r=irevoire a=irevoire

# Pull Request
Related PRD: (private link) https://www.notion.so/meilisearch/Contains-Like-Filter-Operator-0d8ad53c6761466f913432eb1d843f1e
Public usage page: https://meilisearch.notion.site/Contains-filter-operator-usage-3e7421b0aacf45f48ab09abe259a1de6

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/3613

## What does this PR do?
- Extract the contains operator from this PR: https://github.com/meilisearch/meilisearch/pull/3751
- Gate it behind a feature flag
- Add tests


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-17 15:47:11 +00:00
Tamo
02c61eabfa fix the range reported when the experimental feature has not been set 2024-07-17 16:54:33 +02:00
Tamo
56b60ec7a0 apply review comment 2024-07-17 16:13:40 +02:00
Tamo
2af9481804 Implements the experimental contains filter operator« 2024-07-17 11:13:37 +02:00
Louis Dureuil
8d6ac261ae
Add tests on various failure modes for embedders 2024-07-16 13:39:02 +02:00
Louis Dureuil
b4c8b01c88
Update existing snapshots 2024-07-16 13:39:01 +02:00
meili-bors[bot]
1582c7e788
Merge #4769
4769: Federated search r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #4747 

[Usage](https://meilisearch.notion.site/v1-10-federated-search-698dfe36ab6b4668b044f735fb40f0b2)

## What does this PR do?
- multi-search now allows a top-level federation object. When not `null`, the results of multi-search are modified to be a single list of results rather than a list of a list of results
- changed lifetimes around tokenizer et al. to be able to make hits one by one rather than using a vector
- adds `roaring` to Meilisearch itself. As the federated search happens at the Meilisearch level (reuses the search functions declared at the Meilisearch level + merge happens after the hits were created), `RoaringBitmap`s are needed to track the candidates: hits that were seen,  all candidates.
- Refactor `make_hits` to allow for an individual, optimized `make_hit` 
- Score details comparison no longer fail when sorting on different field names or target point (for geo)

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-16 08:14:46 +00:00
Louis Dureuil
20094eba06
Apply review comments 2024-07-15 12:43:29 +02:00
Louis Dureuil
c35904d6e8
search::federated::ranking_rules -> search::ranking_rules 2024-07-15 08:43:22 +02:00
Louis Dureuil
2cacc448b6
Rename src/search.rs -> src/search/mod.rs 2024-07-15 08:43:21 +02:00
Louis Dureuil
a61b852695
Add tests 2024-07-15 08:43:21 +02:00
Louis Dureuil
3167411e98
Analytics 2024-07-15 08:43:21 +02:00
Louis Dureuil
83d71662aa
Changes to multi_search route 2024-07-15 08:43:21 +02:00
Louis Dureuil
5c323cecc7
search: introduce federated search 2024-07-15 08:43:21 +02:00
Louis Dureuil
d3a6d2a6fa
search: introduce hitmaker 2024-07-11 16:35:59 +02:00
Louis Dureuil
2123d76089
search: introduce "search_from_kind" 2024-07-11 16:35:11 +02:00
Louis Dureuil
edab4e75b0
Make SearchKind cloneable 2024-07-11 16:33:24 +02:00
Louis Dureuil
b9982587d4
Add new errors to meilisearch 2024-07-11 16:31:44 +02:00
Louis Dureuil
12a7a45930
Add roaring to meilisearch 2024-07-11 16:27:50 +02:00
meili-bors[bot]
677ed6bbf6
Merge #4787
4787: Add index exists function in index_scheduler which stops opening indexes to only check if they exist. r=Kerollmops a=Karribalu

# Pull Request

## Related issue
Fixes #4784

## What does this PR do?
- Added index_exists function in the index_scheduler.
- Resolved opening indexes to only check if they exist.
- Made changes to existing tests to test this function.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: karribalu <karri.balu123456@gmail.com>
2024-07-11 13:05:20 +00:00
Clément Renault
6e80364c50
Apply review comments 2024-07-11 11:00:27 +02:00
karribalu
603676cb3b Address PR review changes 2024-07-10 19:42:16 +01:00
karribalu
23e102ca71 Address PR review changes 2024-07-10 19:33:16 +01:00
Clément Renault
487997f6ad
Support the new editDocumentsByFunction experimental feature 2024-07-10 16:29:18 +02:00
Clément Renault
94809090a3
Support not specifying a context 2024-07-10 16:29:18 +02:00
Clément Renault
01144b2c74
Make the edit documents by function route experimental 2024-07-10 16:29:18 +02:00
Clément Renault
e97600eead
Improve the analytics for the document edition by function 2024-07-10 16:29:18 +02:00
Clément Renault
767553519d
Create errors for the HTTP route issues 2024-07-10 16:29:18 +02:00
Clément Renault
e706023969
Fix some analytics issues 2024-07-10 16:29:17 +02:00
Clément Renault
862d49e4af
Editing documents requires the documents.all action (add, get, and del) 2024-07-10 16:29:05 +02:00
Clément Renault
400e6b93ce
Support user-provided context for documents edition 2024-07-10 16:28:15 +02:00
Clément Renault
f32e6c32fc
Rename editionCode to function 2024-07-10 16:28:15 +02:00
Clément Renault
f07256971a
Fix tests 2024-07-10 16:28:14 +02:00
Clément Renault
246f0e7130
Make the filter field really optional 2024-07-10 16:28:14 +02:00
Clément Renault
45af18ae9c
Check the Rhai syntax before accepting the script 2024-07-10 16:28:13 +02:00
Clément Renault
efc156a4a4
Executing Lua works correctly 2024-07-10 16:27:36 +02:00
Clément Renault
ba85959642
Support filtering the documents to edit with lua 2024-07-10 16:23:21 +02:00
Clément Renault
9d6885793e
Upgrade dependencies 2024-07-10 13:46:24 +02:00
Clément Renault
5f4530ce57
Remove more unused dependencies 2024-07-10 13:36:34 +02:00
Tamo
4d5005b01a make clippy happy 2024-07-10 10:06:59 +02:00
Tamo
9feba5028d update byte-unit 2024-07-09 23:41:29 +02:00
karribalu
ea21b948b1 Address PR review changes 2024-07-09 09:18:57 +01:00
karribalu
47e526f5ea Add index exists function in index_scheduler 2024-07-08 22:27:10 +01:00
Tamo
4aa7d386d8 remove http and uses actix_web::http instead 2024-07-08 21:17:10 +02:00
Tamo
ef8d9a20f8 update actix-web 2024-07-08 18:36:32 +02:00
Tamo
6afa578688 update most incompatible dependencies 2024-07-08 18:31:15 +02:00
Tamo
300bdfc2a7 update most dependencies 2024-07-08 18:09:12 +02:00
Louis Dureuil
128e6c7502
Search: spans with a finer granularity 2024-07-02 16:13:53 +02:00
Tamo
3d9befd64f fix warning 2024-07-02 15:30:16 +02:00
Tamo
ee14d5196c fix the tests 2024-07-02 15:18:30 +02:00
ManyTheFish
015d90a962 merge main 2024-07-01 11:50:36 +02:00
Louis Dureuil
e53de15b8e
Fix behavior of limit and offset for hybrid search when keyword results are returned early
The test is fixed
2024-06-27 14:25:33 +02:00
Louis Dureuil
8c4921b9dd
Add failing test on limit+offset for hybrid search 2024-06-27 14:21:34 +02:00
Tamo
ce08dc509b add more tests and improve the location of the error 2024-06-27 11:51:45 +02:00
Tamo
1daaed163a Make _vectors.:embedding.regenerate mandatory + tests + error messages 2024-06-27 11:04:58 +02:00