Commit Graph

141 Commits

Author SHA1 Message Date
Louis Dureuil
e53de15b8e
Fix behavior of limit and offset for hybrid search when keyword results are returned early
The test is fixed
2024-06-27 14:25:33 +02:00
Louis Dureuil
8c4921b9dd
Add failing test on limit+offset for hybrid search 2024-06-27 14:21:34 +02:00
meili-bors[bot]
e580d6b98f
Merge #4693
4693: Introduce distinct attributes at search time r=irevoire a=Kerollmops

This PR fixes #4611.

### To Do
- [x] Remove the `distinguishableAttributes` settings (not even a commit about that).
- [x] Use the `filterableAttributes` to be able to use the `distinct` parameter at search.
- [x] Work on the errors and make tests.

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-18 07:45:03 +00:00
Tamo
8ba65e333b add snapshot files 2024-06-17 16:50:26 +02:00
Tamo
43875e6758 fix bug around nested fields 2024-06-17 15:59:30 +02:00
Tamo
d7844a6e45 add a bunch of tests on the errors of the distinct at search time 2024-06-17 15:37:32 +02:00
meili-bors[bot]
e9bf4c43a4
Merge #4649
4649: Don't store the vectors in the documents database r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4607

## What does this PR do?
- Ensure that anything falling under `_vectors` is NOT searchable, filterable or sortable
- [x] per embedder, add a roaring bitmap of documents that provide "userProvided" embeddings
- [x] in the indexing process in extract_vector_points, set the bit corresponding to the document depending on the "userProvided" subfield in the _vectors field.
- [x] in the document DB in typed chunks, when writing the _vectors field, remove all keys corresponding to an embedder

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-17 12:32:03 +00:00
Louis Dureuil
09d9b63e1c
- test case where all vectors were generated
- update tests following changes in behavior from previous commit
2024-06-13 17:16:41 +02:00
Tamo
6bf07d969e add failing test 2024-06-13 15:49:42 +02:00
Louis Dureuil
3f212a8202
Update tests 2024-06-12 18:13:34 +02:00
Tamo
600e97d9dc gate the retrieveVectors parameter behind the vectors feature flag 2024-06-10 18:26:12 +02:00
ManyTheFish
57d066595b fix Tests almost all features 2024-06-06 17:24:50 +02:00
Tamo
31a793d226 fix the regeneration of the embeddings in the search 2024-06-06 11:39:29 +02:00
Tamo
400cf3eb92 add api error test on the new retrieveVectors parameter 2024-06-06 11:39:29 +02:00
Tamo
6b29676e7e update snapshots 2024-06-06 11:39:29 +02:00
Tamo
5d50850e12 always push the user defined vectors in arroy 2024-06-06 11:39:29 +02:00
Tamo
04f6523f3c expose a new parameter to retrieve the embedders at search time 2024-06-06 11:36:11 +02:00
meili-bors[bot]
fc584f1db3
Merge #4666
4666: Add a score threshold search parameter r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4609

## What does this PR do?
- See [usage](https://meilisearch.notion.site/Filter-by-score-usage-224a183ce7b24ca99b6a9a8da755668a?pvs=25#95b76ded400342ba9ab3d67c734836f0) and [the known limitation](https://meilisearch.notion.site/Filter-by-score-usage-224a183ce7b24ca99b6a9a8da755668a?pvs=25#e4e32195bf0e4195b5daecdbb7a97a17)


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-03 08:42:44 +00:00
meili-bors[bot]
d6bd88ce4f
Merge #4667
4667: Frequency matching strategy r=Kerollmops a=ManyTheFish

# Pull Request

## Related issue
Fixes #3773

## What does this PR do?
- add test for matching strategy
- implement frequency matching strategy

See the [PRD for more details](https://www.notion.so/meilisearch/Frequency-Matching-Strategy-0f3ba08833a442a39590a53a1505ab00).

[Public API](https://www.notion.so/meilisearch/frequency-matching-strategy-89868fb7fc584026bc56e378eb854a7f).


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-05-30 14:53:31 +00:00
ManyTheFish
3f1a510069 Add tests and fix matching strategy 2024-05-30 12:02:42 +02:00
Louis Dureuil
41976b82b1
Tests for ranking_score_threshold 2024-05-30 11:22:26 +02:00
Many the fish
e1fbfde6c4
Merge branch 'main' into merge-release-v1.8.1-in-main 2024-05-29 11:31:03 +02:00
Clément Renault
487431a035
Fix tests 2024-05-27 16:12:20 +02:00
meili-bors[bot]
19acc65ad2
Merge #4646
4646: Reduce `Transform`'s disk usage r=Kerollmops a=Kerollmops

This PR implements what is described in #4485. It reduces the number of disk writes and disk usage.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-05-23 16:06:50 +00:00
Clément Renault
fe17c0f52e
Construct the minimal OBKVs according to the settings diff 2024-05-23 11:23:57 +02:00
ManyTheFish
3e94a90722 Fixes 2024-05-21 13:39:46 +02:00
Louis Dureuil
afcd7b9f0c
Test hybrid search with hf embedder 2024-05-20 14:44:10 +02:00
Tamo
0f78703b85 add a test reproducing the bug 2024-05-20 10:58:08 +02:00
Louis Dureuil
d05d49ffd8
Fix tests 2024-05-20 10:36:18 +02:00
Tamo
f2d0a59f1d when no searchable attributes are defined, makes all the weight equals to zero 2024-05-16 01:06:33 +02:00
Louis Dureuil
ca499a0302
Fix test after rebase 2024-04-04 16:04:07 +02:00
Louis Dureuil
355e5282b2
Remove _semanticScore 2024-04-04 16:04:07 +02:00
Louis Dureuil
7c27417a5d
Add tests 2024-04-04 16:04:07 +02:00
Louis Dureuil
1ff2a2d6fb
Add semanticHitCount 2024-04-04 16:04:06 +02:00
meili-bors[bot]
5509bafff8
Merge #4535
4535: Support Negative Keywords r=ManyTheFish a=Kerollmops

This PR fixes #4422 by supporting `-` before any word in the query.

The minus symbol `-`, from the ASCII table, is not the only character that can be considered the negative operator. You can see the two other matching characters under the `Based on "-" (U+002D)` section on [this unicode reference website](https://www.compart.com/en/unicode/U+002D).

It's important to notice the strange behavior when a query includes and excludes the same word; only the derivative ( synonyms and split) will be kept:
 - If you input `progamer -progamer`, the engine will still search for `pro gamer`.
 - If you have the synonym `like = love` and you input `like -like`, it will still search for `love`.

## TODO
 - [x] Add analytics
 - [x] Add support to the `-` operator
 - [x] Make sure to support spaces around `-` well
 - [x] Support phrase negation
 - [x] Add tests


Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-04-04 13:10:27 +00:00
Clément Renault
90e812fc0b
Add some tests 2024-04-04 15:08:37 +02:00
meili-bors[bot]
56bf8503db
Merge #4537
4537: Expose distribution shift in settings r=ManyTheFish a=dureuill

See [usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42#d652adc0890445658aaf36352dbc8802)

# Changes

- Distribution shift added to all embedders.
- Exposed in settings
- Changed the reindexing logic to not trigger a reindex operation when only the distribution shift or API key change

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-04-03 09:08:58 +00:00
Louis Dureuil
cde7ce4f44
Add test 2024-03-27 14:02:09 +01:00
Tamo
3a1f458139 fix a flaky test 2024-03-26 21:06:55 +01:00
Tamo
2e36f069c2 fmt imports 2024-03-26 19:23:55 +01:00
Tamo
8127c9a115 handle the case of a queue of zero elements 2024-03-26 19:04:39 +01:00
Tamo
e7704f1fc1 add a test to ensure we effectively returns a retry-after when the search queue is full 2024-03-26 18:08:59 +01:00
Tamo
e433fd53e6 rename the method to get a permit and use it in all search requests 2024-03-26 17:28:03 +01:00
Tamo
c41e1274dc push and test the search queue datastructure 2024-03-26 15:56:43 +01:00
Tamo
d8fe4fe49d return the order in the score details 2024-03-19 15:45:04 +01:00
Tamo
6a0c399c2f rename the search_cutoff parameter to search_cutoff_ms 2024-03-19 10:35:47 +01:00
Tamo
038c26c118 stop returning the degraded boolean when a search was cutoff 2024-03-19 10:35:47 +01:00
Tamo
ad9192fbbf reduce the size of an integration test 2024-03-19 10:35:47 +01:00
Tamo
b8cda6c300 fix the search cutoff and add a test 2024-03-19 10:35:47 +01:00
Clément Renault
6c9823d7bb
Add tests to sortFacetValuesBy count 2024-03-13 11:59:39 +01:00