26 Commits

Author SHA1 Message Date
bors[bot] 414b3fae89
Merge #3571
3571: Introduce two filters to select documents with `null` and empty fields r=irevoire a=Kerollmops

# Pull Request

## Related issue
This PR implements the `X IS NULL`, `X IS NOT NULL`, `X IS EMPTY`, and `X IS NOT EMPTY` filters that [this comment](https://github.com/meilisearch/product/discussions/539#discussioncomment-5115884) describes in detail.

## What does this PR do?

### `IS NULL` and `IS NOT NULL`

This PR will be exposed as a prototype for now. Below is a copy of the spec that defines these filters.

- `IS NULL` matches fields that `EXISTS` AND whose value is `null`
- `IS NOT NULL` matches fields that `NOT EXISTS` OR whose value is not `null`

1. `{"name": "A", "price": null}`
2. `{"name": "A", "price": 10}`
3. `{"name": "A"}`

`price IS NULL` would match 1
`price IS NOT NULL` or `NOT price IS NULL` would match 2,3
`price EXISTS` would match 1, 2
`price NOT EXISTS` or `NOT price EXISTS` would match 3

Common query: `(price EXISTS) AND (price IS NOT NULL)` would match 2
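
The truth table above can be checked with a small model. This is only a sketch of the spec's semantics, not the code from this PR: the `Field` enum and the `sample_docs`, `exists`, `is_null`, `is_not_null`, and `matching` helpers are hypothetical names introduced for illustration.

```rust
/// Hypothetical model of the `price` field of one document (not a milli type).
#[derive(Debug, Clone, PartialEq)]
enum Field {
    /// The key is absent from the document.
    Missing,
    /// The key is present and its value is `null`.
    Null,
    /// The key is present and holds a number.
    Number(i64),
}

/// The three example documents from the spec, in order (ids 1, 2, 3).
fn sample_docs() -> Vec<Field> {
    vec![Field::Null, Field::Number(10), Field::Missing]
}

/// `price EXISTS`: the key is present, whatever its value.
fn exists(f: &Field) -> bool {
    !matches!(f, Field::Missing)
}

/// `price IS NULL`: the key is present and its value is `null`.
fn is_null(f: &Field) -> bool {
    matches!(f, Field::Null)
}

/// `price IS NOT NULL`: plain negation, so missing fields match too.
fn is_not_null(f: &Field) -> bool {
    !is_null(f)
}

/// Returns the 1-based ids of the documents accepted by `pred`.
fn matching(docs: &[Field], pred: impl Fn(&Field) -> bool) -> Vec<usize> {
    docs.iter()
        .enumerate()
        .filter(|&(_, f)| pred(f))
        .map(|(i, _)| i + 1)
        .collect()
}
```

Under these assumptions, `matching(&sample_docs(), |f| exists(f) && is_not_null(f))` selects only document 2, which is why the combined query is needed to exclude documents where the field is missing.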

### `IS EMPTY` and `IS NOT EMPTY`

- `IS EMPTY` matches Array `[]`, Object `{}`, or String `""` fields that `EXISTS` and are empty
- `IS NOT EMPTY` matches fields that `NOT EXISTS` OR are not empty.

1. `{"name": "A", "tags": null}`
2. `{"name": "A", "tags": [null]}`
3. `{"name": "A", "tags": []}`
4. `{"name": "A", "tags": ["hello","world"]}`
5. `{"name": "A", "tags": [""]}`
6. `{"name": "A"}`
7. `{"name": "A", "tags": {}}`
8. `{"name": "A", "tags": {"t1":"v1"}}`
9. `{"name": "A", "tags": {"t1":""}}`
10. `{"name": "A", "tags": ""}`

`tags IS EMPTY` would match 3,7,10
`tags IS NOT EMPTY` or `NOT tags IS EMPTY` would match 1,2,4,5,6,8,9
`tags IS NULL` would match 1
`tags IS NOT NULL` or `NOT tags IS NULL` would match 2,3,4,5,6,7,8,9,10
`tags EXISTS` would match 1,2,3,4,5,7,8,9,10
`tags NOT EXISTS` or `NOT tags EXISTS` would match 6

Common query: `(tags EXISTS) AND (tags IS NOT NULL) AND (tags IS NOT EMPTY)` would match 2,4,5,8,9
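
The same kind of model covers the emptiness rules. Again, this is a sketch under stated assumptions rather than the PR's implementation: `Value`, `Field`, `sample_docs`, and the predicate helpers are hypothetical, and emptiness is checked only at the top level of the field, as the examples above imply.

```rust
/// Hypothetical JSON-like value, limited to the shapes used in the examples.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Str(String),
    Arr(Vec<Value>),
    Obj(Vec<(String, Value)>),
}

/// `None` models a document without the `tags` key.
type Field = Option<Value>;

fn s(v: &str) -> Value {
    Value::Str(v.to_string())
}

/// The ten example documents from the spec, in order (ids 1 to 10).
fn sample_docs() -> Vec<Field> {
    vec![
        Some(Value::Null),
        Some(Value::Arr(vec![Value::Null])),
        Some(Value::Arr(vec![])),
        Some(Value::Arr(vec![s("hello"), s("world")])),
        Some(Value::Arr(vec![s("")])),
        None,
        Some(Value::Obj(vec![])),
        Some(Value::Obj(vec![("t1".to_string(), s("v1"))])),
        Some(Value::Obj(vec![("t1".to_string(), s(""))])),
        Some(s("")),
    ]
}

/// `tags EXISTS`: the key is present, whatever its value.
fn exists(f: &Field) -> bool {
    f.is_some()
}

/// `tags IS NULL`: the key is present and its value is `null`.
fn is_null(f: &Field) -> bool {
    matches!(f, Some(Value::Null))
}

/// `tags IS EMPTY`: the field itself is `""`, `[]`, or `{}`.
/// Nested values are not inspected, so `[""]` and `{"t1": ""}` are not empty.
fn is_empty(f: &Field) -> bool {
    match f {
        Some(Value::Str(v)) => v.is_empty(),
        Some(Value::Arr(v)) => v.is_empty(),
        Some(Value::Obj(v)) => v.is_empty(),
        _ => false,
    }
}

/// Returns the 1-based ids of the documents accepted by `pred`.
fn matching(docs: &[Field], pred: impl Fn(&Field) -> bool) -> Vec<usize> {
    docs.iter()
        .enumerate()
        .filter(|&(_, f)| pred(f))
        .map(|(i, _)| i + 1)
        .collect()
}
```

With this model, `null` and missing fields are neither empty nor excluded by `IS NOT EMPTY`, which is why the combined query chains all three conditions.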

## What should the reviewer do?

- Check that I tested the filters
- Check that I deleted the ids of the documents when deleting documents


Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-04-27 13:14:00 +00:00
Clément Renault 1a9c58a7ab
Fix a bug with the new flattening rules 2023-03-15 16:56:44 +01:00
ManyTheFish 2f8eb4f54a last PR fixes 2023-03-09 15:34:36 +01:00
ManyTheFish 5deea631ea fix clippy too many arguments 2023-03-09 11:19:13 +01:00
ManyTheFish b4b859ec8c Fix typos 2023-03-09 10:58:35 +01:00
ManyTheFish 24c0775c67 Change indexing threshold 2023-03-08 12:36:04 +01:00
ManyTheFish 3092cf0448 Fix clippy errors 2023-03-08 10:53:42 +01:00
ManyTheFish da48506f15 Rerun extraction when language detection might have failed 2023-03-07 18:35:26 +01:00
ManyTheFish bbecab8948 fix clippy 2023-02-21 10:18:44 +01:00
f3r10 d8207356f4 Skip script,language insertion if language is undetected 2023-01-31 11:28:05 +01:00
f3r10 fd60a39f1c Format code 2023-01-31 11:28:05 +01:00
f3r10 d97fb6117e Extract and index data 2023-01-31 11:28:05 +01:00
Loïc Lecrenier 8d0ace2d64 Avoid creating a MatchingWord for words that exceed the length limit 2022-11-28 10:20:13 +01:00
unvalley 3009981d31 Fix clippy errors
Add clippy job

Add clippy job to CI
2022-11-04 08:58:14 +09:00
Ewan Higgs 6b2fe94192 Fixes for clippy bringing us down to 18 remaining issues.
This brings us a step closer to enforcing clippy on each build.
2022-10-25 20:49:02 +02:00
Loïc Lecrenier 3794962330 Use an unstable algorithm for grenad::Sorter when possible 2022-09-13 14:49:53 +02:00
Kerollmops fe3973a51c
Make sure that long words are correctly skipped 2022-09-07 15:03:32 +02:00
ManyTheFish 86ac8568e6 Use Charabia in milli 2022-06-02 16:59:11 +02:00
Clément Renault f367cc2e75
Finally bump grenad to v0.4.1 2022-02-16 15:28:48 +01:00
many c5a6075484
Make max_position_per_attributes changeable 2021-10-12 10:10:50 +02:00
many 360c5ff3df
Remove limit of 1000 position per attribute
Instead of using an arbitrary limit we encode the absolute position in a u32
using one strong u16 for the field id and a weak u16 for the relative position in the attribute.
2021-10-12 10:10:50 +02:00
many e54280fbfc
Skip empty normalized words 2021-09-08 15:25:23 +02:00
many e09eec37bc
Handle distance addition with hard separators 2021-09-01 16:48:40 +02:00
many fc7cc770d4
Add logging timers 2021-09-01 16:48:40 +02:00
many 2d1727697d
Take stop words into account 2021-09-01 16:48:40 +02:00
many 1d314328f0
Plug new indexer 2021-09-01 16:48:36 +02:00