Commit Graph

1640 Commits

Author SHA1 Message Date
Tamo bab898ce86
move the flatten-serde-json crate inside of milli 2022-04-07 18:20:44 +02:00
Tamo ab458d8840
fix tests after rebase 2022-04-07 17:00:00 +02:00
Irevoire 4f3ce6d9cd
nested fields 2022-04-07 16:58:46 +02:00
bors[bot] 4ae7aea3b2
Merge #486
486: Update version (v0.25.0) r=curquiza a=curquiza

v0.25.0 will be released once #478 is merged

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-04-06 11:40:41 +00:00
bors[bot] aadb0c58c9
Merge #478
478: Disable typo on attribute r=Kerollmops a=MarinPostma

disable typo on attributes


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-05 23:45:35 +00:00
ad hoc 86249e2ae4
add missing \t in cli update display
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-04-05 21:35:06 +02:00
ad hoc b799f3326b
rename merge_nothing to merge_ignore_values 2022-04-05 18:44:35 +02:00
ad hoc 201fea0fda
limit extract_word_docids memory usage 2022-04-05 14:14:15 +02:00
ad hoc 5cfd3d8407
add exact attributes documentation 2022-04-05 14:10:22 +02:00
Clémentine Urquizar 9eec44dd98
Update version (v0.25.0) 2022-04-05 12:06:42 +02:00
ad hoc b85cd4983e
remove field_id_from_position 2022-04-05 09:50:34 +02:00
ad hoc dac81b2d44
add missing \n in cli settings 2022-04-05 09:48:56 +02:00
ad hoc ab185a59b5
fix infos 2022-04-05 09:46:56 +02:00
ad hoc 59e41d98e3
add comments to integration test 2022-04-04 21:17:06 +02:00
ad hoc 1810927dbd
rephrase exact_attributes doc 2022-04-04 21:04:49 +02:00
ad hoc b7694c34f5
remove println 2022-04-04 21:00:07 +02:00
ad hoc 6cabd47c32
fix typo in comment 2022-04-04 20:59:20 +02:00
ad hoc 9963f11172
fix infos crate compilation issue 2022-04-04 20:54:03 +02:00
ad hoc c8d3a09af8
add integration test for disable typo on attributes 2022-04-04 20:54:03 +02:00
ad hoc bfd81ce050
add exact attributes to cli settings 2022-04-04 20:54:03 +02:00
ad hoc 6b2c2509b2
fix bug in exact search 2022-04-04 20:54:03 +02:00
ad hoc 56b4f5dce2
add exact prefix to query_docids 2022-04-04 20:54:03 +02:00
ad hoc 21ae4143b1
add exact_word_prefix to Context 2022-04-04 20:54:03 +02:00
ad hoc e8f06f6c06
extract exact_word_prefix_docids 2022-04-04 20:54:03 +02:00
ad hoc 6dd2e4ffbd
introduce exact_word_prefix database in index 2022-04-04 20:54:03 +02:00
ad hoc ba0bb29cd8
refactor WordPrefixDocids to take dbs instead of indexes 2022-04-04 20:54:02 +02:00
ad hoc c4c6e35352
query exact_word_docids in resolve_query_tree 2022-04-04 20:54:02 +02:00
ad hoc 8d46a5b0b5
extract exact word docids 2022-04-04 20:54:02 +02:00
ad hoc 5451c64d5d
increase criteria asc desc test map size 2022-04-04 20:54:02 +02:00
ad hoc 0a77be4ec0
introduce exact_word_docids db 2022-04-04 20:54:02 +02:00
ad hoc 5f9f82757d
refactor spawn_extraction_task 2022-04-04 20:54:02 +02:00
ad hoc f82d4b36eb
introduce exact attribute setting 2022-04-04 20:54:02 +02:00
ad hoc c882d8daf0
add test for exact words 2022-04-04 20:54:01 +02:00
ad hoc 7e9d56a9e7
disable typos on exact words 2022-04-04 20:54:01 +02:00
bors[bot] 900825bac0
Merge #474
474: Disable typos on exact word r=MarinPostma a=MarinPostma

This PR introduces the `exact_words` setting to disable typo tolerance on custom words.

If a user query contains a word from `exact_words`, no typo derivation will be made for that particular word.

I have chosen to store the words in an FST, to save on deserialization and to allow for fast lookups.

I had some trouble with the `serde` module, and had to rename it `serde_impl`.

## steps:
- [x] introduce new settings to register words to disable typos on
- [x] in `typos`, return an exact match if the current word is one of the words to disable typos for.
- [x] update `Context` to return the exact words dictionary.
- [x] merge #473 


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-04 18:39:43 +00:00
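The lookup described in #474 above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Match` enum, the `ExactWords` wrapper, and the signature of `typos` are hypothetical, and a `BTreeSet` stands in for the FST mentioned in the description so that the sketch needs no external crate.

```rust
use std::collections::BTreeSet;

/// How a query word may be matched (hypothetical type, for illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum Match {
    /// Only the word exactly as typed is looked up.
    Exact,
    /// The word and its derivations within `max_typos` edits are looked up.
    Tolerant { max_typos: u8 },
}

/// Stand-in for the exact words dictionary.
/// Assumption: the PR stores the words in an FST; a sorted set is used here
/// only to keep the sketch dependency-free.
pub struct ExactWords(BTreeSet<String>);

impl ExactWords {
    /// Builds the dictionary from any collection of words.
    pub fn new<I, S>(words: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        ExactWords(words.into_iter().map(Into::into).collect())
    }

    /// Returns true if typo tolerance is disabled for `word`.
    pub fn contains(&self, word: &str) -> bool {
        self.0.contains(word)
    }
}

/// Decides how `word` is matched: a word listed in `exact_words` never gets
/// typo derivations, whatever `max_typos` would otherwise allow.
pub fn typos(word: &str, max_typos: u8, exact_words: &ExactWords) -> Match {
    if max_typos == 0 || exact_words.contains(word) {
        Match::Exact
    } else {
        Match::Tolerant { max_typos }
    }
}
```

With a dictionary containing `rust`, calling `typos("rust", 2, &exact_words)` yields `Match::Exact`, while a word outside the dictionary keeps its typo budget.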
ad hoc 3e67d8818c
fix typo in test comment 2022-04-04 20:34:23 +02:00
ad hoc 284d8a24e0
add integration test for disabled typo on word 2022-04-04 20:15:51 +02:00
ad hoc 30a2711bac
rename serde module to serde_impl module
needed because of issues with rustfmt
2022-04-04 20:10:55 +02:00
ad hoc 0fd55db21c
fmt 2022-04-04 20:10:55 +02:00
ad hoc 559e46be5e
fix bad rebase bug 2022-04-04 20:10:55 +02:00
ad hoc 8b1e5d9c6d
add test for exact words 2022-04-04 20:10:55 +02:00
ad hoc 774fa8f065
disable typos on exact words 2022-04-04 20:10:55 +02:00
ad hoc 9bbffb8fee
add exact words setting 2022-04-04 20:10:54 +02:00
bors[bot] 48a5ce7434
Merge #473
473: set minimum word len for typos r=MarinPostma a=MarinPostma

This PR makes the minimum word length for typos configurable.

The default values are the same as previously.

## steps
- [x] introduce settings for the minimum word length for 1 and 2 typos
- [x] update the settings update flow to set this setting
- [x] create a structure `TypoConfig` to configure typo tolerance in the query builder
- [x] in `typo`, use the configuration to create the appropriate query tree node.
- [x] extend `Context` to return the setting for minimum word length for typos
- [x] return correct error message for wrong settings.
- [x] merge #469 

Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-04 17:53:14 +00:00
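The steps listed in #473 above could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's code: only the name `TypoConfig` comes from the description, while the field names, the `ConfigError` type, the validation rule, and the choice to count characters rather than bytes are all hypothetical.

```rust
/// Typo tolerance thresholds.
/// Assumption: field names are illustrative; only the struct name appears in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TypoConfig {
    /// Minimum word length at which one typo is tolerated.
    pub min_len_one_typo: u8,
    /// Minimum word length at which two typos are tolerated.
    pub min_len_two_typos: u8,
}

/// Hypothetical error returned for inconsistent settings.
#[derive(Debug, PartialEq, Eq)]
pub enum ConfigError {
    /// The one-typo threshold is larger than the two-typo threshold.
    InvalidMinTypoWordLen { one_typo: u8, two_typos: u8 },
}

impl TypoConfig {
    /// Validates the thresholds before building the configuration.
    pub fn new(min_len_one_typo: u8, min_len_two_typos: u8) -> Result<Self, ConfigError> {
        if min_len_one_typo > min_len_two_typos {
            Err(ConfigError::InvalidMinTypoWordLen {
                one_typo: min_len_one_typo,
                two_typos: min_len_two_typos,
            })
        } else {
            Ok(TypoConfig { min_len_one_typo, min_len_two_typos })
        }
    }

    /// Number of typos tolerated for `word`.
    /// Assumption: length is measured in characters, not bytes.
    pub fn max_typos(&self, word: &str) -> u8 {
        let len = word.chars().count();
        if len >= self.min_len_two_typos as usize {
            2
        } else if len >= self.min_len_one_typo as usize {
            1
        } else {
            0
        }
    }
}
```

For example, with thresholds of 5 and 9 (values chosen here for illustration; the PR only says the defaults are unchanged), a four-letter word gets no typo, `hello` gets one, and `meilisearch` gets two.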
bors[bot] 6bf9824fec
Merge #485
485: fix bug on 2 typos derivation r=Kerollmops a=MarinPostma

I found a bug while working on #473. This PR fixes it and adds the missing tests on word derivations.


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-04 17:17:53 +00:00
ad hoc 853b4a520f
fmt 2022-04-04 10:41:46 +02:00
ad hoc 2cb71dff4a
add typo integration tests 2022-04-04 10:41:46 +02:00
ad hoc 1941072bb2
implement Copy on Setting 2022-04-04 10:41:46 +02:00
ad hoc fdaf45aab2
replace hardcoded value with constant in TestContext 2022-04-04 10:41:46 +02:00
ad hoc 950a740bd4
refactor typos for readability 2022-04-04 10:41:46 +02:00