Jakub Jirutka
13f1277637
Allow disabling specialized tokenizations (again)
...
In PR #2773, I added the `chinese`, `hebrew`, `japanese` and `thai`
feature flags to allow meilisearch to be built without huge specialized
tokenizations that took up 90% of the meilisearch binary size.
Unfortunately, due to some recent changes, this doesn't work anymore.
The problem lies in excessive use of the `default` feature flag, which
infects the dependency graph.
Instead of adding `default-features = false` here and there, it's easier
and more future-proof to not declare `default` in `milli` and
`meilisearch-types`. I've renamed it to `all-tokenizers`, which also
makes it a bit clearer what it's about.
2023-05-04 15:45:40 +02:00
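To illustrate the feature-flag approach described in the commit above, here is a minimal sketch in Rust. It assumes a crate that declares the four language features plus an `all-tokenizers` umbrella feature and no `default` feature. The manifest shown in the comment and the `enabled_tokenizers` helper are hypothetical and are not taken from `milli` or `meilisearch-types`.

```rust
// Hypothetical Cargo.toml fragment for a crate following the scheme above
// (illustrative only, not copied from milli or meilisearch-types):
//
//   [features]
//   # No `default` entry, so dependents get no tokenizer implicitly.
//   all-tokenizers = ["chinese", "hebrew", "japanese", "thai"]
//   chinese = []
//   hebrew = []
//   japanese = []
//   thai = []
//
// In a real crate each feature would typically forward to a feature of the
// tokenizer dependency; that wiring is left out of this sketch.

/// Hypothetical helper: lists the specialized tokenizers that were
/// compiled into this build, based on the enabled Cargo features.
fn enabled_tokenizers() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    // `cfg!` evaluates to a compile-time boolean for the given feature.
    if cfg!(feature = "chinese") {
        enabled.push("chinese");
    }
    if cfg!(feature = "hebrew") {
        enabled.push("hebrew");
    }
    if cfg!(feature = "japanese") {
        enabled.push("japanese");
    }
    if cfg!(feature = "thai") {
        enabled.push("thai");
    }
    enabled
}

fn main() {
    println!("specialized tokenizers: {:?}", enabled_tokenizers());
}
```

With such a layout, a dependent crate that does not ask for any feature gets none of the specialized tokenizers, and one that wants all of them can request `all-tokenizers` explicitly, without needing `default-features = false` anywhere.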
Louis Dureuil
f8f190cd40
Update exactness tests following charabia camelCase tokenization
2023-05-03 14:45:09 +02:00
Louis Dureuil
1aaf24ccbf
Cargo fmt
2023-05-03 12:21:58 +02:00
Louis Dureuil
342c4ff85d
geosort: Remove rtree unwrap
2023-05-03 09:52:16 +02:00
Tamo
c85392ce40
make the descending geosort fast
2023-05-03 09:13:12 +02:00
Tamo
8875d24a48
deserialize the rtree only when it's needed, and keep it in memory once it has been deserialized
2023-05-03 09:13:12 +02:00
Tamo
c470b67fa2
revamp the test to use execute_iterative_and_rtree_returns_the_same
2023-05-03 09:13:12 +02:00
Louis Dureuil
b60840ebff
Remove self.iterating from words
2023-05-02 18:54:23 +02:00
Louis Dureuil
fdc1763838
Use MultiOps for resolve_query_graph
2023-05-02 18:54:09 +02:00
Louis Dureuil
75819bc940
Remove too many arguments on resolve_maximally_reduced_query_graph
2023-05-02 18:53:40 +02:00
Louis Dureuil
7b8cc25625
rename located_query_terms_from_string -> located_query_terms_from_tokens
2023-05-02 18:53:01 +02:00
Loïc Lecrenier
aa63091752
Fix bug in exact_attribute
2023-05-02 10:48:32 +02:00
Loïc Lecrenier
1b514517f5
Fix bug in computation of query term at a position
2023-05-02 10:48:32 +02:00
Loïc Lecrenier
11f814821d
Minor cleanup
2023-05-02 10:48:32 +02:00
Loïc Lecrenier
30fb1153cc
Speed up graph based ranking rule when a lot of different costs exist
2023-05-02 09:59:42 +02:00
Loïc Lecrenier
3b2c8b9f25
Improve performance of position rr
2023-05-02 09:59:42 +02:00
Loïc Lecrenier
2a7f9adf78
Build query graph more correctly from paths
...
Update snapshots
2023-05-02 09:59:42 +02:00
Loïc Lecrenier
608ceea440
Fix bug in position rr
2023-05-02 09:59:42 +02:00
Loïc Lecrenier
79001b9c97
Improve performance of the cheapest path finder algorithm
2023-05-02 09:59:42 +02:00
Loïc Lecrenier
59b12fca87
Fix errors, clippy warnings, and add review comments
2023-04-29 11:48:11 +02:00
Loïc Lecrenier
48f5bb1693
Implements the geo-sort ranking rule
2023-04-29 11:02:16 +02:00
Loïc Lecrenier
bc4efca611
Add more tests for the attribute ranking rule
2023-04-29 10:56:48 +02:00
Loïc Lecrenier
899baa0ea5
Update forgotten snapshot from previous commit
2023-04-27 13:43:04 +02:00
Loïc Lecrenier
374095d42c
Add tests for stop words and fix a couple of bugs
2023-04-27 13:30:09 +02:00
Louis Dureuil
b41a6cbd7a
Check sort criteria also in placeholder search
2023-04-26 16:28:17 +02:00
Louis Dureuil
c8af572697
Add tests for exact words and exact attributes
2023-04-26 16:13:01 +02:00
Loïc Lecrenier
b448aca49c
Add more tests for exactness rr
2023-04-26 11:04:18 +02:00
Loïc Lecrenier
55bad07c16
Fix bug in exact_attribute rr implementation
2023-04-26 10:40:05 +02:00
Loïc Lecrenier
3421125a55
Prevent the exactness ranking rule from removing random words
...
Make it strictly follow the term matching strategy
2023-04-26 09:09:19 +02:00
Loïc Lecrenier
d3a94e8b25
Fix bugs and add tests to exactness ranking rule
2023-04-25 16:49:08 +02:00
Loïc Lecrenier
8f2e971879
Add tests for "exactness" rr, make correct universe computation
2023-04-24 16:57:34 +02:00
Loïc Lecrenier
d1fdbb63da
Make all search tests pass, fix distinctAttribute bug
2023-04-24 12:12:08 +02:00
Loïc Lecrenier
84d9c731f8
Fix bug in encoding of word_position_docids and word_fid_docids
2023-04-24 09:59:30 +02:00
Loïc Lecrenier
bd9aba4d77
Add "position" part of the attribute ranking rule
2023-04-13 10:46:09 +02:00
Loïc Lecrenier
8edad8291b
Add logger to attribute rr, fix a bug
2023-04-13 10:25:00 +02:00
Kerollmops
d9cebff61c
Add a simple test to check that attributes are ranking correctly
2023-04-13 08:27:09 +02:00
Loïc Lecrenier
30f7bd03f6
Fix compiler warning/errors caused by previous merge
2023-04-13 08:27:09 +02:00
Kerollmops
df0d9bb878
Introduce the attribute ranking rule in the list of ranking rules
2023-04-13 08:27:09 +02:00
Kerollmops
5230ddb3ea
Resolve the attribute ranking rule conditions
2023-04-13 08:27:09 +02:00
Kerollmops
d6a7c28e4d
Implement the attribute ranking rule edge computation
2023-04-13 08:27:09 +02:00
Kerollmops
e55efc419e
Introduce a new cache for the words fids
2023-04-13 08:27:09 +02:00
Loïc Lecrenier
644e136aee
Merge branch 'search-refactor-typo-attributes' into search-refactor
2023-04-13 08:26:56 +02:00
Louis Dureuil
38b7b31beb
Decide to use prefix DB if the word is not an ngram
2023-04-12 16:45:38 +02:00
Louis Dureuil
7a01f20df7
Use word_prefix_docids, make get_word_prefix_docids private
2023-04-12 16:45:38 +02:00
Louis Dureuil
c20c38a7fa
Add SearchContext::word_prefix_docids() method
2023-04-12 16:44:43 +02:00
Louis Dureuil
5ab46324c4
Everyone uses the SearchContext::word_docids instead of get_db_word_docids
...
make get_db_word_docids private
2023-04-12 16:44:43 +02:00
Louis Dureuil
325f17488a
Add SearchContext::word_docids() method
2023-04-12 16:37:05 +02:00
Louis Dureuil
e7ff987c46
Update call sites
2023-04-12 16:36:38 +02:00
Louis Dureuil
244003e36f
Refactor DB cache to return Roaring Bitmaps directly instead of byte slices
2023-04-12 16:35:48 +02:00
Loïc Lecrenier
1f813a6f3b
Simplify implementation of the detailed (=visual) logger
2023-04-12 16:32:53 +02:00