Loïc Lecrenier
65474c8de5
Update new sort ranking rule after rebasing
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
fbb1ba3de0
Cargo fmt
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
a59ca28e2c
Add forgotten file
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
825f742000
Simplify graph-based ranking rule impl
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
dd491320e5
Simplify graph-based ranking rule impl
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
c6ff97a220
Rewrite the dead-ends cache to detect more dead-ends
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
49240c367a
Fix bug in cost of typo conditions
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
1e6e624078
Fix bug in SmallBitmap
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
8b4e07e1a3
WIP
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
2853009987
Renaming Edge -> Condition
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
aa59c3bc2c
Replace EdgeCondition with an Option<..> + other code cleanup
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
7b1d8f4c6d
Make PathSet strongly typed
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
a49ddec9df
Prune the query graph after executing a ranking rule
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
05fe856e6e
Merge forward and backward proximity conditions in proximity graph
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
c0cdaf9f53
Fix bug in the proximity ranking rule for queries with ngrams
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
e9cf58d584
Refactor of the Interner
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
31628c5cd4
Merge Phrase and WordDerivations into one structure
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
3004e281d7
Support ngram typos + splitwords and splitwords+synonyms in proximity
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
14e8d0aaa2
Rename lifetime
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
1c58cf8426
Intern ranking rule graph edge conditions as well
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
5155fd2bf1
Reorganise initialisation of ranking rules + rename PathsMap -> PathSet
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
9ec9c204d3
Small code cleanup
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
78b9304d52
Implement distinct attribute
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
0465ba4a05
Intern more values
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
2099991dd1
Continue documenting and cleaning up the code
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
c232cdabf5
Add documentation
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
4e266211bf
Small code reorganisation
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
57fa689131
Cargo fmt
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
10626dddfc
Add a few more optimisations to new search algorithms
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
9051065c22
Apply a few optimisations for graph-based ranking rules
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
e8c76cf7bf
Intern all strings and phrases in the search logic
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
3f1729a17f
Update new search test
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
cab2b6bcda
Fix: computation of initial universe, code organisation
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
c4979a2fda
Fix code visibility issue + unimplemented detail in proximity rule
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
23931f8a4f
Fix small bug in visual logger of search algo
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
aa414565bb
Fix proximity graph edge builder to include all proximities
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
1db152046e
WIP on split words and synonyms support
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
c27ea2677f
Rewrite cheapest path algorithm and empty path cache
It is now much simpler and has much better performance.
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
caa1e1b923
Add typo ranking rule to new search impl
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
71f18e4379
Add sort ranking rule to new search impl
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
600e3dd1c5
Remove warnings
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
362eb0de86
Add support for filters
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
998d46ac10
Add support for search offset and limit
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
6c85c0d95e
Fix more bugs + visual empty path cache logging
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
0e1fbbf7c6
Fix bugs in query graph's "remove word" and "cheapest paths" algos
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
6806640ef0
Fix d2 description of paths map
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
173e37584c
Improve the visual/detailed search logger
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
6ba4d5e987
Add a search logger
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
dd12d44134
Support swapped word pairs in new proximity ranking rule impl
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
c8e251bf24
Remove noise in codebase
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
a938fbde4a
Use a cache when resolving the query graph
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
dcf3f1d18a
Remove EdgeIndex and NodeIndex types, prefer u32 instead
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
66d0c63694
Add some documentation and use bitmaps instead of hashmaps when possible
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
132191360b
Introduce the sort ranking rule working with the new search structures
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
345c99d5bd
Introduce the words ranking rule working with the new search structures
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
89d696c1e3
Introduce the proximity ranking rule as a graph-based ranking rule
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
c645853529
Introduce a generic graph-based ranking rule
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
a70ab8b072
Introduce a function to find the K shortest paths in a graph
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
48aae76b15
Introduce a function to find the docids of a set of paths in a graph
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
23bf572dea
Introduce cache structures used with ranking rule graphs
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
864f6410ed
Introduce a structure to represent a set of graph paths efficiently
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
c9bf6bb2fa
Introduce a structure to implement ranking rules with graph algorithms
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
46249ea901
Implement a function to find a QueryGraph's docids
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
ce0d1e0e13
Introduce a common way to manage the coordination between ranking rules
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
5065d8b0c1
Introduce a DatabaseCache to memorize the addresses of LMDB values
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
a83007c013
Introduce structure to represent search queries as graphs
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
79e0a6dd4e
Introduce a new search module, eventually meant to replace the old one
The code here does not compile, because I am merely splitting one giant
commit into smaller ones where each commit explains a single file.
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
2d88089129
Remove unused term matching strategies
2023-03-20 09:41:55 +01:00
bors[bot]
4f1ccbc495
Merge #3525
3525: Fix phrase search containing stop words r=ManyTheFish a=ManyTheFish
# Summary
A search with a phrase containing only stop words was returning an HTTP error 500.
This PR drops phrases that contain only stop words before the search starts, so a query with a phrase containing only stop words now behaves like a placeholder search.
fixes https://github.com/meilisearch/meilisearch/issues/3521
related v1.0.2 PR on milli: https://github.com/meilisearch/milli/pull/779
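The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `keep_searchable_phrases` and the representation of a phrase as a `Vec<&str>` are assumptions made for the example.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not milli's API): keep a phrase only if at least
/// one of its words is not a stop word.
fn keep_searchable_phrases<'a>(
    phrases: Vec<Vec<&'a str>>,
    stop_words: &HashSet<&str>,
) -> Vec<Vec<&'a str>> {
    phrases
        .into_iter()
        // A phrase made only of stop words is dropped entirely.
        .filter(|phrase| phrase.iter().any(|word| !stop_words.contains(word)))
        .collect()
}
```

If every phrase is dropped and nothing else remains in the query, the caller can then fall back to a placeholder search instead of failing.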
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-03-02 10:55:37 +00:00
ManyTheFish
37489fd495
Return an internal error when a matching word is invalid
2023-03-01 19:05:16 +01:00
bors[bot]
ac5a1e4c4b
Merge #3423
3423: Add min and max facet stats r=dureuill a=dureuill
# Pull Request
## Related issue
Fixes #3426
## What does this PR do?
### User standpoint
- When using a `facets` parameter in search, the facets that have numeric values are displayed in a new section of the response called `facetStats` that contains, per facet, the numeric min and max value of the hits returned by the search.
<details>
<summary>
Sample request/response
</summary>
```json
❯ curl \
-X POST 'http://localhost:7700/indexes/meteorites/search?facets=mass' \
-H 'Content-Type: application/json' \
--data-binary '{ "q": "LL6", "facets":["mass", "recclass"], "limit": 5 }' | jsonxf
{
"hits": [
{
"name": "Niger (LL6)",
"id": "16975",
"nametype": "Valid",
"recclass": "LL6",
"mass": 3.3,
"fall": "Fell"
},
{
"name": "Appley Bridge",
"id": "2318",
"nametype": "Valid",
"recclass": "LL6",
"mass": 15000,
"fall": "Fell",
"_geo": {
"lat": 53.58333,
"lng": -2.71667
}
},
{
"name": "Athens",
"id": "4885",
"nametype": "Valid",
"recclass": "LL6",
"mass": 265,
"fall": "Fell",
"_geo": {
"lat": 34.75,
"lng": -87.0
}
},
{
"name": "Bandong",
"id": "4935",
"nametype": "Valid",
"recclass": "LL6",
"mass": 11500,
"fall": "Fell",
"_geo": {
"lat": -6.91667,
"lng": 107.6
}
},
{
"name": "Benguerir",
"id": "30443",
"nametype": "Valid",
"recclass": "LL6",
"mass": 25000,
"fall": "Fell",
"_geo": {
"lat": 32.25,
"lng": -8.15
}
}
],
"query": "LL6",
"processingTimeMs": 15,
"limit": 5,
"offset": 0,
"estimatedTotalHits": 42,
"facetDistribution": {
"mass": {
"110000": 1,
"11500": 1,
"1161": 1,
"12000": 1,
"1215.5": 1,
"127000": 1,
"15000": 1,
"1676": 1,
"1700": 1,
"1710.5": 1,
"18000": 1,
"19000": 1,
"220000": 1,
"2220": 1,
"22300": 1,
"25000": 2,
"265": 1,
"271000": 1,
"2840": 1,
"3.3": 1,
"3000": 1,
"303": 1,
"32000": 1,
"34000": 1,
"36.1": 1,
"45000": 1,
"460": 1,
"478": 1,
"483": 1,
"5500": 2,
"600": 1,
"6000": 1,
"67.8": 1,
"678": 1,
"680.5": 1,
"6930": 1,
"8": 1,
"8300": 1,
"840": 1,
"8400": 1
},
"recclass": {
"L/LL6": 3,
"LL6": 39
}
},
"facetStats": {
"mass": {
"min": 3.3,
"max": 271000.0
}
}
}
```
</details>
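As a rough illustration of what `facetStats` reports, here is a minimal sketch that computes the min and max per numeric facet over a set of documents. It is not the implementation from this PR: the function name `facet_stats` and the in-memory `BTreeMap` representation of a document are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not milli's API): for each numeric facet, return the
/// (min, max) pair over all the given documents.
fn facet_stats(docs: &[BTreeMap<&str, f64>]) -> BTreeMap<String, (f64, f64)> {
    let mut stats: BTreeMap<String, (f64, f64)> = BTreeMap::new();
    for doc in docs {
        for (facet, &value) in doc {
            stats
                .entry(facet.to_string())
                // Facet already seen: widen the range if needed.
                .and_modify(|(min, max)| {
                    *min = min.min(value);
                    *max = max.max(value);
                })
                // First value for this facet: it is both min and max.
                .or_insert((value, value));
        }
    }
    stats
}
```

Facets without numeric values simply never get an entry, which mirrors the sample response above, where `recclass` has a distribution but no stats.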
## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-02-22 13:06:43 +00:00
ManyTheFish
900bae3d9d
keep phrases that have at least one word
2023-02-21 18:16:51 +01:00
ManyTheFish
8aa808d51b
Merge branch 'main' into enhance-language-detection
2023-02-20 18:14:34 +01:00
Many the fish
119e6d8811
Update milli/src/search/mod.rs
Co-authored-by: Tamo <tamo@meilisearch.com>
2023-02-20 15:33:10 +01:00
Louis Dureuil
eb28d4c525
add facet test
2023-02-20 13:52:28 +01:00
Louis Dureuil
9ac981d025
Remove some clippy type complexity warns by deboxing iters
2023-02-20 13:52:27 +01:00
Louis Dureuil
74859ecd61
Add min and max facet stats
2023-02-20 13:52:27 +01:00
Louis Dureuil
8ae441a4db
Update usage of iterators
2023-02-20 13:52:27 +01:00
Louis Dureuil
042d86cbb3
facet sort ascending/descending now also return the values
2023-02-20 13:52:27 +01:00
bors[bot]
143e3cf948
Merge #3490
3490: Fix attributes set candidates r=curquiza a=ManyTheFish
# Pull Request
Fix attributes set candidates for v1.1.0
## details
The attribute criterion was not returning the remaining candidates when its internal algorithm was exhausted.
We had a loss of candidates by the attribute criterion leading to the bug reported in the issue linked below.
After some investigation, it seems that it was the only criterion that had this behavior.
We are now returning the remaining candidates instead of an empty bitmap.
## Related issue
Fixes #3483
PR on milli for v1.0.1: https://github.com/meilisearch/milli/pull/777
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-02-15 17:38:07 +00:00
Filip Bachul
a53536836b
fmt
2023-02-14 17:04:22 +01:00
Filip Bachul
d7ad39ad77
fix: clippy error
2023-02-14 00:15:35 +01:00
Filip Bachul
7481559e8b
move BadGeo to FilterError
2023-02-14 00:15:35 +01:00
Filip Bachul
83c765ce6c
implement From<ParseGeoError> for FilterError
2023-02-14 00:15:35 +01:00
Filip Bachul
825923f6fc
export ParseGeoError
2023-02-14 00:15:35 +01:00
Filip Bachul
e405702733
chore: introduce new error ParseGeoError type
2023-02-14 00:15:35 +01:00
ManyTheFish
6fa877efb0
Fix attributes set candidates
2023-02-13 17:49:52 +01:00
bors[bot]
c88c3637b4
Merge #3461
3461: Bring v1 changes into main r=curquiza a=Kerollmops
Also bring back changes in milli (the remote repository) into main done during the pre-release
Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Philipp Ahlner <philipp@ahlner.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-02-07 11:27:27 +00:00
Tamo
7a38fe624f
throw an error if the top left corner is found below the bottom right corner
2023-02-06 17:50:47 +01:00
Tamo
1b005f697d
update the syntax of the geoboundingbox filter to use brackets instead of parens around lat and lng
2023-02-06 16:50:27 +01:00
Kerollmops
fbec48f56e
Merge remote-tracking branch 'milli/main' into bring-v1-changes
2023-02-06 16:48:10 +01:00
Tamo
3ebc99473f
Apply suggestions from code review
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-02-06 13:29:37 +01:00
Tamo
d27007005e
comment the geoboundingbox + forbid the usage of the lexeme method which could introduce bugs
2023-02-06 11:36:49 +01:00
Tamo
fcb09ccc3d
add tests on the geoBoundingBox
2023-02-02 18:19:56 +01:00
Louis Dureuil
ae8660e585
Add Token::original_span rather than making Token::span pub
2023-02-02 15:03:34 +01:00
Guillaume Mourier
0d71c80ba6
add tests
2023-02-02 12:31:27 +01:00
Guillaume Mourier
b078477d80
Add error handling and earth lap collision with bounding box
2023-02-02 12:17:38 +01:00
ManyTheFish
0bc1a18f52
Use Languages list detected during indexing at search time
2023-02-01 18:57:43 +01:00
ManyTheFish
643d99e0f9
Add expectancy test
2023-02-01 18:39:54 +01:00
Louis Dureuil
20f05efb3c
clippy: needless_lifetimes
2023-01-31 11:12:59 +01:00