This bug is an old bug but was hidden by the proximity criterion,
Phrase search were always returning an empty candidates list.
Before the fix, we were trying to find any words[n] near words[n]
instead of finding any words[n] near words[n+1], for example:
for a phrase search '"Hello world"' we were searching for "hello" near "hello" first, instead of "hello" near "world".
Instead of using an arbitrary limit we encode the absolute position in a u32
using one strong u16 for the field id and a weak u16 for the relative position in the attribute.
184: Transfer numbers and strings facets into the appropriate facet databases r=Kerollmops a=Kerollmops
This pull request is related to https://github.com/meilisearch/milli/issues/152 and changes the layout of the facets values, numbers and strings are now in dedicated databases and the user no more needs to define the type of the fields. No more conversion between the two types is done, numbers (floats and integers converted to f64) go to the facet float database and strings go to the strings facet database.
There is one related issue that I found regarding CSVs, the values in a CSV are always considered to be strings, [meilisearch/specifications#28](d916b57d74/text/0028-indexing-csv.md) fixes this issue by allowing the user to define the fields types using `:` in the "CSV Formatting Rules" section.
All previous tests on facets have been modified to pass again and I have also done hand-driven tests with the 115m songs dataset. Everything seems to be good!
Fixes#192.
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>