Commit Graph

603 Commits

Author SHA1 Message Date
ManyTheFish
734d0899d3 Publish Matcher 2022-04-05 17:41:32 +02:00
ManyTheFish
4428cb5909 Add some tests and fix some corner cases 2022-04-05 17:41:32 +02:00
ManyTheFish
844f546a8b Add matches algorithm V1 2022-04-05 17:41:32 +02:00
ManyTheFish
3be1790803 Add crop algorithm with naive match algorithm 2022-04-05 17:41:32 +02:00
ManyTheFish
d96e72e5dc Create formater with some tests 2022-04-05 17:41:32 +02:00
ad hoc
30a2711bac
rename serde module to serde_impl module
needed because of issues with rustfmt
2022-04-04 20:10:55 +02:00
ad hoc
0fd55db21c
fmt 2022-04-04 20:10:55 +02:00
ad hoc
559e46be5e
fix bad rebase bug 2022-04-04 20:10:55 +02:00
ad hoc
8b1e5d9c6d
add test for exact words 2022-04-04 20:10:55 +02:00
ad hoc
774fa8f065
disable typos on exact words 2022-04-04 20:10:55 +02:00
ad hoc
9bbffb8fee
add exact words setting 2022-04-04 20:10:54 +02:00
ad hoc
853b4a520f
fmt 2022-04-04 10:41:46 +02:00
ad hoc
1941072bb2
implement Copy on Setting 2022-04-04 10:41:46 +02:00
ad hoc
fdaf45aab2
replace hardcoded value with constant in TestContext 2022-04-04 10:41:46 +02:00
ad hoc
950a740bd4
refactor typos for readability 2022-04-04 10:41:46 +02:00
ad hoc
66020cd923
rename min_word_len* to use plain letter numbers 2022-04-04 10:41:46 +02:00
ad hoc
4c4b336ecb
rename min word len for typo error 2022-04-01 11:17:03 +02:00
ad hoc
286dd7b2e4
rename min_word_len_2_typo 2022-04-01 11:17:03 +02:00
ad hoc
55af85db3c
add tests for min_word_len_for_typo 2022-04-01 11:17:02 +02:00
ad hoc
9102de5500
fix error message 2022-04-01 11:17:02 +02:00
ad hoc
a1a3a49bc9
dynamic minimum word len for typos in query tree builder 2022-04-01 11:17:02 +02:00
ad hoc
5a24e60572
introduce word len for typo setting 2022-04-01 11:17:02 +02:00
ad hoc
9fe40df960
add word derivations tests 2022-04-01 11:05:18 +02:00
ad hoc
d5ddc6b080
fix 2 typos word derivation bug 2022-04-01 10:51:22 +02:00
ad hoc
3e34981d9b
add test for authorize_typos in update 2022-03-31 14:12:00 +02:00
ad hoc
6ef3bb9d83
fmt 2022-03-31 14:06:23 +02:00
ad hoc
f782fe2062
add authorize_typo_test 2022-03-31 10:08:39 +02:00
ad hoc
c4653347fd
add authorize typo setting 2022-03-31 10:05:44 +02:00
bors[bot]
90276d9a2d
Merge #472
472: Remove useless variables in proximity r=Kerollmops a=ManyTheFish

I was looking through the plane sweep algorithm for inspiration and discovered that we have useless variables that went undetected because of the recursive function.

Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-03-16 15:33:11 +00:00
ManyTheFish
49d59d88c2 Remove useless variables in proximity 2022-03-16 16:12:52 +01:00
Bruno Casali
adc71742c8 Move string concat to the struct instead of in the calling 2022-03-16 10:26:12 -03:00
Bruno Casali
4822fe1beb Add a better error message when the filterable attrs are empty
Fixes https://github.com/meilisearch/meilisearch/issues/2140
2022-03-15 18:13:59 -03:00
bors[bot]
ad4c982c68
Merge #439
439: Optimize typo criterion r=Kerollmops a=MarinPostma

This PR implements a couple of optimizations for the typo criterion:

- Clamp max typos on concatenated query words to 1: because we treat a concatenated query word as already being a typo, we clamp the maximum number of typos allowed on it to 1. This is useful because we noticed that concatenated query words often introduced words with 2 typos in queries that otherwise didn't allow for 2-typo words.

- Make typos on the first letter count for 2. This change is a big performance gain: by considering the typos on the first letter to count as 2 typos, we drastically restrict the search space for 1 typo, and if we reach 2 typos, the search space is reduced as well, as we only consider: (2 typos ∩ correct first letter) ∪ (wrong first letter ∩ 1 typo) instead of 2 typos anywhere in the word.
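
The first-letter rule above can be illustrated with a small sketch. It is not the code from this PR (the commits below mention custom FST automatons); it is a brute-force approximation that assumes plain Levenshtein distance, and the names `levenshtein`, `weighted_typos` and `accepts` are hypothetical.

```rust
/// Plain Levenshtein distance over chars (assumption: the real
/// implementation works on automatons, not on pairwise comparisons).
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the processed prefix of `a` and b[..j]
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let insertion = cur[j] + 1;
            let deletion = prev[j + 1] + 1;
            cur.push(substitution.min(insertion).min(deletion));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Hypothetical helper: a mismatch on the first letter costs one extra
/// typo, so a single first-letter typo counts as 2.
fn weighted_typos(query: &str, candidate: &str) -> usize {
    let distance = levenshtein(query, candidate);
    let same_first = query.chars().next() == candidate.chars().next();
    if distance > 0 && !same_first {
        distance + 1
    } else {
        distance
    }
}

/// Hypothetical helper: does `candidate` fit within the typo budget?
fn accepts(query: &str, candidate: &str, max_typos: usize) -> bool {
    weighted_typos(query, candidate) <= max_typos
}
```

With a budget of 2, this accepts exactly (2 typos ∩ correct first letter) ∪ (wrong first letter ∩ 1 typo): `accepts("mongus", "mingus", 1)` holds, while `accepts("binux", "linux", 1)` does not and only passes with a budget of 2.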

## benches
```
group                                                                                                    main                                   typo
-----                                                                                                    ----                                   ----
smol-songs.csv: asc + default/Notstandskomitee                                                           2.51      5.8±0.01ms        ? ?/sec    1.00      2.3±0.01ms        ? ?/sec
smol-songs.csv: asc + default/charles                                                                    2.48      3.0±0.01ms        ? ?/sec    1.00   1190.9±1.29µs        ? ?/sec
smol-songs.csv: asc + default/charles mingus                                                             5.56     10.8±0.01ms        ? ?/sec    1.00   1935.3±1.00µs        ? ?/sec
smol-songs.csv: asc + default/david                                                                      1.65      3.9±0.00ms        ? ?/sec    1.00      2.4±0.01ms        ? ?/sec
smol-songs.csv: asc + default/david bowie                                                                3.34     12.5±0.02ms        ? ?/sec    1.00      3.7±0.00ms        ? ?/sec
smol-songs.csv: asc + default/john                                                                       1.00   1849.7±3.74µs        ? ?/sec    1.01   1875.1±4.65µs        ? ?/sec
smol-songs.csv: asc + default/marcus miller                                                              4.32     15.7±0.01ms        ? ?/sec    1.00      3.6±0.01ms        ? ?/sec
smol-songs.csv: asc + default/michael jackson                                                            3.31     12.5±0.01ms        ? ?/sec    1.00      3.8±0.00ms        ? ?/sec
smol-songs.csv: asc + default/tamo                                                                       1.05    565.4±0.86µs        ? ?/sec    1.00    539.3±1.22µs        ? ?/sec
smol-songs.csv: asc + default/thelonious monk                                                            3.49     11.5±0.01ms        ? ?/sec    1.00      3.3±0.00ms        ? ?/sec
smol-songs.csv: asc/Notstandskomitee                                                                     2.59      5.6±0.02ms        ? ?/sec    1.00      2.2±0.01ms        ? ?/sec
smol-songs.csv: asc/charles                                                                              6.05      2.1±0.00ms        ? ?/sec    1.00    347.8±0.60µs        ? ?/sec
smol-songs.csv: asc/charles mingus                                                                       14.46     9.4±0.01ms        ? ?/sec    1.00    649.2±0.97µs        ? ?/sec
smol-songs.csv: asc/david                                                                                3.87      2.4±0.00ms        ? ?/sec    1.00    618.2±0.69µs        ? ?/sec
smol-songs.csv: asc/david bowie                                                                          10.14     9.8±0.01ms        ? ?/sec    1.00    970.8±1.55µs        ? ?/sec
smol-songs.csv: asc/john                                                                                 1.00    546.5±1.10µs        ? ?/sec    1.00    547.1±2.11µs        ? ?/sec
smol-songs.csv: asc/marcus miller                                                                        11.45    10.4±0.06ms        ? ?/sec    1.00    907.9±1.37µs        ? ?/sec
smol-songs.csv: asc/michael jackson                                                                      10.56     9.7±0.01ms        ? ?/sec    1.00    919.6±1.03µs        ? ?/sec
smol-songs.csv: asc/tamo                                                                                 1.03     43.3±0.18µs        ? ?/sec    1.00     42.2±0.23µs        ? ?/sec
smol-songs.csv: asc/thelonious monk                                                                      4.16     10.7±0.02ms        ? ?/sec    1.00      2.6±0.00ms        ? ?/sec
smol-songs.csv: basic filter: <=/Notstandskomitee                                                        1.00     95.7±0.20µs        ? ?/sec    1.15   109.6±10.40µs        ? ?/sec
smol-songs.csv: basic filter: <=/charles                                                                 1.00     27.8±0.15µs        ? ?/sec    1.01     27.9±0.18µs        ? ?/sec
smol-songs.csv: basic filter: <=/charles mingus                                                          1.72    119.2±0.67µs        ? ?/sec    1.00     69.1±0.13µs        ? ?/sec
smol-songs.csv: basic filter: <=/david                                                                   1.00     22.3±0.33µs        ? ?/sec    1.05     23.4±0.19µs        ? ?/sec
smol-songs.csv: basic filter: <=/david bowie                                                             1.59     86.9±0.79µs        ? ?/sec    1.00     54.5±0.31µs        ? ?/sec
smol-songs.csv: basic filter: <=/john                                                                    1.00     17.9±0.06µs        ? ?/sec    1.06     18.9±0.15µs        ? ?/sec
smol-songs.csv: basic filter: <=/marcus miller                                                           1.65    102.7±1.63µs        ? ?/sec    1.00     62.3±0.18µs        ? ?/sec
smol-songs.csv: basic filter: <=/michael jackson                                                         1.76    128.2±1.85µs        ? ?/sec    1.00     72.9±0.19µs        ? ?/sec
smol-songs.csv: basic filter: <=/tamo                                                                    1.00     17.9±0.13µs        ? ?/sec    1.05     18.7±0.20µs        ? ?/sec
smol-songs.csv: basic filter: <=/thelonious monk                                                         1.53    157.5±2.38µs        ? ?/sec    1.00    102.8±0.88µs        ? ?/sec
smol-songs.csv: basic filter: TO/Notstandskomitee                                                        1.00    100.9±4.36µs        ? ?/sec    1.04    105.0±8.25µs        ? ?/sec
smol-songs.csv: basic filter: TO/charles                                                                 1.00     28.4±0.36µs        ? ?/sec    1.03     29.4±0.33µs        ? ?/sec
smol-songs.csv: basic filter: TO/charles mingus                                                          1.71    118.1±1.08µs        ? ?/sec    1.00     68.9±0.26µs        ? ?/sec
smol-songs.csv: basic filter: TO/david                                                                   1.00     24.0±0.26µs        ? ?/sec    1.03     24.6±0.43µs        ? ?/sec
smol-songs.csv: basic filter: TO/david bowie                                                             1.72     95.2±0.30µs        ? ?/sec    1.00     55.2±0.14µs        ? ?/sec
smol-songs.csv: basic filter: TO/john                                                                    1.00     18.8±0.09µs        ? ?/sec    1.06     19.8±0.17µs        ? ?/sec
smol-songs.csv: basic filter: TO/marcus miller                                                           1.61    102.4±1.65µs        ? ?/sec    1.00     63.4±0.24µs        ? ?/sec
smol-songs.csv: basic filter: TO/michael jackson                                                         1.77    132.1±1.41µs        ? ?/sec    1.00     74.5±0.59µs        ? ?/sec
smol-songs.csv: basic filter: TO/tamo                                                                    1.00     18.2±0.14µs        ? ?/sec    1.05     19.2±0.46µs        ? ?/sec
smol-songs.csv: basic filter: TO/thelonious monk                                                         1.49    150.8±1.92µs        ? ?/sec    1.00    101.3±0.44µs        ? ?/sec
smol-songs.csv: basic placeholder/                                                                       1.00     27.3±0.07µs        ? ?/sec    1.03     28.0±0.05µs        ? ?/sec
smol-songs.csv: basic with quote/"Notstandskomitee"                                                      1.00    122.4±0.17µs        ? ?/sec    1.03    125.6±0.16µs        ? ?/sec
smol-songs.csv: basic with quote/"charles"                                                               1.00     88.8±0.30µs        ? ?/sec    1.00     88.4±0.15µs        ? ?/sec
smol-songs.csv: basic with quote/"charles" "mingus"                                                      1.00    685.2±0.74µs        ? ?/sec    1.01    689.4±6.07µs        ? ?/sec
smol-songs.csv: basic with quote/"david"                                                                 1.00    161.6±0.42µs        ? ?/sec    1.01    162.6±0.17µs        ? ?/sec
smol-songs.csv: basic with quote/"david" "bowie"                                                         1.00    731.7±0.73µs        ? ?/sec    1.02    743.1±0.77µs        ? ?/sec
smol-songs.csv: basic with quote/"john"                                                                  1.00    267.1±0.33µs        ? ?/sec    1.01    270.9±0.33µs        ? ?/sec
smol-songs.csv: basic with quote/"marcus" "miller"                                                       1.00    138.7±0.31µs        ? ?/sec    1.02    140.9±0.13µs        ? ?/sec
smol-songs.csv: basic with quote/"michael" "jackson"                                                     1.01    841.4±0.72µs        ? ?/sec    1.00    833.8±0.92µs        ? ?/sec
smol-songs.csv: basic with quote/"tamo"                                                                  1.01    189.2±0.26µs        ? ?/sec    1.00    188.2±0.71µs        ? ?/sec
smol-songs.csv: basic with quote/"thelonious" "monk"                                                     1.00   1100.5±1.36µs        ? ?/sec    1.01   1111.7±2.17µs        ? ?/sec
smol-songs.csv: basic without quote/Notstandskomitee                                                     3.40      7.9±0.02ms        ? ?/sec    1.00      2.3±0.02ms        ? ?/sec
smol-songs.csv: basic without quote/charles                                                              2.57    494.4±0.89µs        ? ?/sec    1.00    192.5±0.18µs        ? ?/sec
smol-songs.csv: basic without quote/charles mingus                                                       1.29      2.8±0.02ms        ? ?/sec    1.00      2.1±0.01ms        ? ?/sec
smol-songs.csv: basic without quote/david                                                                1.95    623.8±0.90µs        ? ?/sec    1.00    319.2±1.22µs        ? ?/sec
smol-songs.csv: basic without quote/david bowie                                                          1.12      5.9±0.00ms        ? ?/sec    1.00      5.2±0.00ms        ? ?/sec
smol-songs.csv: basic without quote/john                                                                 1.24   1340.9±2.25µs        ? ?/sec    1.00   1084.7±7.76µs        ? ?/sec
smol-songs.csv: basic without quote/marcus miller                                                        7.97     14.6±0.01ms        ? ?/sec    1.00   1826.0±6.84µs        ? ?/sec
smol-songs.csv: basic without quote/michael jackson                                                      1.19      3.9±0.00ms        ? ?/sec    1.00      3.3±0.00ms        ? ?/sec
smol-songs.csv: basic without quote/tamo                                                                 1.65    737.7±3.58µs        ? ?/sec    1.00    446.7±0.51µs        ? ?/sec
smol-songs.csv: basic without quote/thelonious monk                                                      1.16      4.5±0.02ms        ? ?/sec    1.00      3.9±0.04ms        ? ?/sec
smol-songs.csv: big filter/Notstandskomitee                                                              3.27      7.6±0.02ms        ? ?/sec    1.00      2.3±0.01ms        ? ?/sec
smol-songs.csv: big filter/charles                                                                       8.26   1957.5±1.37µs        ? ?/sec    1.00    236.8±0.34µs        ? ?/sec
smol-songs.csv: big filter/charles mingus                                                                18.49    11.2±0.06ms        ? ?/sec    1.00    607.7±3.03µs        ? ?/sec
smol-songs.csv: big filter/david                                                                         3.78      2.4±0.00ms        ? ?/sec    1.00    622.8±0.80µs        ? ?/sec
smol-songs.csv: big filter/david bowie                                                                   9.00     12.0±0.01ms        ? ?/sec    1.00   1336.0±3.17µs        ? ?/sec
smol-songs.csv: big filter/john                                                                          1.00    554.2±0.95µs        ? ?/sec    1.01    560.4±0.79µs        ? ?/sec
smol-songs.csv: big filter/marcus miller                                                                 18.09    12.0±0.01ms        ? ?/sec    1.00    664.7±0.60µs        ? ?/sec
smol-songs.csv: big filter/michael jackson                                                               8.43     12.0±0.01ms        ? ?/sec    1.00   1421.6±1.37µs        ? ?/sec
smol-songs.csv: big filter/tamo                                                                          1.00     86.3±0.14µs        ? ?/sec    1.01     87.3±0.21µs        ? ?/sec
smol-songs.csv: big filter/thelonious monk                                                               5.55     14.3±0.02ms        ? ?/sec    1.00      2.6±0.01ms        ? ?/sec
smol-songs.csv: desc + default/Notstandskomitee                                                          2.52      5.8±0.01ms        ? ?/sec    1.00      2.3±0.01ms        ? ?/sec
smol-songs.csv: desc + default/charles                                                                   3.04      2.7±0.01ms        ? ?/sec    1.00    893.4±1.08µs        ? ?/sec
smol-songs.csv: desc + default/charles mingus                                                            6.77     10.3±0.01ms        ? ?/sec    1.00   1520.8±1.90µs        ? ?/sec
smol-songs.csv: desc + default/david                                                                     1.39      5.7±0.00ms        ? ?/sec    1.00      4.1±0.00ms        ? ?/sec
smol-songs.csv: desc + default/david bowie                                                               2.34     15.8±0.02ms        ? ?/sec    1.00      6.7±0.01ms        ? ?/sec
smol-songs.csv: desc + default/john                                                                      1.00      2.5±0.00ms        ? ?/sec    1.02      2.6±0.01ms        ? ?/sec
smol-songs.csv: desc + default/marcus miller                                                             5.06     14.5±0.02ms        ? ?/sec    1.00      2.9±0.01ms        ? ?/sec
smol-songs.csv: desc + default/michael jackson                                                           2.64     14.1±0.05ms        ? ?/sec    1.00      5.4±0.00ms        ? ?/sec
smol-songs.csv: desc + default/tamo                                                                      1.00    567.0±0.65µs        ? ?/sec    1.00    565.7±0.97µs        ? ?/sec
smol-songs.csv: desc + default/thelonious monk                                                           3.55     11.6±0.02ms        ? ?/sec    1.00      3.3±0.00ms        ? ?/sec
smol-songs.csv: desc/Notstandskomitee                                                                    2.58      5.6±0.02ms        ? ?/sec    1.00      2.2±0.02ms        ? ?/sec
smol-songs.csv: desc/charles                                                                             6.04      2.1±0.00ms        ? ?/sec    1.00    348.1±0.57µs        ? ?/sec
smol-songs.csv: desc/charles mingus                                                                      14.51     9.4±0.01ms        ? ?/sec    1.00    646.7±0.99µs        ? ?/sec
smol-songs.csv: desc/david                                                                               3.86      2.4±0.00ms        ? ?/sec    1.00    620.7±2.46µs        ? ?/sec
smol-songs.csv: desc/david bowie                                                                         10.10     9.8±0.01ms        ? ?/sec    1.00    973.9±3.31µs        ? ?/sec
smol-songs.csv: desc/john                                                                                1.00    545.5±0.78µs        ? ?/sec    1.00    547.2±0.48µs        ? ?/sec
smol-songs.csv: desc/marcus miller                                                                       11.39    10.3±0.01ms        ? ?/sec    1.00    903.7±0.95µs        ? ?/sec
smol-songs.csv: desc/michael jackson                                                                     10.51     9.7±0.01ms        ? ?/sec    1.00    924.7±2.02µs        ? ?/sec
smol-songs.csv: desc/tamo                                                                                1.01     43.2±0.33µs        ? ?/sec    1.00     42.6±0.35µs        ? ?/sec
smol-songs.csv: desc/thelonious monk                                                                     4.19     10.8±0.03ms        ? ?/sec    1.00      2.6±0.00ms        ? ?/sec
smol-songs.csv: prefix search/a                                                                          1.00   1008.7±1.00µs        ? ?/sec    1.00   1005.5±0.91µs        ? ?/sec
smol-songs.csv: prefix search/b                                                                          1.00    885.0±0.70µs        ? ?/sec    1.01    890.6±1.11µs        ? ?/sec
smol-songs.csv: prefix search/i                                                                          1.00   1051.8±1.25µs        ? ?/sec    1.00   1056.6±4.12µs        ? ?/sec
smol-songs.csv: prefix search/s                                                                          1.00    724.7±1.77µs        ? ?/sec    1.00    721.6±0.59µs        ? ?/sec
smol-songs.csv: prefix search/x                                                                          1.01    212.4±0.21µs        ? ?/sec    1.00    210.9±0.38µs        ? ?/sec
smol-songs.csv: proximity/7000 Danses Un Jour Dans Notre Vie                                             18.55    48.5±0.09ms        ? ?/sec    1.00      2.6±0.03ms        ? ?/sec
smol-songs.csv: proximity/The Disneyland Sing-Along Chorus                                               8.41     56.7±0.45ms        ? ?/sec    1.00      6.7±0.05ms        ? ?/sec
smol-songs.csv: proximity/Under Great Northern Lights                                                    15.74    38.9±0.14ms        ? ?/sec    1.00      2.5±0.00ms        ? ?/sec
smol-songs.csv: proximity/black saint sinner lady                                                        11.82    40.1±0.13ms        ? ?/sec    1.00      3.4±0.02ms        ? ?/sec
smol-songs.csv: proximity/les dangeureuses 1960                                                          6.90     26.1±0.13ms        ? ?/sec    1.00      3.8±0.04ms        ? ?/sec
smol-songs.csv: typo/Arethla Franklin                                                                    14.93     5.8±0.01ms        ? ?/sec    1.00    390.1±1.89µs        ? ?/sec
smol-songs.csv: typo/Disnaylande                                                                         3.18      7.3±0.01ms        ? ?/sec    1.00      2.3±0.00ms        ? ?/sec
smol-songs.csv: typo/dire straights                                                                      5.55     15.2±0.02ms        ? ?/sec    1.00      2.7±0.00ms        ? ?/sec
smol-songs.csv: typo/fear of the duck                                                                    28.03    20.0±0.03ms        ? ?/sec    1.00    713.3±1.54µs        ? ?/sec
smol-songs.csv: typo/indochie                                                                            19.25  1851.4±2.38µs        ? ?/sec    1.00     96.2±0.13µs        ? ?/sec
smol-songs.csv: typo/indochien                                                                           14.66  1887.7±3.18µs        ? ?/sec    1.00    128.8±0.18µs        ? ?/sec
smol-songs.csv: typo/klub des loopers                                                                    37.73    18.0±0.02ms        ? ?/sec    1.00    476.7±0.73µs        ? ?/sec
smol-songs.csv: typo/michel depech                                                                       10.17     5.8±0.01ms        ? ?/sec    1.00    565.8±1.16µs        ? ?/sec
smol-songs.csv: typo/mongus                                                                              15.33  1897.4±3.44µs        ? ?/sec    1.00    123.8±0.13µs        ? ?/sec
smol-songs.csv: typo/stromal                                                                             14.63  1859.3±2.40µs        ? ?/sec    1.00    127.1±0.29µs        ? ?/sec
smol-songs.csv: typo/the white striper                                                                   10.83     9.4±0.01ms        ? ?/sec    1.00    866.0±0.98µs        ? ?/sec
smol-songs.csv: typo/thelonius monk                                                                      14.40     3.8±0.00ms        ? ?/sec    1.00    261.5±1.30µs        ? ?/sec
smol-songs.csv: words/7000 Danses / Le Baiser / je me trompe de mots                                     5.54     70.8±0.09ms        ? ?/sec    1.00     12.8±0.03ms        ? ?/sec
smol-songs.csv: words/Bring Your Daughter To The Slaughter but now this is not part of the title         3.48    119.8±0.14ms        ? ?/sec    1.00     34.4±0.04ms        ? ?/sec
smol-songs.csv: words/The Disneyland Children's Sing-Alone song                                          8.98     71.9±0.12ms        ? ?/sec    1.00      8.0±0.01ms        ? ?/sec
smol-songs.csv: words/les liaisons dangeureuses 1793                                                     11.88    37.4±0.07ms        ? ?/sec    1.00      3.1±0.01ms        ? ?/sec
smol-songs.csv: words/seven nation mummy                                                                 22.86    23.4±0.04ms        ? ?/sec    1.00   1024.8±1.57µs        ? ?/sec
smol-songs.csv: words/the black saint and the sinner lady and the good doggo                             2.76    124.4±0.15ms        ? ?/sec    1.00     45.1±0.09ms        ? ?/sec
smol-songs.csv: words/whathavenotnsuchforth and a good amount of words to pop to match the first one     2.52    107.0±0.23ms        ? ?/sec    1.00     42.4±0.66ms        ? ?/sec

group                                                                                    main-wiki                              typo-wiki
-----                                                                                    ---------                              ---------
smol-wiki-articles.csv: basic placeholder/                                               1.02     13.7±0.02µs        ? ?/sec    1.00     13.4±0.03µs        ? ?/sec
smol-wiki-articles.csv: basic with quote/"film"                                          1.02    409.8±0.67µs        ? ?/sec    1.00    402.6±0.48µs        ? ?/sec
smol-wiki-articles.csv: basic with quote/"france"                                        1.00    325.9±0.91µs        ? ?/sec    1.00    326.4±0.49µs        ? ?/sec
smol-wiki-articles.csv: basic with quote/"japan"                                         1.00    218.4±0.26µs        ? ?/sec    1.01    220.5±0.20µs        ? ?/sec
smol-wiki-articles.csv: basic with quote/"machine"                                       1.00    143.0±0.12µs        ? ?/sec    1.04    148.8±0.21µs        ? ?/sec
smol-wiki-articles.csv: basic with quote/"miles" "davis"                                 1.00     11.7±0.06ms        ? ?/sec    1.00     11.8±0.01ms        ? ?/sec
smol-wiki-articles.csv: basic with quote/"mingus"                                        1.00      4.4±0.03ms        ? ?/sec    1.00      4.4±0.00ms        ? ?/sec
smol-wiki-articles.csv: basic with quote/"rock" "and" "roll"                             1.00     43.5±0.08ms        ? ?/sec    1.01     43.8±0.06ms        ? ?/sec
smol-wiki-articles.csv: basic with quote/"spain"                                         1.00    137.3±0.35µs        ? ?/sec    1.05    144.4±0.23µs        ? ?/sec
smol-wiki-articles.csv: basic without quote/film                                         1.00    125.3±0.30µs        ? ?/sec    1.06    133.1±0.37µs        ? ?/sec
smol-wiki-articles.csv: basic without quote/france                                       1.21   1782.6±1.65µs        ? ?/sec    1.00   1477.0±1.39µs        ? ?/sec
smol-wiki-articles.csv: basic without quote/japan                                        1.28   1363.9±0.80µs        ? ?/sec    1.00   1064.3±1.79µs        ? ?/sec
smol-wiki-articles.csv: basic without quote/machine                                      1.73    760.3±0.81µs        ? ?/sec    1.00    439.6±0.75µs        ? ?/sec
smol-wiki-articles.csv: basic without quote/miles davis                                  1.03     17.0±0.03ms        ? ?/sec    1.00     16.5±0.02ms        ? ?/sec
smol-wiki-articles.csv: basic without quote/mingus                                       1.07      5.3±0.01ms        ? ?/sec    1.00      5.0±0.00ms        ? ?/sec
smol-wiki-articles.csv: basic without quote/rock and roll                                1.01     63.9±0.18ms        ? ?/sec    1.00     63.0±0.07ms        ? ?/sec
smol-wiki-articles.csv: basic without quote/spain                                        2.07    667.4±0.93µs        ? ?/sec    1.00    322.8±0.29µs        ? ?/sec
smol-wiki-articles.csv: prefix search/c                                                  1.00    343.1±0.47µs        ? ?/sec    1.00    344.0±0.34µs        ? ?/sec
smol-wiki-articles.csv: prefix search/g                                                  1.00    374.4±3.42µs        ? ?/sec    1.00    374.1±0.44µs        ? ?/sec
smol-wiki-articles.csv: prefix search/j                                                  1.00    359.9±0.31µs        ? ?/sec    1.00    361.2±0.79µs        ? ?/sec
smol-wiki-articles.csv: prefix search/q                                                  1.01    102.0±0.12µs        ? ?/sec    1.00    101.4±0.32µs        ? ?/sec
smol-wiki-articles.csv: prefix search/t                                                  1.00    536.7±1.39µs        ? ?/sec    1.00    534.3±0.84µs        ? ?/sec
smol-wiki-articles.csv: prefix search/x                                                  1.00    400.9±1.00µs        ? ?/sec    1.00    399.5±0.45µs        ? ?/sec
smol-wiki-articles.csv: proximity/april paris                                            3.86     14.4±0.01ms        ? ?/sec    1.00      3.7±0.01ms        ? ?/sec
smol-wiki-articles.csv: proximity/diesel engine                                          12.98    10.4±0.01ms        ? ?/sec    1.00    803.5±1.13µs        ? ?/sec
smol-wiki-articles.csv: proximity/herald sings                                           1.00     12.7±0.06ms        ? ?/sec    5.29     67.1±0.09ms        ? ?/sec
smol-wiki-articles.csv: proximity/tea two                                                6.48   1452.1±2.78µs        ? ?/sec    1.00    224.1±0.38µs        ? ?/sec
smol-wiki-articles.csv: typo/Disnaylande                                                 3.89      8.5±0.01ms        ? ?/sec    1.00      2.2±0.01ms        ? ?/sec
smol-wiki-articles.csv: typo/aritmetric                                                  3.78     10.3±0.01ms        ? ?/sec    1.00      2.7±0.00ms        ? ?/sec
smol-wiki-articles.csv: typo/linax                                                       8.91   1426.7±0.97µs        ? ?/sec    1.00    160.1±0.18µs        ? ?/sec
smol-wiki-articles.csv: typo/migrosoft                                                   7.48   1417.3±5.84µs        ? ?/sec    1.00    189.5±0.88µs        ? ?/sec
smol-wiki-articles.csv: typo/nympalidea                                                  3.96      7.2±0.01ms        ? ?/sec    1.00   1810.1±2.03µs        ? ?/sec
smol-wiki-articles.csv: typo/phytogropher                                                3.71      7.2±0.01ms        ? ?/sec    1.00   1934.3±6.51µs        ? ?/sec
smol-wiki-articles.csv: typo/sisan                                                       6.44   1497.2±1.38µs        ? ?/sec    1.00    232.7±0.94µs        ? ?/sec
smol-wiki-articles.csv: typo/the fronce                                                  6.92      2.9±0.00ms        ? ?/sec    1.00    418.0±1.76µs        ? ?/sec
smol-wiki-articles.csv: words/Abraham machin                                             16.63    10.8±0.01ms        ? ?/sec    1.00    649.7±1.08µs        ? ?/sec
smol-wiki-articles.csv: words/Idaho Bellevue pizza                                       27.15    25.6±0.03ms        ? ?/sec    1.00    944.2±5.07µs        ? ?/sec
smol-wiki-articles.csv: words/Kameya Tokujirō mingus monk                                26.87    40.7±0.05ms        ? ?/sec    1.00   1515.3±2.73µs        ? ?/sec
smol-wiki-articles.csv: words/Ulrich Hensel meilisearch milli                            11.99    48.8±0.10ms        ? ?/sec    1.00      4.1±0.02ms        ? ?/sec
smol-wiki-articles.csv: words/the black saint and the sinner lady and the good doggo     4.90    110.0±0.15ms        ? ?/sec    1.00     22.4±0.03ms        ? ?/sec

```

Co-authored-by: mpostma <postma.marin@protonmail.com>
Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-03-15 16:43:36 +00:00
ad hoc
3f24555c3d
custom fst automatons 2022-03-15 17:38:35 +01:00
ad hoc
628c835a22
fix tests 2022-03-15 17:38:34 +01:00
bors[bot]
8efac33b53
Merge #467
467: optimize prefix database r=Kerollmops a=MarinPostma

This PR introduces two optimizations that greatly improve the speed of computing prefix databases.

- The time that it takes to create the prefix FST has been divided by 5 by inverting the way we iterated over the words FST.
- We unconditionally and needlessly checked for documents to remove in `word_prefix_pair`, which caused an iteration over the whole database.

Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-03-15 16:14:35 +00:00
ad hoc
d127c57f2d
review edits 2022-03-15 17:12:48 +01:00
ad hoc
d633ac5b9d
optimize word prefix pair 2022-03-15 16:37:22 +01:00
ad hoc
d68fe2b3c7
optimize word prefix fst 2022-03-15 16:36:48 +01:00
Clément Renault
0c5f4ed7de
Apply suggestions
Co-authored-by: Many <many@meilisearch.com>
2022-03-15 14:18:29 +01:00
Kerollmops
21ec334dcc
Fix the compilation error of the dependency versions 2022-03-15 11:17:45 +01:00
psvnl sai kumar
5e08fac729 fixes for rustfmt pass 2022-03-14 19:22:41 +05:30
psvnl sai kumar
92e2e09434 exporting heed to avoid having different versions of Heed in Meilisearch 2022-03-14 01:01:58 +05:30
Kerollmops
1ae13c1374
Avoid iterating on big databases when useless 2022-03-09 15:43:54 +01:00
Bruno Casali
66c6d5e1ef Add a new error message when the valid_fields is empty
> "Attribute `{}` is not sortable. This index doesn't have configured sortable attributes."
> "Attribute `{}` is not sortable. Available sortable attributes are: `{}`."

coexist in the error handling
2022-03-05 10:38:18 -03:00
Kerollmops
d5b8b5a2f8
Replace the ugly unwraps by clean if let Somes 2022-02-28 16:31:33 +01:00
Kerollmops
8d26f3040c
Remove a useless grenad file merging 2022-02-28 16:31:33 +01:00
Clément Renault
04b1bbf932
Reintroduce appending sorted entries when possible 2022-02-24 14:50:45 +01:00
bors[bot]
25123af3b8
Merge #436
436: Speed up the word prefix databases computation time r=Kerollmops a=Kerollmops

This PR depends on the fixes done in #431 and must be merged after it.

In this PR we will bring the `WordPrefixPairProximityDocids`, `WordPrefixDocids`, and `WordPrefixPositionDocids` update structures to a new era, a better era, where computing the word prefix pair proximities costs far fewer CPU cycles, an era where this update structure can use the previously computed set of new word docids from the newly indexed batch of documents.

---

The `WordPrefixPairProximityDocids` is an update structure, which means that it is an object that we feed with some parameters and which modifies the LMDB database of an index when asked to. This structure specifically computes the list of word prefix pair proximities, which corresponds to a list of pairs of words associated with a proximity (the distance between both words) where the second word is not a word but a prefix, e.g. `s`, `se`, `a`. Each word prefix pair proximity is associated with the list of document ids that contain the pair of word and prefix at the given proximity.
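
To make the shape of this data concrete, here is a minimal in-memory sketch, not the actual milli implementation. It assumes the word pair proximities are already available as a map from `(word1, word2, proximity)` to document ids, and derives the `(word1, prefix, proximity)` entries from it. The function name `word_prefix_pair_proximities` and the type aliases are hypothetical; the real structure reads from and writes to LMDB databases rather than `BTreeMap`s.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical key of a word pair proximity entry: (first word, second word, proximity).
type WordPairKey = (String, String, u8);
/// Hypothetical key of a word prefix pair proximity entry: (first word, prefix, proximity).
type WordPrefixPairKey = (String, String, u8);
/// Set of document ids (a plain ordered set stands in for the real bitmap type).
type DocIds = BTreeSet<u32>;

/// For every known word pair, find the prefixes that the second word starts
/// with, and merge the document ids under the (first word, prefix, proximity) key.
fn word_prefix_pair_proximities(
    word_pairs: &BTreeMap<WordPairKey, DocIds>,
    prefixes: &[&str],
) -> BTreeMap<WordPrefixPairKey, DocIds> {
    let mut out: BTreeMap<WordPrefixPairKey, DocIds> = BTreeMap::new();
    for ((word1, word2, proximity), docids) in word_pairs {
        for prefix in prefixes {
            if word2.starts_with(*prefix) {
                out.entry((word1.clone(), (*prefix).to_string(), *proximity))
                    .or_default()
                    .extend(docids.iter().copied());
            }
        }
    }
    out
}
```

With the pairs `("big", "sea", 1)` and `("big", "seal", 1)` and the prefix `se`, both sets of document ids end up merged under the single key `("big", "se", 1)`.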

The origin of the performance issue that this struct brings is that it starts its job from the beginning: it clears the LMDB database before rewriting everything from scratch, using the other LMDB databases to achieve that. I hope you understand that this is absolutely not an optimized way of doing things.

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-02-16 15:41:14 +00:00
Clément Renault
ff8d7a810d
Change the behavior of the as_cloneable_grenad by taking a ref 2022-02-16 15:40:08 +01:00