mirror of https://github.com/meilisearch/MeiliSearch synced 2025-01-23 11:47:28 +01:00

1875 Commits

Author SHA1 Message Date
Clément Renault
19b6620a92
Merge pull request from meilisearch/distinct
Implement distinct attribute
2021-04-15 16:33:49 +02:00
Marin Postma
9c4660d3d6
add tests 2021-04-15 16:25:56 +02:00
Marin Postma
75464a1baa
review fixes 2021-04-15 16:25:56 +02:00
Marin Postma
2f73fa55ae
add documentation 2021-04-15 16:25:55 +02:00
Marin Postma
45c45e11dd
implement distinct attribute
distinct can return error

facet distinct on numbers

return distinct error

review fixes

make get_facet_value more generic

fixes
2021-04-15 16:25:55 +02:00
Clément Renault
6e126c96a9
Merge pull request from meilisearch/upd-tokenizer-v0.2.1
Update Tokenizer version to v0.2.1
2021-04-14 19:02:36 +02:00
Clémentine Urquizar
2c5c79d68e
Update Tokenizer version to v0.2.1 2021-04-14 18:54:04 +02:00
Clément Renault
c2df51aa95
Merge pull request from meilisearch/stop-words
Stop words
2021-04-14 17:33:06 +02:00
tamo
dcb00b2e54
test a new implementation of the stop_words 2021-04-12 18:35:33 +02:00
tamo
da036dcc3e
Revert "Integrate the stop_words in the querytree"
This reverts commit 12fb509d8470e6d0c3a424756c9838a1efe306d2.
We revert this commit because it's causing the bug .
The initial algorithm we implemented for the stop_words was:

1. remove the stop_words from the dataset
2. keep the stop_words in the query to see if we can generate new words by
   integrating typos or if the word was a prefix
=> This was causing the bug since, in the case of “The hobbit”, we were
   **always** looking for something starting with “t he” or “th e”
   instead of ignoring the word completely.

For now we are going to fix the bug by completely ignoring the
stop_words in the query.
This could cause another problem where someone mistypes a normal word and
ends up typing a stop_word.

For example, imagine someone searching for the song “Won't he do it”.
If that person misplaces one space and writes “Won' the do it”, then we
will lose a part of the request.

One fix would be to update our query tree to something like this:

---------------------
OR
  OR
    TOLERANT hobbit # the first option is to ignore the stop_word
    AND
      CONSECUTIVE   # the second option is to do as we are doing
        EXACT t     # currently
        EXACT he
      TOLERANT hobbit
---------------------

This would drastically increase the size of our query tree on requests
with a lot of stop_words. For example, think of “The Lord Of The Rings”.

For now, however, we have decided to ignore this problem: we consider
that doing so does not reduce the relevancy of the search too much,
while it improves performance.
2021-04-12 18:35:33 +02:00
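The stop-word handling described in the revert message above can be sketched in Rust. This is a minimal illustration under stated assumptions, not the project's actual query tree: the `Operation` enum and the functions `build_ignoring_stop_words`, `alternative_tree_for_the_hobbit`, and `node_count` are hypothetical names, and splitting the query on whitespace stands in for the real tokenizer.

```rust
use std::collections::BTreeSet;

/// Hypothetical query tree node; variant names follow the tree drawn in the
/// commit message (OR, AND, CONSECUTIVE, EXACT, TOLERANT).
#[derive(Debug, Clone, PartialEq)]
enum Operation {
    Or(Vec<Operation>),
    And(Vec<Operation>),
    Consecutive(Vec<Operation>),
    Exact(String),
    Tolerant(String),
}

/// The fix chosen in the commit: ignore every stop word in the query and keep
/// the remaining words as typo-tolerant terms.
/// Assumption: words are separated by whitespace and compared in lowercase.
/// Returns `None` when the query contains nothing but stop words.
fn build_ignoring_stop_words(query: &str, stop_words: &BTreeSet<String>) -> Option<Operation> {
    let words: Vec<Operation> = query
        .split_whitespace()
        .map(|w| w.to_lowercase())
        .filter(|w| !stop_words.contains(w))
        .map(Operation::Tolerant)
        .collect();
    match words.len() {
        0 => None,
        1 => words.into_iter().next(),
        _ => Some(Operation::And(words)),
    }
}

/// The alternative tree from the commit message, written out by hand for the
/// query "The hobbit": either ignore the stop word, or keep the split form.
fn alternative_tree_for_the_hobbit() -> Operation {
    use Operation::*;
    Or(vec![Or(vec![
        Tolerant("hobbit".into()),
        And(vec![
            Consecutive(vec![Exact("t".into()), Exact("he".into())]),
            Tolerant("hobbit".into()),
        ]),
    ])])
}

/// Counts the nodes of a tree, to compare the size of both strategies.
fn node_count(op: &Operation) -> usize {
    match op {
        Operation::Or(c) | Operation::And(c) | Operation::Consecutive(c) => {
            1 + c.iter().map(node_count).sum::<usize>()
        }
        Operation::Exact(_) | Operation::Tolerant(_) => 1,
    }
}
```

With `the` as the only stop word, “The hobbit” reduces to a single tolerant `hobbit` node, while the hand-written alternative tree needs 8 nodes for the same query, which illustrates the size concern raised in the message.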
Clément Renault
f9eab6e0de
Merge pull request from meilisearch/release-drafter
Add release drafter files
2021-04-12 10:25:52 +02:00
Clémentine Urquizar
6a128d4ec7
Add release drafter files 2021-04-12 10:18:39 +02:00
Clément Renault
5efe67f375
Merge pull request from shekhirin/shekhirin/fix-settings-serde-tests
test(http): fix and refactor settings assert_(ser|de)_tokens
2021-04-11 10:52:38 +02:00
Alexey Shekhirin
3af8fa194c
test(http): combine settings assert_(ser|de)_tokens into 1 test 2021-04-10 12:13:59 +03:00
Clément Renault
0d09c64dde
Merge pull request from shekhirin/shekhirin/setting-enum
refactor(http, update): introduce setting enum
2021-04-09 22:48:58 +02:00
Alexey Shekhirin
84c1dda39d
test(http): setting enum serialize/deserialize 2021-04-08 17:03:40 +03:00
Alexey Shekhirin
dc636d190d
refactor(http, update): introduce setting enum 2021-04-08 17:03:40 +03:00
Clément Renault
2bcdd8844c
Merge pull request from meilisearch/reorganize-criterion
reorganize criterion
2021-04-01 19:50:16 +02:00
tamo
0a4bde1f2f
update the default ordering of the criterion 2021-04-01 19:45:31 +02:00
Clément Renault
ee3f93c029
Merge pull request from shekhirin/index-fields-ids-distribution-cache
feat(index): store fields distribution in index
2021-04-01 18:36:21 +02:00
Alexey Shekhirin
2658c5c545
feat(index): update fields distribution in clear & delete operations
fixes after review

bump the version of the tokenizer

implement a first version of the stop_words

The front must provide a BTreeSet containing the stop words
The stop_words are set to None if an empty Set is provided
add the stop-words in the http-ui interface

Use maplit in the test
and remove all the useless drop(rtxn) at the end of all tests

Integrate the stop_words in the querytree

remove the stop_words from the querytree except if it was a prefix or a typo

more fixes after review
2021-04-01 19:12:35 +03:00
Alexey Shekhirin
27c7ab6e00
feat(index): store fields distribution in index 2021-04-01 18:35:19 +03:00
Clément Renault
67e25f8724
Merge pull request from meilisearch/stop-words
Stop words
2021-04-01 14:02:37 +02:00
tamo
12fb509d84
Integrate the stop_words in the querytree
remove the stop_words from the querytree except if it was a prefix or a typo
2021-04-01 13:57:55 +02:00
tamo
a2f46029c7
implement a first version of the stop_words
The front must provide a BTreeSet containing the stop words
The stop_words are set to None if an empty Set is provided
add the stop-words in the http-ui interface

Use maplit in the test
and remove all the useless drop(rtxn) at the end of all tests
2021-04-01 13:57:55 +02:00
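The rule stated in this commit (the caller provides a `BTreeSet`, and an empty set is stored as `None`) could look roughly like the following. This is only a sketch: `Settings` and `set_stop_words` are hypothetical names used for illustration, not the project's actual types or API.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the index settings (assumption, not the real type).
#[derive(Debug, Default)]
struct Settings {
    /// `None` means no stop words are configured.
    stop_words: Option<BTreeSet<String>>,
}

impl Settings {
    /// Stores the provided stop words, mapping an empty set to `None`
    /// as described in the commit message.
    fn set_stop_words(&mut self, stop_words: BTreeSet<String>) {
        self.stop_words = if stop_words.is_empty() {
            None
        } else {
            Some(stop_words)
        };
    }
}
```

Collapsing the empty set to `None` leaves a single representation for "no stop words", so readers of the setting do not have to handle both `None` and `Some(empty)`.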
tamo
62a8f1d707
bump the version of the tokenizer 2021-04-01 13:49:22 +02:00
Clément Renault
56777af8e4
Merge pull request from shekhirin/index-fields-ids-distribution
feat(index): introduce fields_ids_distribution
2021-03-31 17:53:45 +02:00
Alexey Shekhirin
9205b640a4
feat(index): introduce fields_ids_distribution 2021-03-31 18:44:47 +03:00
Clément Renault
f2a786ecbf
Merge pull request from meilisearch/improve_httpui
add a button to display or hide the facets
2021-03-31 17:07:04 +02:00
tamo
13ce0ebb87
stop requesting the facets if the user has hidden them 2021-03-31 16:27:32 +02:00
tamo
bcc131e866
add a button to display or hide the facets 2021-03-31 16:18:53 +02:00
Clément Renault
529c8f0eb1
Merge pull request from shekhirin/criterion-asc-desc-regex
fix(criterion): compile asc/desc regex only once
2021-03-30 15:18:21 +02:00
Alexey Shekhirin
2cb32edaa9
fix(criterion): compile asc/desc regex only once
use once_cell instead of lazy_static

reorder imports
2021-03-30 16:07:14 +03:00
Clément Renault
5a1d3609a9
Merge pull request from shekhirin/main
feat(search, criteria): const candidates threshold
2021-03-30 14:07:19 +02:00
Alexey Shekhirin
1e3f05db8f
use fixed number of candidates as a threshold 2021-03-30 11:57:10 +03:00
Alexey Shekhirin
a776ec9718
fix division 2021-03-29 19:16:58 +03:00
Alexey Shekhirin
522e79f2e0
feat(search, criteria): introduce a percentage threshold to the asc/desc 2021-03-29 19:08:31 +03:00
Clément Renault
9ad8b74111
Merge pull request from irevoire/pin_tokenizer
select a specific release of the tokenizer instead of using the latest git commit
2021-03-25 22:58:11 +01:00
tamo
73dcdb27f6
select a specific release of the tokenizer instead of using the latest git commit 2021-03-25 15:00:18 +01:00
Clément Renault
6b7cc0022b
Merge pull request from meilisearch/fix-offset
fix broken offset
2021-03-15 22:15:18 +01:00
mpostma
9c27183876
fix broken offset 2021-03-15 20:23:50 +01:00
Clément Renault
25f8789aa5
Merge pull request from meilisearch/update-license
Update LICENSE
2021-03-15 16:26:22 +01:00
Clémentine Urquizar
3455082458
Update LICENSE 2021-03-15 16:15:14 +01:00
Clément Renault
b7b23cd4a8
Merge pull request from meilisearch/index-metadata
add index metadata
2021-03-15 14:20:50 +01:00
mpostma
f0210453a6
add updated at on put primary key 2021-03-15 14:05:48 +01:00
mpostma
615fe095e1
update index updated at on index writes 2021-03-15 14:05:47 +01:00
mpostma
80d0f9c49d
methods to update index time metadata 2021-03-15 14:05:47 +01:00
Clément Renault
c9f9d39b54
Merge pull request from meilisearch/github-ci-use-main
Rename master into main in the Github CI
2021-03-11 20:46:06 +01:00
Kerollmops
0cc3132f5a
Rename master into main in the Github CI 2021-03-11 14:44:47 +01:00
Clément Renault
38b6e8decd
Merge pull request from meilisearch/optimize-words-typo-criteria
Optimize the words criterion
2021-03-10 11:28:46 +01:00