Kerollmops
fb79c32430
Compute the new, common, and deleted prefix words FST once
2022-01-27 11:00:18 +01:00
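The entry above (together with the "Retrieve the previous version of the words prefixes FST" commit further down) boils down to diffing the old and new prefix FSTs once. Below is a minimal sketch of how the new, common, and deleted prefix word sets could be derived with the `fst` crate; the function name and signature are illustrative, not milli's actual code.

```rust
use fst::{Set, Streamer};

// Sketch only: split the prefixes of the old and new FSTs into the
// newly added, common, and deleted sets in a single pass over each stream.
fn diff_prefix_fsts(
    old: &Set<Vec<u8>>,
    new: &Set<Vec<u8>>,
) -> (Vec<String>, Vec<String>, Vec<String>) {
    let mut added = Vec::new();
    let mut common = Vec::new();
    let mut deleted = Vec::new();

    // new - old: prefixes that only exist in the new FST.
    let mut stream = new.op().add(old).difference();
    while let Some(prefix) = stream.next() {
        added.push(String::from_utf8_lossy(prefix).into_owned());
    }

    // new ∩ old: prefixes present in both FSTs.
    let mut stream = new.op().add(old).intersection();
    while let Some(prefix) = stream.next() {
        common.push(String::from_utf8_lossy(prefix).into_owned());
    }

    // old - new: prefixes that were removed.
    let mut stream = old.op().add(new).difference();
    while let Some(prefix) = stream.next() {
        deleted.push(String::from_utf8_lossy(prefix).into_owned());
    }

    (added, common, deleted)
}
```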
Clément Renault
51d1e64b23
Remove the now-useless WriteMethod enum
2022-01-27 10:08:35 +01:00
Clément Renault
e9c02173cf
Rework the WordsPrefixPositionDocids update to compute a subset of the database
2022-01-27 10:08:35 +01:00
Clément Renault
dbba5fd461
Create a function to simplify the word prefix pair proximity docids compute
2022-01-27 10:08:35 +01:00
Clément Renault
e760e02737
Fix the computation of the newly added and common prefix pair proximity words
2022-01-27 10:08:35 +01:00
Clément Renault
d59e559317
Fix the computation of the newly added and common prefix words
2022-01-27 10:08:34 +01:00
Clément Renault
2ec8542105
Rework the WordPrefixDocids update to compute a subset of the database
2022-01-27 10:08:34 +01:00
Clément Renault
28692f65be
Rework the WordPrefixDocids update to compute a subset of the database
2022-01-27 10:08:34 +01:00
Clément Renault
5404bc02dd
Move the fst_stream_into_hashset method in the helper methods
2022-01-27 10:06:00 +01:00
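The helper mentioned above drains an FST stream into an owned set. A minimal sketch of the idea, written against a concrete `fst::Set` (milli's real helper may be more generic over streams):

```rust
use std::collections::HashSet;

use fst::{Set, Streamer};

// Sketch of the idea behind `fst_stream_into_hashset`: collect every key of
// a (prefix) FST into an owned HashSet so it can be queried without the FST.
fn fst_set_into_hashset<D: AsRef<[u8]>>(set: &Set<D>) -> HashSet<Vec<u8>> {
    let mut keys = HashSet::new();
    let mut stream = set.stream();
    while let Some(key) = stream.next() {
        keys.insert(key.to_vec());
    }
    keys
}
```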
Clément Renault
c90fa95f93
Only compute the word prefix pairs on the created word pair proximities
2022-01-27 10:06:00 +01:00
Clément Renault
822f67e9ad
Bring the newly created word pair proximity docids
2022-01-27 10:06:00 +01:00
Clément Renault
d28f18658e
Retrieve the previous version of the words prefixes FST
2022-01-27 10:05:59 +01:00
Clément Renault
f9b214f34e
Apply suggestions from code review
...
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2022-01-26 11:28:11 +01:00
Clément Renault
f04cd19886
Introduce a max prefix length parameter to the word prefix pair proximity update
2022-01-25 17:04:23 +01:00
Clément Renault
1514dfa1b7
Introduce a max proximity parameter to the word prefix pair proximity update
2022-01-25 17:04:23 +01:00
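The two entries above make the word prefix pair proximity update tunable. A hypothetical builder-style sketch of such limits; the struct name, method names, and default values are assumptions, not milli's API:

```rust
// Hypothetical limits for the word prefix pair proximity update.
struct WordPrefixPairProximityUpdate {
    max_proximity: u8,
    max_prefix_length: usize,
}

impl WordPrefixPairProximityUpdate {
    fn new() -> Self {
        // Defaults are assumptions used for illustration only.
        Self { max_proximity: 4, max_prefix_length: 2 }
    }

    fn max_proximity(mut self, value: u8) -> Self {
        self.max_proximity = value;
        self
    }

    fn max_prefix_length(mut self, value: usize) -> Self {
        self.max_prefix_length = value;
        self
    }

    // A (prefix, proximity) pair outside these limits is simply not indexed.
    fn within_limits(&self, prefix: &str, proximity: u8) -> bool {
        prefix.chars().count() <= self.max_prefix_length
            && proximity <= self.max_proximity
    }
}
```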
Clément Renault
23ea3ad738
Remove the useless threshold when computing the word prefix pair proximity
2022-01-25 17:04:23 +01:00
Clément Renault
e3c34684c6
Fix a bug where we were skipping most of the prefix pairs
2022-01-25 17:04:23 +01:00
bors[bot]
fd177b63f8
Merge #423
...
423: Remove an unused file r=irevoire a=irevoire
This empty file is not included anywhere
Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-19 14:18:05 +00:00
Marin Postma
0c84a40298
document batch support
...
reusable transform
rework update api
add indexer config
fix tests
review changes
Co-authored-by: Clément Renault <clement@meilisearch.com>
fmt
2022-01-19 12:40:20 +01:00
Tamo
01968d7ca7
ensure we get no documents and no error when filtering on an empty db
2022-01-18 11:40:30 +01:00
bors[bot]
8f4499090b
Merge #433
...
433: fix(filter): Fix two bugs. r=Kerollmops a=irevoire
- Stop lowercasing the field when looking in the field id map
- When a field id does not exist, it means there are currently zero
documents containing this field, thus we return an empty RoaringBitmap
instead of throwing an internal error
Will fix https://github.com/meilisearch/MeiliSearch/issues/2082 once meilisearch is released
Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-17 14:06:53 +00:00
Tamo
d1ac40ea14
fix(filter): Fix two bugs.
...
- Stop lowercasing the field when looking in the field id map
- When a field id does not exist, it means there are currently zero
documents containing this field, thus we return an empty RoaringBitmap
instead of throwing an internal error
2022-01-17 13:51:46 +01:00
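A minimal sketch of the fixed lookup described in this commit and in Merge #433 above; the maps and the function are illustrative stand-ins for milli's internals:

```rust
use std::collections::HashMap;

use roaring::RoaringBitmap;

// Sketch only: the field name is used verbatim (no lowercasing), and an
// unknown field id yields an empty bitmap instead of an internal error.
fn docids_for_field(
    fields_ids_map: &HashMap<String, u16>,
    docids_by_field_id: &HashMap<u16, RoaringBitmap>,
    field: &str,
) -> RoaringBitmap {
    match fields_ids_map.get(field) { // no `.to_lowercase()` on the field
        Some(field_id) => docids_by_field_id
            .get(field_id)
            .cloned()
            .unwrap_or_default(),
        // The field appears in no document: zero matching documents.
        None => RoaringBitmap::new(),
    }
}
```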
Samyak S Sarnayak
2d7607734e
Run cargo fmt on matching_words.rs
2022-01-17 13:04:33 +05:30
Samyak S Sarnayak
5ab505be33
Fix highlight by replacing num_graphemes_from_bytes
...
num_graphemes_from_bytes has been renamed in the tokenizer to
num_chars_from_bytes.
Highlight now works correctly!
2022-01-17 13:02:55 +05:30
Samyak S Sarnayak
e752bd06f7
Fix matching_words tests to compile successfully
...
The tests still fail due to a bug in https://github.com/meilisearch/tokenizer/pull/59
2022-01-17 11:37:45 +05:30
Samyak S Sarnayak
30247d70cd
Fix search highlight for non-unicode chars
...
The `matching_bytes` function takes a `&Token` now and:
- gets the number of bytes to highlight (unchanged).
- uses `Token.num_graphemes_from_bytes` to get the number of grapheme
clusters to highlight.
In essence, the `matching_bytes` function returns the number of matching
grapheme clusters instead of bytes. Should this function be renamed
then?
Added proper highlighting in the HTTP UI:
- requires dependency on `unicode-segmentation` to extract grapheme
clusters from tokens
- `<mark>` tag is put around only the matched part
- before this change, the entire word was highlighted even if only a
part of it matched
2022-01-17 11:37:44 +05:30
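A minimal sketch of the `<mark>`-only-around-the-match behaviour described above, using `unicode-segmentation`; `matching_graphemes` stands in for the grapheme-cluster count returned by the reworked `matching_bytes`:

```rust
use unicode_segmentation::UnicodeSegmentation;

// Sketch only: wrap the first `matching_graphemes` grapheme clusters of a
// word in a <mark> tag and leave the rest of the word untouched.
fn highlight(word: &str, matching_graphemes: usize) -> String {
    let mut matched = String::new();
    let mut rest = String::new();
    for (i, grapheme) in word.graphemes(true).enumerate() {
        if i < matching_graphemes {
            matched.push_str(grapheme);
        } else {
            rest.push_str(grapheme);
        }
    }
    format!("<mark>{}</mark>{}", matched, rest)
}

// highlight("naïveté", 3) == "<mark>naï</mark>veté"
```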
Tamo
98a365aaae
store the geopoint in three dimensions
2021-12-14 12:21:24 +01:00
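A minimal sketch of what "three dimensions" can mean for a geopoint: projecting (lat, lng) onto Cartesian x/y/z coordinates. The radius value and the exact formula milli uses are assumptions here:

```rust
// Sketch only: convert a latitude/longitude pair (in degrees) into
// Cartesian coordinates on a sphere.
fn lat_lng_to_xyz(lat_deg: f64, lng_deg: f64) -> [f64; 3] {
    const EARTH_RADIUS_M: f64 = 6_371_000.0; // mean Earth radius, assumed
    let lat = lat_deg.to_radians();
    let lng = lng_deg.to_radians();
    [
        EARTH_RADIUS_M * lat.cos() * lng.cos(),
        EARTH_RADIUS_M * lat.cos() * lng.sin(),
        EARTH_RADIUS_M * lat.sin(),
    ]
}
```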
Tamo
d671d6f0f1
remove an unused file
2021-12-13 19:27:34 +01:00
Clément Renault
25faef67d0
Remove the database setup in the filter_depth test
2021-12-09 11:57:53 +01:00
Clément Renault
65519bc04b
Test that empty filters return a None
2021-12-09 11:57:53 +01:00
Clément Renault
ef59762d8e
Prefer returning None instead of the Empty Filter state
2021-12-09 11:57:52 +01:00
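A minimal sketch of the behaviour the two entries above converge on; the `Filter` type and constructor shown here are illustrative, not milli's exact API:

```rust
// Illustrative only: an empty or whitespace-only filter string yields None
// instead of a dedicated "Empty" filter state.
struct Filter {
    expression: String,
}

impl Filter {
    fn from_str(input: &str) -> Option<Filter> {
        let trimmed = input.trim();
        if trimmed.is_empty() {
            None
        } else {
            Some(Filter { expression: trimmed.to_string() })
        }
    }
}

// Filter::from_str("").is_none() and Filter::from_str("   ").is_none() both hold.
```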
Clément Renault
ee856a7a46
Limit the max filter depth to 2000
2021-12-07 17:36:45 +01:00
Clément Renault
32bd9f091f
Detect the filters that are too deep and return an error
2021-12-07 17:20:11 +01:00
Clément Renault
90f49eab6d
Check the filter max depth limit and reject the invalid ones
2021-12-07 16:32:48 +01:00
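The three entries above introduce a depth limit on filter expressions. A minimal sketch of such a check with an illustrative AST; the only number taken from the commits is the 2000 limit:

```rust
// Illustrative AST and check: reject filters nested deeper than the limit.
const MAX_FILTER_DEPTH: usize = 2000;

enum FilterAst {
    Condition(String),
    And(Vec<FilterAst>),
    Or(Vec<FilterAst>),
}

fn depth(ast: &FilterAst) -> usize {
    match ast {
        FilterAst::Condition(_) => 1,
        FilterAst::And(children) | FilterAst::Or(children) => {
            1 + children.iter().map(depth).max().unwrap_or(0)
        }
    }
}

fn check_depth(ast: &FilterAst) -> Result<(), String> {
    if depth(ast) > MAX_FILTER_DEPTH {
        Err(format!("filter is too deep (max depth is {})", MAX_FILTER_DEPTH))
    } else {
        Ok(())
    }
}
```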
many
8970246bc4
Sort positions before iterating over them during word pair proximity extraction
2021-11-22 18:16:54 +01:00
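A minimal sketch of the idea in the entry above (names and types are illustrative): sorting both position lists first makes the pairwise proximity extraction visit pairs in a deterministic, increasing order:

```rust
// Illustrative only: compute the proximities between two words of a document
// from their (possibly unsorted) position lists, sorting them first.
fn word_pair_proximities(
    mut positions_a: Vec<u32>,
    mut positions_b: Vec<u32>,
    max_proximity: u32,
) -> Vec<(u32, u32, u32)> {
    positions_a.sort_unstable();
    positions_b.sort_unstable();

    let mut pairs = Vec::new();
    for &a in &positions_a {
        for &b in &positions_b {
            let proximity = a.abs_diff(b);
            if proximity <= max_proximity {
                pairs.push((a, b, proximity));
            }
        }
    }
    pairs
}
```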
Marin Postma
6e977dd8e8
change visibility of DocumentDeletionResult
2021-11-22 15:44:44 +01:00
many
35f9499638
Export tokenizer from milli
2021-11-18 16:57:12 +01:00
Marin Postma
6eb47ab792
remove update_id in UpdateBuilder
2021-11-16 13:07:04 +01:00
Marin Postma
09b4281cff
improve document addition returned meta
2021-11-10 14:08:36 +01:00
Marin Postma
721fc294be
improve document deletion returned meta
...
returns both the remaining number of documents and the number of deleted
documents.
2021-11-10 14:08:18 +01:00
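A minimal sketch of the result described above; a `DocumentDeletionResult` type does appear a few entries above ("change visibility of DocumentDeletionResult"), but the field names here are assumptions:

```rust
// Field names are assumptions; the commit only states that both counts are returned.
pub struct DocumentDeletionResult {
    /// Number of documents removed by this deletion.
    pub deleted_documents: u64,
    /// Number of documents still present in the index afterwards.
    pub remaining_documents: u64,
}
```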
Irevoire
0ea0146e04
implement deref &str on the tokens
2021-11-09 11:34:10 +01:00
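A minimal sketch of the entry above (the `Token` shown is a stand-in, not the tokenizer's real struct): implementing `Deref<Target = str>` lets a token be used wherever a `&str` is expected:

```rust
use std::ops::Deref;

// Stand-in token type; the real one lives in the meilisearch tokenizer.
struct Token {
    word: String,
}

impl Deref for Token {
    type Target = str;

    fn deref(&self) -> &str {
        &self.word
    }
}

// With this impl, `token.len()` or `&*token` operate directly on the text.
```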
Tamo
7483c7513a
fix the filterable fields
2021-11-07 01:52:19 +01:00
Tamo
e5af3ac65c
rename the filter_condition.rs to filter.rs
2021-11-06 16:37:55 +01:00
Tamo
6831c23449
merge with main
2021-11-06 16:34:30 +01:00
Tamo
b249989bef
fix most of the tests
2021-11-06 01:32:12 +01:00
Tamo
27a6a26b4b
makes the parse function part of the filter_parser
2021-11-05 10:46:54 +01:00
Tamo
76d961cc77
implements the last errors
2021-11-04 17:42:06 +01:00
Tamo
8234f9fdf3
recreate most filter errors except for the geosearch
2021-11-04 17:24:55 +01:00
Tamo
07a5ffb04c
update http-ui
2021-11-04 15:52:22 +01:00
Tamo
a58bc5bebb
update milli with the new parser_filter
2021-11-04 15:02:36 +01:00