Samyak S Sarnayak
5ab505be33
Fix highlight by replacing num_graphemes_from_bytes
...
num_graphemes_from_bytes has been renamed in the tokenizer to
num_chars_from_bytes.
Highlight now works correctly!
2022-01-17 13:02:55 +05:30
Samyak S Sarnayak
c10f58b7bd
Update tokenizer to v0.2.7
2022-01-17 13:02:00 +05:30
Samyak S Sarnayak
e752bd06f7
Fix matching_words tests to compile successfully
...
The tests still fail due to a bug in https://github.com/meilisearch/tokenizer/pull/59
2022-01-17 11:37:45 +05:30
Samyak S Sarnayak
30247d70cd
Fix search highlight for non-unicode chars
...
The `matching_bytes` function takes a `&Token` now and:
- gets the number of bytes to highlight (unchanged).
- uses `Token.num_graphemes_from_bytes` to get the number of grapheme
clusters to highlight.
In essence, the `matching_bytes` function returns the number of matching
grapheme clusters instead of bytes. Should this function be renamed
then?
Added proper highlighting in the HTTP UI:
- requires dependency on `unicode-segmentation` to extract grapheme
clusters from tokens
- `<mark>` tag is put around only the matched part
- before this change, the entire word was highlighted even if only a
part of it matched
2022-01-17 11:37:44 +05:30
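The byte-to-character conversion described in this commit can be sketched roughly as follows. This is a minimal illustration using only the standard library, so it counts `char`s rather than grapheme clusters (the commit itself relies on `unicode-segmentation` for clusters). The names `chars_in_byte_prefix` and `highlight_prefix` are hypothetical and are not the tokenizer's or the HTTP UI's actual API.

```rust
/// Hypothetical helper: counts how many characters the first `num_bytes`
/// bytes of `word` cover. Returns `None` when `num_bytes` is out of range
/// or falls in the middle of a multi-byte character.
fn chars_in_byte_prefix(word: &str, num_bytes: usize) -> Option<usize> {
    word.get(..num_bytes).map(|prefix| prefix.chars().count())
}

/// Hypothetical helper: wraps only the first `matched_chars` characters of
/// `word` in a `<mark>` tag instead of highlighting the whole word.
/// No HTML escaping is done here; a real UI would need it.
fn highlight_prefix(word: &str, matched_chars: usize) -> String {
    // Byte offset where the matched part ends; the whole word if the
    // match is at least as long as the word.
    let split = word
        .char_indices()
        .nth(matched_chars)
        .map_or(word.len(), |(byte_index, _)| byte_index);
    format!("<mark>{}</mark>{}", &word[..split], &word[split..])
}
```

For example, the word `héllo` with 3 matching bytes covers 2 characters, so only `hé` would be wrapped in `<mark>`.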
Tamo
0605c0ac68
apply review comments
2022-01-13 18:51:08 +01:00
Tamo
b22c80106f
add some settings to the fuzzed milli and use the published version of arbitrary json
2022-01-13 15:35:24 +01:00
Tamo
c94952e25d
update the readme + dependencies
2022-01-12 18:30:11 +01:00
Tamo
e1053989c0
add a fuzzer on milli
2022-01-12 17:57:54 +01:00
Tamo
98a365aaae
store the geopoint in three dimensions
2021-12-14 12:21:24 +01:00
Tamo
d671d6f0f1
remove an unused file
2021-12-13 19:27:34 +01:00
Clément Renault
25faef67d0
Remove the database setup in the filter_depth test
2021-12-09 11:57:53 +01:00
Clément Renault
65519bc04b
Test that empty filters return a None
2021-12-09 11:57:53 +01:00
Clément Renault
ef59762d8e
Prefer returning None instead of the Empty Filter state
2021-12-09 11:57:52 +01:00
Clément Renault
ee856a7a46
Limit the max filter depth to 2000
2021-12-07 17:36:45 +01:00
Clément Renault
32bd9f091f
Detect the filters that are too deep and return an error
2021-12-07 17:20:11 +01:00
Clément Renault
90f49eab6d
Check the filter max depth limit and reject the invalid ones
2021-12-07 16:32:48 +01:00
many
1b3923b5ce
Update all packages to 0.21.0
2021-11-29 12:17:59 +01:00
many
8970246bc4
Sort positions before iterating over them during word pair proximity extraction
2021-11-22 18:16:54 +01:00
Marin Postma
6e977dd8e8
change visibility of DocumentDeletionResult
2021-11-22 15:44:44 +01:00
many
35f9499638
Export tokenizer from milli
2021-11-18 16:57:12 +01:00
many
64ef5869d7
Update tokenizer v0.2.6
2021-11-18 16:56:05 +01:00
Marin Postma
6eb47ab792
remove update_id in UpdateBuilder
2021-11-16 13:07:04 +01:00
Marin Postma
09b4281cff
improve document addition returned meta
2021-11-10 14:08:36 +01:00
Marin Postma
721fc294be
improve document deletion returned meta
...
returns both the remaining number of documents and the number of deleted
documents.
2021-11-10 14:08:18 +01:00
Tamo
f28600031d
Rename the filter_parser crate into filter-parser
...
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-11-09 16:41:10 +01:00
Irevoire
0ea0146e04
implement deref &str on the tokens
2021-11-09 11:34:10 +01:00
Tamo
7483c7513a
fix the filterable fields
2021-11-07 01:52:19 +01:00
Tamo
e5af3ac65c
rename the filter_condition.rs to filter.rs
2021-11-06 16:37:55 +01:00
Tamo
6831c23449
merge with main
2021-11-06 16:34:30 +01:00
Tamo
b249989bef
fix most of the tests
2021-11-06 01:32:12 +01:00
Tamo
27a6a26b4b
makes the parse function part of the filter_parser
2021-11-05 10:46:54 +01:00
Tamo
76d961cc77
implements the last errors
2021-11-04 17:42:06 +01:00
Tamo
8234f9fdf3
recreate most filter error except for the geosearch
2021-11-04 17:24:55 +01:00
Tamo
07a5ffb04c
update http-ui
2021-11-04 15:52:22 +01:00
Tamo
a58bc5bebb
update milli with the new parser_filter
2021-11-04 15:02:36 +01:00
many
743ed9f57f
Bump milli version
2021-11-04 14:04:21 +01:00
many
7b3bac46a0
Change Attribute and Ranking rules errors
2021-11-04 13:19:32 +01:00
many
702589104d
Update version for the next release (v0.20.1)
2021-11-03 14:20:01 +01:00
many
0c0038488c
Change last error messages
2021-11-03 11:24:06 +01:00
Tamo
76a2adb7c3
re-enable the tests in the parser and start the creation of an error type
2021-11-02 17:35:17 +01:00
bors[bot]
5a6d22d4ec
Merge #407
...
407: Update version for the next release (v0.20.0) r=curquiza a=curquiza
Breaking because of #405 and #406
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-28 13:43:48 +00:00
bors[bot]
08ae47e475
Merge #405
...
405: Change some error messages r=ManyTheFish a=ManyTheFish
Co-authored-by: many <maxime@meilisearch.com>
2021-10-28 13:35:55 +00:00
Clémentine Urquizar
056ff13c4d
Update version for the next release (v0.20.0)
2021-10-28 14:52:57 +02:00
many
9f1e0d2a49
Refine asc/desc error messages
2021-10-28 14:47:17 +02:00
many
ed6db19681
Fix PR comments
2021-10-28 11:18:32 +02:00
marin postma
183d3dada7
return document count from builder
2021-10-28 10:33:04 +02:00
many
2be755ce75
Lower error check, already checked in meilisearch
2021-10-27 19:50:41 +02:00
many
3599df77f0
Change some error messages
2021-10-27 19:33:01 +02:00
bors[bot]
d7943fe225
Merge #402
...
402: Optimize document transform r=MarinPostma a=MarinPostma
This PR optimizes the transform of document additions into the obkv format. Instead of accepting any serializable object, we treat JSON and CSV specifically:
- For JSON, we build a serde `Visitor` that transforms the JSON straight into obkv without an intermediate representation.
- For CSV, we write the lines directly into the obkv, applying other optimizations as well.
Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-10-26 09:55:28 +00:00
marin postma
baddd80069
implement review suggestions
2021-10-25 18:29:12 +02:00
marin postma
f9445c1d90
return float parsing error context in csv
2021-10-25 17:27:10 +02:00
bors[bot]
15c29cdd9b
Merge #401
...
401: Update version for the next release (v0.19.0) r=curquiza a=curquiza
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-25 12:49:53 +00:00
Clémentine Urquizar
208903ddde
Revert "Replacing pest with nom "
2021-10-25 11:58:00 +02:00
Clémentine Urquizar
679fe18b17
Update version for the next release (v0.19.0)
2021-10-25 11:52:17 +02:00
marin postma
3fcccc31b5
add document builder example
2021-10-25 10:26:43 +02:00
marin postma
430e9b13d3
add csv builder tests
2021-10-25 10:26:43 +02:00
marin postma
53c79e85f2
document errors
2021-10-25 10:26:43 +02:00
marin postma
2e62925a6e
fix tests
2021-10-25 10:26:42 +02:00
marin postma
0f86d6b28f
implement csv serialization
2021-10-25 10:26:42 +02:00
marin postma
8d70b01714
optimize document deserialization
2021-10-25 10:26:42 +02:00
Tamo
1327807caa
add some error messages
2021-10-22 19:00:33 +02:00
Tamo
c8d03046bf
add a check on the fid in the geosearch
2021-10-22 18:08:18 +02:00
Tamo
3942b3732f
re-implement the geosearch
2021-10-22 18:03:39 +02:00
Tamo
7cd9109e2f
lowercase value extracted from Token
2021-10-22 17:50:15 +02:00
Tamo
e25ca9776f
start updating the exposed function to make other modules happy
2021-10-22 17:23:22 +02:00
Tamo
6c9165b6a8
provide a helper to parse the token without handling the errors
2021-10-22 16:52:13 +02:00
Tamo
efb2f8b325
convert the errors
2021-10-22 16:38:35 +02:00
Tamo
c27870e765
integrate a first version without any error handling
2021-10-22 14:33:18 +02:00
Tamo
01dedde1c9
update some names and move some parser out of the lib.rs
2021-10-22 01:59:38 +02:00
Tamo
c634d43ac5
add a simple test on the filters with an integer
2021-10-21 17:10:27 +02:00
Tamo
6c15f50899
rewrite the parser logic
2021-10-21 16:45:42 +02:00
Tamo
e1d81342cf
add tests on the OR and AND operators
2021-10-21 13:01:25 +02:00
Tamo
423baac08b
fix the tests
2021-10-21 12:45:40 +02:00
Tamo
36281a653f
write all the simple tests
2021-10-21 12:40:11 +02:00
Clémentine Urquizar
f8fe9316c0
Update version for the next release (v0.18.1)
2021-10-21 11:56:14 +02:00
Tamo
661bc21af5
Fix the filter parser
...
And add a bunch of tests on the filter::from_array
2021-10-21 11:45:03 +02:00
Clémentine Urquizar
2209acbfe2
Update version for the next release (v0.18.2)
2021-10-18 13:45:48 +02:00
bors[bot]
59cc59e93e
Merge #358
...
358: Replacing pest with nom r=Kerollmops a=CNLHC
Co-authored-by: 刘瀚骋 <cn_lhc@qq.com>
2021-10-16 20:44:38 +00:00
刘瀚骋
7666e4f34a
follow the suggestions
2021-10-14 21:37:59 +08:00
刘瀚骋
2ea2f7570c
use nightly cargo to format the code
2021-10-14 16:46:13 +08:00
刘瀚骋
e750465e15
check logic for geolocation.
2021-10-14 16:12:00 +08:00
bors[bot]
aa5e099718
Merge #390
...
390: Add helper methods on the settings r=Kerollmops a=irevoire
This would be a good addition to look at the content of a setting without consuming it.
It’s useful for analytics.
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-10-13 20:36:30 +00:00
bors[bot]
c7db4176f3
Merge #384
...
384: Replace memmap with memmap2 r=Kerollmops a=palfrey
[memmap is unmaintained](https://rustsec.org/advisories/RUSTSEC-2020-0077.html) and needs replacing. memmap2 is a drop-in replacement fork that's well maintained. Note that the version numbers got reset on fork, hence the lower values.
Co-authored-by: Tom Parker-Shemilt <palfrey@tevp.net>
2021-10-13 13:47:23 +00:00
Irevoire
a3e7c468cd
add helper methods on the settings
2021-10-13 13:05:07 +02:00
刘瀚骋
cd359cd96e
WIP: extract the error trait bound to new trait.
2021-10-13 18:04:15 +08:00
刘瀚骋
5de5dd80a3
WIP: remove '_nom' suffix/redundant error enum/...
2021-10-13 11:06:15 +08:00
刘瀚骋
2c65781d91
format
2021-10-12 22:20:22 +08:00
bors[bot]
6e3b869e6a
Merge #388
...
388: fix primary key inference r=MarinPostma a=MarinPostma
The primary key was inferred from a hashtable index of the fields. For this reason the order in which the fields were iterated upon was not deterministic, and the primary key was chosen from the first field containing "id".
This fix sorts the index by field_id when inferring the primary key.
Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-10-12 09:25:16 +00:00
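The fix described in #388 can be illustrated with a small sketch. It assumes the fields live in a `HashMap` keyed by a `u16` field id, which is an assumption made for this example. The function name `infer_primary_key` is hypothetical and does not reproduce milli's actual code.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: picks the first field whose name contains "id",
/// visiting the fields in ascending field-id order so that the result no
/// longer depends on the hash map's iteration order.
fn infer_primary_key(fields: &HashMap<u16, String>) -> Option<&str> {
    let mut sorted: Vec<(&u16, &String)> = fields.iter().collect();
    // Sorting by field id is what makes the choice deterministic.
    sorted.sort_by_key(|(field_id, _)| **field_id);
    sorted
        .into_iter()
        .map(|(_, name)| name.as_str())
        .find(|name| name.contains("id"))
}
```

With fields `0 => "title"`, `1 => "uid"` and `2 => "product_id"`, this always selects `uid`, whereas iterating the map directly could return either candidate.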
mpostma
86ead92ed5
infer primary key on sorted fields
2021-10-12 11:15:11 +02:00
mpostma
9a266a531b
test correct primary key inference
2021-10-12 11:08:53 +02:00
many
c5a6075484
Make max_position_per_attributes changeable
2021-10-12 10:10:50 +02:00
many
360c5ff3df
Remove limit of 1000 positions per attribute
...
Instead of using an arbitrary limit we encode the absolute position in a u32
using one strong u16 for the field id and a weak u16 for the relative position in the attribute.
2021-10-12 10:10:50 +02:00
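The encoding described in this commit can be sketched as below. The sketch assumes that "strong" means the high 16 bits and "weak" means the low 16 bits of the `u32`; the function names are illustrative and are not taken from the commit.

```rust
/// Illustrative sketch: packs a field id into the high 16 bits and the
/// relative position inside the attribute into the low 16 bits.
fn absolute_from_relative(field_id: u16, relative: u16) -> u32 {
    (u32::from(field_id) << 16) | u32::from(relative)
}

/// Inverse of `absolute_from_relative`: returns `(field_id, relative)`.
fn relative_from_absolute(absolute: u32) -> (u16, u16) {
    ((absolute >> 16) as u16, (absolute & 0xFFFF) as u16)
}
```

Under this layout, positions within the same field sort together and every `u16` relative position is representable, so no arbitrary cap such as 1000 is needed.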
刘瀚骋
d323e35001
add a test case
2021-10-12 13:30:40 +08:00
刘瀚骋
70f576d5d3
error handling
2021-10-12 13:30:40 +08:00
刘瀚骋
28f9be8d7c
support syntax
2021-10-12 13:30:40 +08:00
刘瀚骋
469d92c569
tweak error handling
2021-10-12 13:30:40 +08:00
刘瀚骋
7a90a101ee
reorganize parser logic
2021-10-12 13:30:40 +08:00
刘瀚骋
f7796edc7e
remove everything about pest
2021-10-12 13:30:40 +08:00
刘瀚骋
ac1df9d9d7
fix typo and remove pest
2021-10-12 13:30:40 +08:00
刘瀚骋
50ad750ec1
enhance error handling
2021-10-12 13:30:40 +08:00