f3r10
fd60a39f1c
Format code
2023-01-31 11:28:05 +01:00
f3r10
d97fb6117e
Extract and index data
2023-01-31 11:28:05 +01:00
ManyTheFish
d1fc42b53a
Use compatibility decomposition normalizer in facets
2023-01-18 15:02:13 +01:00
Loïc Lecrenier
8d0ace2d64
Avoid creating a MatchingWord for words that exceed the length limit
2022-11-28 10:20:13 +01:00
Loïc Lecrenier
ac3baafbe8
Truncate facet values that are too long before indexing them
2022-11-17 11:29:42 +01:00
Loïc Lecrenier
d95d02cb8a
Fix Facet Indexing bugs
...
1. Handle keys with variable length correctly
This fixes https://github.com/meilisearch/meilisearch/issues/3042 and
is easily reproducible with the updated fuzz tests, which now generate
keys with variable lengths.
2. Prevent adding facets to the database if their encoded value does
not satisfy `valid_lmdb_key`.
This fixes an indexing failure when a document had a filterable
attribute containing a value whose length is higher than ~500 bytes.
2022-11-17 11:29:42 +01:00
unvalley
70465aa5ce
Execute cargo fmt
2022-11-04 08:59:58 +09:00
unvalley
3009981d31
Fix clippy errors
...
Add clippy job
Add clippy job to CI
2022-11-04 08:58:14 +09:00
unvalley
c7322f704c
Fix cargo clippy errors
...
Dont apply clippy for tests for now
Fix clippy warnings of filter-parser package
parent 8352febd646ec4bcf56a44161e5c4dce0e55111f
author unvalley <38400669+unvalley@users.noreply.github.com> 1666325847 +0900
committer unvalley <kirohi.code@gmail.com> 1666791316 +0900
Update .github/workflows/rust.yml
Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
Allow clippy lint too_many_argments
Allow clippy lint needless_collect
Allow clippy lint too_many_arguments and type_complexity
Fix for clippy warnings comparison_chains
Fix for clippy warnings vec_init_then_push
Allow clippy lint should_implement_trait
Allow clippy lint drop_non_drop
Fix lifetime clipy warnings in filter-paprser
Execute cargo fmt
Fix clippy remaining warnings
Fix clippy remaining warnings again and allow lint on each place
2022-10-27 01:04:23 +09:00
Loïc Lecrenier
54c0cf93fe
Merge remote-tracking branch 'origin/main' into facet-levels-refactor
2022-10-26 15:13:34 +02:00
Loïc Lecrenier
a034a1e628
Move StrRefCodec and ByteSliceRefCodec to their own files
2022-10-26 13:47:46 +02:00
Loïc Lecrenier
51961e1064
Polish some details
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
b1ab09196c
Remove outdated TODOs
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
9026867d17
Give same interface to bulk and incremental facet indexing types
...
+ cargo fmt, oops, sorry for the bad history :(
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
485a72306d
Refactor facet-related codecs
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
afdf87f6f7
Fix bugs in asc/desc criterion and facet indexing
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
a7201ece04
cargo fmt
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
61252248fb
Fix some facet indexing bugs
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
85824ee203
Try to make facet indexing incremental
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
39a4a0a362
Reintroduce filter range search and facet extractors
2022-10-26 13:46:14 +02:00
Loïc Lecrenier
c3f49f766d
Prepare refactor of facets database
...
Prepare refactor of facets database
2022-10-26 13:46:14 +02:00
Ewan Higgs
6b2fe94192
Fixes for clippy bringing us down to 18 remaining issues.
...
This brings us a step closer to enforcing clippy on each build.
2022-10-25 20:49:02 +02:00
Loïc Lecrenier
d76d0cb1bf
Merge branch 'main' into word-pair-proximity-docids-refactor
2022-10-24 15:23:00 +02:00
Loïc Lecrenier
a7de4f5b85
Don't add swapped word pairs to the word_pair_proximity_docids db
2022-10-18 10:37:34 +02:00
Loïc Lecrenier
bdeb47305e
Change encoding of word_pair_proximity DB to (proximity, word1, word2)
...
Same for word_prefix_pair_proximity
2022-10-18 10:37:34 +02:00
Ewan Higgs
beb987d3d1
Fixing piles of clippy errors.
...
Most of these are calling clone when the struct supports Copy.
Many are using & and &mut on `self` when the function they are called
from already has an immutable or mutable borrow so this isn't needed.
I tried to stay away from actual changes or places where I'd have to
name fresh variables.
2022-10-13 22:02:54 +02:00
msvaljek
762e320c35
Add proximity calculation for the same word
2022-10-07 12:59:12 +02:00
vishalsodani
00c02d00f3
Add missing logging timer to extractors
2022-09-30 22:17:06 +05:30
Loïc Lecrenier
3794962330
Use an unstable algorithm for grenad::Sorter when possible
2022-09-13 14:49:53 +02:00
Kerollmops
fe3973a51c
Make sure that long words are correctly skipped
2022-09-07 15:03:32 +02:00
Loïc Lecrenier
306593144d
Refactor word prefix pair proximity indexation
2022-08-17 11:59:00 +02:00
Loïc Lecrenier
07003704a8
Merge branch 'filter/field-exist'
2022-07-21 14:51:41 +02:00
Loïc Lecrenier
1506683705
Avoid using too much memory when indexing facet-exists-docids
2022-07-19 14:42:35 +02:00
Loïc Lecrenier
aed8c69bcb
Refactor indexation of the "facet-id-exists-docids" database
...
The idea is to directly create a sorted and merged list of bitmaps
in the form of a BTreeMap<FieldId, RoaringBitmap> instead of creating
a grenad::Reader where the keys are field_id and the values are docids.
Then we send that BTreeMap to the thing that handles TypedChunks, which
inserts its content into the database.
2022-07-19 10:07:33 +02:00
Loïc Lecrenier
80b962b4f4
Run cargo fmt
2022-07-19 10:07:33 +02:00
Loïc Lecrenier
30bd4db0fc
Simplify indexing task for facet_exists_docids database
2022-07-19 10:07:33 +02:00
Loïc Lecrenier
392472f4bb
Apply suggestions from code review
...
Co-authored-by: Tamo <tamo@meilisearch.com>
2022-07-19 10:07:33 +02:00
Loïc Lecrenier
453d593ce8
Add a database containing the docids where each field exists
2022-07-19 10:07:33 +02:00
Kerollmops
2eec290424
Check the validity of the latitute and longitude numbers
2022-07-12 15:14:06 +02:00
Kerollmops
d1a4da9812
Generate a real UUIDv4 when ids are auto-generated
2022-07-12 15:14:06 +02:00
Kerollmops
fcfc4caf8c
Move the Object type in the lib.rs file and use it everywhere
2022-07-12 14:55:51 +02:00
Kerollmops
0146175fe6
Introduce the validate_documents_batch function
2022-07-12 14:55:51 +02:00
ManyTheFish
86ac8568e6
Use Charabia in milli
2022-06-02 16:59:11 +02:00
Tamo
0af399a6d7
fix the mixed dataset geosearch indexing bug
2022-05-16 17:37:45 +02:00
Tamo
c55368ddd4
apply code suggestion
...
Co-authored-by: Kerollmops <kero@meilisearch.com>
2022-05-04 14:11:03 +02:00
Tamo
3cb1f6d0a1
improve geosearch error messages
2022-05-02 19:20:47 +02:00
Irevoire
4f3ce6d9cd
nested fields
2022-04-07 16:58:46 +02:00
ad hoc
201fea0fda
limit extract_word_docids memory usage
2022-04-05 14:14:15 +02:00
ad hoc
b85cd4983e
remove field_id_from_position
2022-04-05 09:50:34 +02:00
ad hoc
b7694c34f5
remove println
2022-04-04 21:00:07 +02:00