10441 Commits

Author SHA1 Message Date
ManyTheFish
bb7a503e5d Compute prefix databases
We now compute the prefix FST and a prefix delta in the Merger thread.
After all the databases are written, the main thread recomputes the prefix databases from the prefix delta, without needing any grenad temporary file.
2024-10-01 09:57:06 +02:00
F. Levi
eabc14c268 Refactor, handle more cases for phrases 2024-09-30 21:24:41 +03:00
meili-bors[bot]
e78da35287
Merge #4930
4930: Return `UserError::InvalidDocumentId` for primary keys with a length greater than 512 bytes r=curquiza a=flevi29

# Pull Request

## Related issue
Fixes #4843

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: F. Levi <55688616+flevi29@users.noreply.github.com>
2024-09-30 15:55:05 +00:00
Louis Dureuil
64589278ac
Appease *some* of the clippy warnings 2024-09-30 16:08:29 +02:00
ManyTheFish
8df6daf308 Remove fid_wordcount_docids.rs 2024-09-30 11:52:31 +02:00
ManyTheFish
5b552caf42 Fix position in insertions 2024-09-30 11:46:32 +02:00
ManyTheFish
2b51a63418 Remove dead code 2024-09-30 11:42:36 +02:00
Louis Dureuil
3d8024fb2b
write the weighted fields ids map 2024-09-30 11:35:03 +02:00
Louis Dureuil
4b0da0ff24
Fix inversion of field_id and position 2024-09-30 11:34:50 +02:00
Louis Dureuil
079f2b5de0
Format error messages consistently 2024-09-30 11:34:31 +02:00
Timon Jurschitsch
84b4219a4f test: improve delete_index.rs 2024-09-29 10:16:31 +02:00
Timon Jurschitsch
5539a1904a test: improve performance of create_index.rs 2024-09-28 11:05:52 +02:00
F. Levi
00ccf53ffa Merge branch 'main' into change-matches-position-phrase-search 2024-09-27 15:52:05 +03:00
F. Levi
d20a39b959 Refactor find_best_match_interval 2024-09-27 15:44:30 +03:00
meili-bors[bot]
71b364286b
Merge #4957
4957: Update charabia feature flags r=dureuill a=ManyTheFish

# Pull Request

Add charabia's `turkish` feature flag to Meilisearch's default tokenization flags



[All tests pipeline](https://github.com/meilisearch/meilisearch/actions/runs/11030036031)

Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-26 20:19:21 +00:00
meili-bors[bot]
86183e0807
Merge #4960
4960: Update rhai r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4956

A fix has been implemented in https://github.com/rhaiscript/rhai/issues/916

## What does this PR do?
- Use the latest version of rhai containing the fix

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-09-26 15:03:01 +00:00
Tamo
78a4b7949d update rhai to a version that shouldn’t panic 2024-09-26 15:04:03 +02:00
ManyTheFish
960060ebdf Fix fst builder when there is no previous FST 2024-09-25 16:53:00 +02:00
Clément Renault
3d244451df
Reduce the lru key size from 8 to 12 bytes 2024-09-25 16:14:13 +02:00
Clément Renault
5f53935c8a
Fix a bug in the Lru 2024-09-25 16:09:34 +02:00
Clément Renault
29a7623c3f
Fix some logs 2024-09-25 15:57:50 +02:00
Clément Renault
e97041f7d0
Replace the Lru free list by a simple increment 2024-09-25 15:55:52 +02:00
Clément Renault
52d7f3ed1c
Reduce the lru key size from 20 to 8 bytes 2024-09-25 15:37:13 +02:00
Clément Renault
86d5e6d9ff
Use the new Lru 2024-09-25 14:54:56 +02:00
Clément Renault
759b9b1546
Introduce a new custom Lru 2024-09-25 14:49:12 +02:00
ManyTheFish
3f7a500f3b Build prefix fst 2024-09-25 14:36:06 +02:00
ManyTheFish
dc2cb58cf1 use charabia default for all-tokenization 2024-09-25 11:12:30 +02:00
ManyTheFish
e9580fe619 Add turkish normalization 2024-09-25 11:03:17 +02:00
meili-bors[bot]
8205254f4c
Merge #4955
4955: Upgrade "batch failed" log to error level r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4916 


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-25 08:18:44 +00:00
ManyTheFish
974272f2e9 Merge branch 'main' into indexer-edition-2024 2024-09-25 07:41:16 +02:00
Clément Renault
7ad037841f
Move the tracing info to eprintln 2024-09-24 18:21:58 +02:00
Clément Renault
e0c7067355
Expose an IndexedParallelIterator to the index function 2024-09-24 17:24:59 +02:00
meili-bors[bot]
efdc5739d7
Merge #4953
4953: Move the multi arroy index logic to the arroy wrapper r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4948

## What does this PR do?
- Make the `ArroyWrapper` we introduced in the last PR handle all the embeddings for a specific docid itself.


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-09-24 15:02:24 +00:00
Tamo
b31e9bea26 while retrieving the readers on an arroywrapper, stop at the first empty reader 2024-09-24 16:33:17 +02:00
ManyTheFish
6e87332410 Change the way the FST is built 2024-09-24 16:28:31 +02:00
Clément Renault
2d1caf27df
Use eprintln to log 2024-09-24 15:59:50 +02:00
Clément Renault
92678383d6
Update charabia 2024-09-24 15:37:56 +02:00
Clément Renault
7f148c127c
Measure the SmallVec efficacy 2024-09-24 15:32:15 +02:00
Tamo
7f048b9732 early exit in the clear and contains 2024-09-24 15:02:38 +02:00
Tamo
8b4e2c7b17 Remove now unused method 2024-09-24 15:00:25 +02:00
Tamo
645a55317a merge the build and quantize method 2024-09-24 14:54:24 +02:00
meili-bors[bot]
8caf97db86
Merge #4954
4954: Fix bench by adding embedder r=ManyTheFish a=dureuill

Fix benchmark workloads following the breaking change on embedders

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-24 12:53:34 +00:00
Tamo
b8a74e0464 fix comments 2024-09-24 10:59:15 +02:00
Tamo
fd8447c521 fix the del items thing 2024-09-24 10:52:05 +02:00
Tamo
f2d187ba3e rename the index method to embedder_index 2024-09-24 10:39:40 +02:00
Tamo
79d8a7a51a rename the embedder index for clarity 2024-09-24 10:36:28 +02:00
Louis Dureuil
86da0e83fe
Upgrade "batch failed" log to ERROR level 2024-09-24 10:02:53 +02:00
Louis Dureuil
0704fb71e9
Fix bench by adding embedder 2024-09-24 09:56:47 +02:00
Clément Renault
4ce5d3d66d
Do not check before pushing in bitmaps 2024-09-24 09:43:16 +02:00
Tamo
1e4d4e69c4 finish the arroywrapper 2024-09-23 18:56:15 +02:00