Commit Graph

98 Commits

Author SHA1 Message Date
Clément Renault
0d4482625a
Make the changes to use heed v0.20-alpha.6 2023-11-23 11:43:58 +01:00
ManyTheFish
d3575fb028 Make into_del_add_obkv parameters more human readable 2023-11-20 16:10:39 +01:00
Louis Dureuil
772964125d
Factor removal of document from DB 2023-11-13 13:51:22 +01:00
Louis Dureuil
264b10ec20
Fixup documentation 2023-11-09 16:23:20 +01:00
Louis Dureuil
3053e01c05
Batch::remove_documents_from_db_no_batch 2023-11-09 14:23:02 +01:00
Louis Dureuil
1ad1fcc8c8
Remove all warnings 2023-11-06 10:31:14 +01:00
ManyTheFish
bf0651f23c Implement iter method on ExternalDocumentsIds 2023-11-02 15:38:00 +01:00
ManyTheFish
5b20e625f3 fix merge 2023-11-02 15:31:37 +01:00
ManyTheFish
bc51d6157a Fix transform reindexing path 2023-11-02 15:26:20 +01:00
ManyTheFish
12323d610e Change the original document sorter key from the internal docid to a concatenation of the internal and the external docid 2023-11-02 15:26:20 +01:00
Clément Renault
4d864f0702
Always sort internal Sorter entries in parallel 2023-11-02 14:47:43 +01:00
Clément Renault
c71b1d33ae
Sort entries using rayon in the transform sorters 2023-11-01 11:07:16 +01:00
Clément Renault
0fc446c62f
Add more timing logs to the Transform 2023-11-01 11:07:16 +01:00
Louis Dureuil
de10f20732
Fix field distribution again 2023-10-30 17:47:22 +01:00
Louis Dureuil
54d07a8da3
Update field distribution taking into account both deletions and additions 2023-10-30 14:47:51 +01:00
Clément Renault
dfab6293c9
Use an LMDB database to store the external documents ids 2023-10-30 11:41:23 +01:00
Louis Dureuil
113527f466
Remove soft-deleted related methods from Index 2023-10-30 11:41:22 +01:00
Louis Dureuil
c6b3c18c85
WIP: Comment out document deletion in other pipelines than update
TODO: fix calls to DELETE route
2023-10-30 11:40:20 +01:00
ManyTheFish
313b16bec2
Support diff indexing on extract_docid_word_positions 2023-10-30 11:24:19 +01:00
ManyTheFish
1dd97578a8
Make the transform struct return diff-based documents obkvs 2023-10-30 11:22:07 +01:00
Tamo
c0f2724c2d get rids of the new introduced error code in favor of an io::Error 2023-10-10 15:12:23 +02:00
Tamo
d772073dfa use a bufreader everytime there is a grenad<file> 2023-10-10 15:00:30 +02:00
Kerollmops
eef95de30e
First iteration on exposing puffin profiling 2023-07-18 17:38:13 +02:00
Tamo
602ad98cb8 improve the way we handle the fsts 2023-05-22 11:15:14 +02:00
Tamo
4391cba6ca
fix the addition + deletion bug 2023-05-17 18:28:57 +02:00
Tamo
895ab2906c apply review suggestions 2023-02-16 18:42:47 +01:00
Tamo
74dcfe9676
Fix a bug when you update a document that was already present in the db, deleted and then inserted again in the same transform 2023-02-14 19:09:40 +01:00
Tamo
1b1703a609
make a small optimization to merge obkvs a little bit faster 2023-02-14 18:32:41 +01:00
Tamo
fb5e4957a6
fix and test the early exit in case a grenad ends with a deletion 2023-02-14 18:23:57 +01:00
Tamo
8de3c9f737
Update milli/src/update/index_documents/transform.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-02-14 17:57:14 +01:00
Tamo
43a19d0709
document the operation enum + the grenads 2023-02-14 17:55:26 +01:00
Tamo
746b31c1ce
makes clippy happy 2023-02-09 12:23:01 +01:00
Tamo
93f130a400
fix all warnings 2023-02-08 20:57:35 +01:00
Tamo
421a9cf05e
provide a new method on the transform to remove documents 2023-02-08 16:06:09 +01:00
Tamo
8f64fba1ce
rewrite the current transform to handle a new byte specifying the kind of operation it's merging 2023-02-08 12:53:38 +01:00
Louis Dureuil
89675e5f15
clippy: Replace seek 0 by rewind 2023-01-31 09:32:40 +01:00
Louis Dureuil
13c95d25aa
Remove uses of UserError::MissingPrimaryKey not related to inference 2022-12-21 15:13:36 +01:00
Loïc Lecrenier
67d8cec209 Fix bug in handling of soft deleted documents when updating settings 2022-12-06 15:09:19 +01:00
Kerollmops
37b3c5c323
Fix transform to use all_documents and ignore soft_deleted documents 2022-11-08 14:23:16 +01:00
unvalley
3009981d31 Fix clippy errors
Add clippy job

Add clippy job to CI
2022-11-04 08:58:14 +09:00
bors[bot]
c8f16530d5
Merge #616
616: Introduce an indexation abortion function when indexing documents r=Kerollmops a=Kerollmops



Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-10-26 11:41:18 +00:00
Kerollmops
6603437cb1
Introduce an indexation abortion function when indexing documents 2022-10-17 17:28:03 +02:00
Ewan Higgs
beb987d3d1 Fixing piles of clippy errors.
Most of these are calling clone when the struct supports Copy.

Many are using & and &mut on `self` when the function they are called
from already has an immutable or mutable borrow so this isn't needed.

I tried to stay away from actual changes or places where I'd have to
name fresh variables.
2022-10-13 22:02:54 +02:00
Loïc Lecrenier
3794962330 Use an unstable algorithm for grenad::Sorter when possible 2022-09-13 14:49:53 +02:00
ManyTheFish
2668f841d1 Fix update indexing 2022-08-17 15:03:37 +02:00
Tamo
7fc35c5586
remove the useless prints 2022-08-02 10:31:22 +02:00
Tamo
f156d7dd3b
Stop reindexing already indexed documents 2022-08-02 10:31:20 +02:00
Loïc Lecrenier
fc9f3f31e7 Change DocumentsBatchReader to access cursor and index at same time
Otherwise it is not possible to iterate over all documents while
using the fields index at the same time.
2022-07-18 16:08:14 +02:00
Loïc Lecrenier
ab1571cdec Simplify Transform::read_documents, enabled by enriched documents reader 2022-07-18 12:45:47 +02:00
Kerollmops
5d149d631f
Remove tests for a function that no more exists 2022-07-12 15:14:06 +02:00