Commit Graph

414 Commits

Author SHA1 Message Date
Clément Renault
f544cfa444
Remove tasks and content file on the s3 2023-09-12 15:19:45 +02:00
Kerollmops
a53a0fdb77
Store content files into the S3 2023-09-11 18:17:22 +02:00
Clément Renault
719fdd701b
Fix a crash when the tasks path is unknown 2023-09-07 11:31:18 +02:00
Kerollmops
01c13c98ac
Mastering minio 2023-09-06 17:54:21 +02:00
Tamo
5b89276fcc starts using s3 2023-09-05 19:25:09 +02:00
Kerollmops
41697c4d65
Introduce the zk-tasks folder 2023-09-04 18:24:34 +02:00
Kerollmops
7d85753573
Make the snapshot download work 2023-09-04 17:38:56 +02:00
Kerollmops
76657af1f9
Add the options into the IndexScheduler 2023-09-04 16:38:05 +02:00
Tamo
966cbdab69 make the tests compile again 2023-09-04 15:39:54 +02:00
Clément Renault
0c68b9ed4c
WIP making the final snapshot swap 2023-08-31 15:56:42 +02:00
Clément Renault
d7233ecdb8
Make things compile again 2023-08-31 14:55:14 +02:00
Clément Renault
95a011af13
Wrap the IndexScheduler fields into an inner struct 2023-08-31 10:36:33 +02:00
Clément Renault
e257710961
WIP fix the tests 2023-08-30 18:03:24 +02:00
Clément Renault
8c3ad57ef9
React to changes towards the cluster members 2023-08-30 17:40:12 +02:00
Clément Renault
2d1434da81
Keep the ZK flow when enqueuing tasks 2023-08-30 17:15:15 +02:00
Clément Renault
c488a4a351
Fixup a lot of small issues on the ZK config 2023-08-30 16:42:55 +02:00
Kerollmops
0c7d7c68bc
WIP moving to the sync zookeeper API 2023-08-30 15:06:12 +02:00
Tamo
854745c670 wip: starts working on importing the snapshots 2023-08-16 18:41:05 +02:00
Tamo
777eebb759 starts creating snapshot, the import is still missing 2023-08-10 15:00:25 +02:00
Tamo
61ccfaf9bc wake up after registering a task 2023-08-10 09:39:39 +02:00
Tamo
f0c4d36ff7 implement the deletion of tasks after processing a batch
add a lot of comments and logs
2023-08-10 09:36:43 +02:00
Tamo
8c20d6e2fe fix the leader election 2023-08-09 17:23:13 +02:00
ManyTheFish
8e437ed76c Start leader election and task processing (WIP) 2023-08-09 16:52:38 +02:00
Tamo
1191ec5939 fix the register task watcher 2023-08-08 13:18:55 +02:00
Tamo
0d20d08daf fix a few warnings 2023-08-08 11:39:48 +02:00
ManyTheFish
b66bf049b5 Create a task on zookeeper side when a task is created locally 2023-08-07 17:02:51 +02:00
ManyTheFish
b45c36cd71 Merge branch 'main' into tmp-release-v1.3.0 2023-08-01 15:05:17 +02:00
Kerollmops
eef95de30e
First iteration on exposing puffin profiling 2023-07-18 17:38:13 +02:00
Clément Renault
22762808ab
Fix the tests 2023-07-06 12:13:29 +02:00
Clément Renault
86b834c9e4
Display the total number of tasks in the tasks route 2023-07-06 10:05:18 +02:00
meili-bors[bot]
aae099e330
Merge #3851
3851: Expose lastUpdate and isIndexing in /stats endpoint r=dureuill a=gentcys

# Pull Request

## Related issue
Fixes #3843

## What does this PR do?
- expose lastUpdate in `/stats` endpoint
- expose isIndexing in `/stats` endpoint
- add a method `is_task_processing` in index-scheduler/src/lib.rs.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Cong Chen <cong.chen@ocrlabs.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-07-03 13:41:04 +00:00
ManyTheFish
71500a4e15 Update tests 2023-07-03 11:20:43 +02:00
Louis Dureuil
324d448236
Format let-else ❤️ 🎉 2023-07-03 10:20:28 +02:00
Cong Chen
9859e65d2f fix tests 2023-07-01 09:32:50 +08:00
Cong Chen
3bdf01bc1c Fix failed test 2023-06-30 17:39:23 +08:00
Cong Chen
a5a31667b0 fix converse result of is_task_processing() 2023-06-30 11:28:18 +08:00
Cong Chen
e3fc7112bc use RoaringBitmap::is_empty instead 2023-06-29 11:46:47 +08:00
Kerollmops
816d7ed174
Update the Vector Store product feature link 2023-06-27 12:32:42 +02:00
Louis Dureuil
13e9b4c2e5
Add dump support 2023-06-26 16:29:43 +02:00
Louis Dureuil
072d81843f
Persistently save to DB the status of experimental features 2023-06-26 16:29:43 +02:00
Cong Chen
6d4981ec25 Expose lastUpdate and isIndexing in /stats endpoint 2023-06-23 07:24:25 +08:00
meili-bors[bot]
040b5a5b6f
Merge #3842
3842: fix some typos r=dureuill a=cuishuang

# Pull Request

## Related issue
Fixes #<issue_number>

## What does this PR do?
- fix some typos

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: cui fliter <imcusg@gmail.com>
2023-06-22 18:01:10 +00:00
cui fliter
530a3e2df3 fix some typos
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-06-22 21:59:00 +08:00
meili-bors[bot]
45636d315c
Merge #3670
3670: Fix addition deletion bug r=irevoire a=irevoire

The first commit of this PR is a revert of https://github.com/meilisearch/meilisearch/pull/3667. It re-enables the auto-batching of addition and deletion tasks. No new changes have been introduced outside of `milli`, so all the changes you see on the autobatcher have already been reviewed.

It fixes https://github.com/meilisearch/meilisearch/issues/3440.

### What was happening?

The issue was that the `external_documents_ids` generated in the `transform` were used in a way that wasn't compatible with the deletion of documents.
Instead of cleanly merging the external document IDs from the DB with the ones returned by the transform and writing the result to disk, we were relying on tricks with the soft-deleted documents to avoid writing the fst to disk as much as possible.
The new algorithm may be a bit slower, but it is far more straightforward and behaves the same whether or not soft deletion was used. Here is a list of the changes introduced:
1. We now make a clear distinction between the `new_external_documents_ids` coming from the transform, which are only held in RAM, and the `external_documents_ids` coming from the DB.
2. The `new_external_documents_ids` (coming out of the transform) are now represented as an `fst`. We no longer need to struggle with the hard/soft distinction plus the soft_deleted, which makes this easier to understand.
3. When indexing documents, we merge the `external_documents_ids` coming from the DB and the `new_external_documents_ids` coming from the transform.
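
As a rough illustration of step 3, the merge can be sketched with plain ordered maps. This is not the code from `milli`: the real structures are `fst`-based, and the function name `merge_external_documents_ids`, the `DocumentId` alias, and the rule that an entry from the transform overrides the DB entry on conflict are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an internal document id.
type DocumentId = u32;

/// Merge the mapping read from the DB with the one produced by the transform.
/// Assumption for this sketch: on a key conflict, the entry from the
/// transform (the newer one) replaces the entry from the DB.
fn merge_external_documents_ids(
    external_documents_ids: &BTreeMap<String, DocumentId>,
    new_external_documents_ids: &BTreeMap<String, DocumentId>,
) -> BTreeMap<String, DocumentId> {
    // Start from what the DB already knows.
    let mut merged = external_documents_ids.clone();
    // Layer the in-RAM entries from the transform on top.
    for (external_id, internal_id) in new_external_documents_ids {
        merged.insert(external_id.clone(), *internal_id);
    }
    merged
}
```

The merged map would then be the single value written back to disk, which is the property the description relies on: one code path regardless of whether soft deletion was involved.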

### Other things introduced in this PR

Since we constantly have to write small, very specialized fuzzers for this kind of bug, we decided to push the one used to reproduce this bug.
It's not perfect, but it's easy to improve in the future.
It'll also run for as long as possible on every merge on the main branch.

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Loïc Lecrenier <loic.lecrenier@icloud.com>
2023-06-19 09:09:30 +00:00
meili-bors[bot]
c1e3cc04b0
Merge #3811
3811: Bring back changes from `release-v1.2.0` to `main` r=Kerollmops a=curquiza



Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Filip Bachul <filipbachul@gmail.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-06-06 13:10:24 +00:00
Tamo
4a3405afec
comment the stats method 2023-06-06 12:59:58 +02:00
Tamo
3cfd653db1
Apply suggestions from code review
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-06-06 11:38:41 +02:00
Tamo
2acc3ec5ee
fix the type of the document deletion by filter tasks 2023-05-30 15:18:52 +02:00
Tamo
c9b65677bf
return the on disk size actually used by meilisearch 2023-05-25 18:30:30 +02:00
Tamo
c433bdd1cd add a view for the task queue in the metrics 2023-05-25 12:58:13 +02:00