MeiliSearch

mirror of https://github.com/meilisearch/MeiliSearch synced 2025-06-24 15:38:40 +02:00

Author	SHA1	Message	Date
Tamo	ed3dfbe729	add error codes and tests	2023-05-04 15:34:08 +02:00
meili-bors[bot]	a95128df6b	Merge #3550 3550: Delete documents by filter r=irevoire a=dureuill # Prototype `prototype-delete-by-filter-0` Usage: A new route is available under `POST /indexes/{index_uid}/documents/delete` that allows you to delete your documents by filter. The expected payload looks like this: ```json { "filter": "doggo = bernese" } ``` It'll then enqueue a task in your task queue that'll delete all the documents matching this filter once it's processed. Here is an example of the associated details: ```json "details": { "deletedDocuments": 53, "originalFilter": "\"doggo = bernese\"" } ``` ---------- # Pull Request ## Related issue Related to https://github.com/meilisearch/meilisearch/issues/3477 ## What does this PR do? ### User standpoint - Modifies the `/indexes/{:indexUid}/documents/delete-batch` route to accept either the existing array of document ids, or a JSON object with a `filter` field representing a filter to apply. If the latter variant is used, any document matching the filter will be deleted. ### Implementation standpoint - (processing time version) Adds a new BatchKind that is not autobatchable and that performs the delete by filter - Reuses the `documentDeletion` task with a new `originalFilter` detail that replaces the `providedIds` detail. 
## Example <details> <summary>Sample request, response and task result</summary> Request: ``` curl \ -X POST 'http://localhost:7700/indexes/index-10/documents/delete-batch' \ -H 'Content-Type: application/json' \ --data-binary '{ "filter" : "mass = 600"}' ``` Response: ``` { "taskUid": 3902, "indexUid": "index-10", "status": "enqueued", "type": "documentDeletion", "enqueuedAt": "2023-02-28T20:50:31.667502Z" } ``` Task log: ```json { "uid": 3906, "indexUid": "index-12", "status": "succeeded", "type": "documentDeletion", "canceledBy": null, "details": { "deletedDocuments": 3, "originalFilter": "\"mass = 600\"" }, "error": null, "duration": "PT0.001819S", "enqueuedAt": "2023-03-07T08:57:20.11387Z", "startedAt": "2023-03-07T08:57:20.115895Z", "finishedAt": "2023-03-07T08:57:20.117714Z" } ``` </details> ## Draft status - [ ] Error handling - [ ] Analytics - [ ] Do we want to reuse the `delete-batch` route in this way, or create a new route instead? - [ ] Should the filter be applied at request time or when the deletion task is processed? - The first commit in this PR applies the filter at request time, meaning that even if a document is modified in a way that no longer matches the filter in a later update, it will be deleted as long as the deletion task is processed after that update. - The other commits in this PR apply the filter only when the asynchronous deletion task is processed, meaning that documents that match the filter at processing time are deleted even if they didn't match the filter at request time. - [ ] If keeping the filter at request time, find a more elegant way to recover the user document ids from the internal document ids. The current way implemented in the first commit of this PR involves getting all the documents matching the filter, looking for the value of their primary key, and turning it into a string by copy-pasting routines found in milli... 
- [ ] Security consideration, if any - [ ] Fix the tests (but waiting until product questions are resolved) - [ ] Add delete by filter specific tests Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: Tamo <tamo@meilisearch.com>	2023-05-04 10:44:41 +00:00
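A minimal sketch of how a client might build the delete-by-filter request described in #3550. It only constructs the request and does not send it. The helper name `build_delete_by_filter_request` and the base URL are assumptions for illustration. The `/documents/delete` path follows the prototype usage text, while the PR body also discusses reusing `delete-batch`, so check which route your version exposes.

```python
import json
import urllib.parse
import urllib.request


def build_delete_by_filter_request(base_url, index_uid, filter_expr):
    """Build (but do not send) a POST request carrying a filter payload.

    Hypothetical helper: the route below is taken from the prototype usage
    text and may differ in the version you run.
    """
    path = "/indexes/{}/documents/delete".format(
        urllib.parse.quote(index_uid, safe="")
    )
    # The payload is a JSON object with a single `filter` field.
    body = json.dumps({"filter": filter_expr}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_delete_by_filter_request(
        "http://localhost:7700", "index-10", "mass = 600"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it. Per the PR text, the response is an enqueued `documentDeletion` task rather than an immediate deletion.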
Tamo	f9ddd32545	implement the auto-deletion of tasks	2023-05-04 00:06:49 +02:00
Tamo	d2d2bacaf2	add a test on the complex filter	2023-05-03 20:07:08 +02:00
Louis Dureuil	84e7bd9342	Fix test after rebase on filter additions	2023-05-03 17:51:28 +02:00
Louis Dureuil	2b74e4d116	Fix test	2023-05-03 17:41:50 +02:00
Tamo	b5fe0b2b07	fix the details	2023-05-03 17:41:50 +02:00
Tamo	0548ab9038	create and use the error code	2023-05-03 17:41:50 +02:00
Tamo	143acb9cdc	update the tests	2023-05-03 17:41:49 +02:00
Tamo	4b92f1b269	wip	2023-05-03 17:41:49 +02:00
Tamo	c12a1cd956	test all the error messages	2023-05-03 17:41:49 +02:00
Tamo	8af8aa5a33	add a test	2023-05-03 17:41:49 +02:00
bors[bot]	414b3fae89	Merge #3571 3571: Introduce two filters to select documents with `null` and empty fields r=irevoire a=Kerollmops # Pull Request ## Related issue This PR implements the `X IS NULL`, `X IS NOT NULL`, `X IS EMPTY`, `X IS NOT EMPTY` filters that [this comment](https://github.com/meilisearch/product/discussions/539#discussioncomment-5115884) is describing in a very detailed manner. ## What does this PR do? ### `IS NULL` and `IS NOT NULL` This PR will be exposed as a prototype for now. Below is the copy/pasted version of a spec that defines this filter. - `IS NULL` matches fields that `EXISTS` AND `= IS NULL` - `IS NOT NULL` matches fields that `NOT EXISTS` OR `!= IS NULL` 1. `{"name": "A", "price": null}` 2. `{"name": "A", "price": 10}` 3. `{"name": "A"}` `price IS NULL` would match 1 `price IS NOT NULL` or `NOT price IS NULL` would match 2,3 `price EXISTS` would match 1, 2 `price NOT EXISTS` or `NOT price EXISTS` would match 3 common query : `(price EXISTS) AND (price IS NOT NULL)` would match 2 ### `IS EMPTY` and `IS NOT EMPTY` - `IS EMPTY` matches Array `[]`, Object `{}`, or String `""` fields that `EXISTS` and are empty - `IS NOT EMPTY` matches fields that `NOT EXISTS` OR are not empty. 1. `{"name": "A", "tags": null}` 2. `{"name": "A", "tags": [null]}` 3. `{"name": "A", "tags": []}` 4. `{"name": "A", "tags": ["hello","world"]}` 5. `{"name": "A", "tags": [""]}` 6. `{"name": "A"}` 7. `{"name": "A", "tags": {}}` 8. `{"name": "A", "tags": {"t1":"v1"}}` 9. `{"name": "A", "tags": {"t1":""}}` 10. `{"name": "A", "tags": ""}` `tags IS EMPTY` would match 3,7,10 `tags IS NOT EMPTY` or `NOT tags IS EMPTY` would match 1,2,4,5,6,8,9 `tags IS NULL` would match 1 `tags IS NOT NULL` or `NOT tags IS NULL` would match 2,3,4,5,6,7,8,9,10 `tags EXISTS` would match 1,2,3,4,5,7,8,9,10 `tags NOT EXISTS` or `NOT tags EXISTS` would match 6 common query : `(tags EXISTS) AND (tags IS NOT NULL) AND (tags IS NOT EMPTY)` would match 2,4,5,8,9 ## What should the reviewer do? 
- Check that I tested the filters - Check that I deleted the ids of the documents when deleting documents Co-authored-by: Clément Renault <clement@meilisearch.com> Co-authored-by: Kerollmops <clement@meilisearch.com>	2023-04-27 13:14:00 +00:00
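The filter semantics quoted in #3571 can be checked with a small model. This is a sketch that treats documents as plain Python dicts and is not how the engine evaluates filters. The function names `exists`, `is_null`, `is_empty`, and `matching` are made up for illustration.

```python
def exists(doc, field):
    """`field EXISTS`: the key is present, whatever its value."""
    return field in doc


def is_null(doc, field):
    """`field IS NULL`: the key is present and its value is null."""
    return field in doc and doc[field] is None


def is_empty(doc, field):
    """`field IS EMPTY`: the key is present and holds [], {} or ""."""
    if field not in doc:
        return False
    value = doc[field]
    return isinstance(value, (list, dict, str)) and len(value) == 0


def matching(docs, predicate):
    """Return the 1-based positions of the documents that satisfy predicate."""
    return [i for i, doc in enumerate(docs, start=1) if predicate(doc)]


# The ten `tags` documents from the spec, in the same order.
DOCS = [
    {"name": "A", "tags": None},
    {"name": "A", "tags": [None]},
    {"name": "A", "tags": []},
    {"name": "A", "tags": ["hello", "world"]},
    {"name": "A", "tags": [""]},
    {"name": "A"},
    {"name": "A", "tags": {}},
    {"name": "A", "tags": {"t1": "v1"}},
    {"name": "A", "tags": {"t1": ""}},
    {"name": "A", "tags": ""},
]

if __name__ == "__main__":
    print(matching(DOCS, lambda d: is_empty(d, "tags")))      # [3, 7, 10]
    print(matching(DOCS, lambda d: is_null(d, "tags")))       # [1]
    print(matching(DOCS, lambda d: not exists(d, "tags")))    # [6]
```

In this model the negated forms are plain boolean negation, which is why a missing field satisfies both `IS NOT NULL` and `IS NOT EMPTY`, as the spec states.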
Clément Renault	cfd1b2cc97	Fix the clippy warnings	2023-04-25 16:40:32 +02:00
Kerollmops	2d8060df80	Fix the tests	2023-04-24 17:50:57 +02:00
bors[bot]	654a3a9e19	Merge #3688 3688: Following release v1.1.1: bring back changes into `main` r=curquiza a=curquiza `@meilisearch/engine-team` ensure the changes we bring to `main` are the ones you want Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com> Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: dureuill <dureuill@users.noreply.github.com>	2023-04-24 11:38:23 +00:00
Louis Dureuil	c2f4b6ced0	Test: await for the deletion task to complete before trying to add another task	2023-04-13 18:22:42 +02:00
Louis Dureuil	1e6cbcaf12	Update test comment Co-authored-by: Tamo <tamo@meilisearch.com>	2023-04-13 17:27:12 +02:00
Louis Dureuil	066c6bd875	test task db full now checks that a task can be successfully added after deleting tasks	2023-04-13 17:20:06 +02:00
Tamo	b3f60ee805	try to fix the ci	2023-04-13 10:18:58 +02:00
Tamo	b4fabce36d	update the error message + update the task db size to 20GiB with a limit at 50%	2023-04-12 18:54:11 +02:00
Tamo	9350a7b017	improve the test and try to understand the issue happening on windows	2023-04-12 18:54:11 +02:00
Tamo	be69ab320d	stops receiving tasks once the task queue is full	2023-04-12 18:54:11 +02:00
Tamo	4d308d5237	Improve the health route by ensuring lmdb is not down, and slightly refactor the auth controller.	2023-04-06 15:31:42 +02:00
bors[bot]	b4c01581cd	Merge #3641 3641: Bring back changes from `release v1.1.0` into `main` after v1.1.0 release r=curquiza a=curquiza Replace https://github.com/meilisearch/meilisearch/pull/3637 since we don't want to pull commits from `main` into `release-v1.1.0` when fixing git conflicts Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com> Co-authored-by: Charlotte Vermandel <charlottevermandel@gmail.com> Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: curquiza <clementine@meilisearch.com> Co-authored-by: Clément Renault <clement@meilisearch.com> Co-authored-by: Many the fish <many@meilisearch.com>	2023-04-06 12:37:54 +00:00
Tamo	67fd3b08ef	wait until all tasks are processed before running our dump integration tests	2023-04-05 18:35:43 +02:00
filip	f267bed352	remove an unnecessary comment Co-authored-by: Tamo <irevoire@protonmail.ch>	2023-04-05 13:44:55 +02:00
Tamo	597d57bf1d	Merge branch 'main' into bring-back-changes-v1.1.0	2023-04-05 11:32:14 +02:00
Filip Bachul	0fba08cd72	fmt	2023-04-03 20:18:26 +02:00
Filip Bachul	189d4c3b70	add geoPoint integration tests	2023-04-03 20:18:26 +02:00
Filip Bachul	52b4090286	update integration tests	2023-04-03 20:18:26 +02:00
ManyTheFish	6592746337	Fix other unrelated tests	2023-03-29 14:36:17 +02:00
ManyTheFish	b744f33530	Add test	2023-03-29 12:01:52 +02:00
Tamo	a2b151e877	ensure that the task queue is correctly imported reduce the size of the snapshots file	2023-03-21 14:41:46 +01:00
Clément Renault	ea016d97af	Implementing an IS EMPTY filter	2023-03-15 14:12:34 +01:00
Clément Renault	fa2ea4a379	Update the test to accept the new IS syntax	2023-03-14 10:31:27 +01:00
Clément Renault	6da54d0cb6	Add a test to fix a diacritic issue	2023-03-09 14:57:38 +01:00
Tamo	c5f22be6e1	add boolean support for csv documents	2023-03-09 11:12:49 +01:00
Tamo	d34faa8f9c	put back the sleep as it was and fix the from	2023-03-06 18:09:09 +01:00
Tamo	e5d0bef6d8	update a comment	2023-03-06 17:04:24 +01:00
Tamo	e704728ee7	fix the snapshots permissions on unix system	2023-03-06 16:28:40 +01:00
bors[bot]	4f1ccbc495	Merge #3525 3525: Fix phrase search containing stop words r=ManyTheFish a=ManyTheFish # Summary A search with a phrase containing only stop words was returning an HTTP 500 error. This PR drops phrases that contain only stop words before the search starts, so a query with a phrase containing only stop words now behaves like a placeholder search. fixes https://github.com/meilisearch/meilisearch/issues/3521 related v1.0.2 PR on milli: https://github.com/meilisearch/milli/pull/779 Co-authored-by: ManyTheFish <many@meilisearch.com>	2023-03-02 10:55:37 +00:00
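The behaviour described in #3525 can be sketched as a pre-processing step. This is an illustration only, assuming phrases are already tokenized into lists of words. The names `drop_stop_word_phrases` and `is_placeholder` are hypothetical and do not come from milli.

```python
def drop_stop_word_phrases(phrases, stop_words):
    """Keep only the phrases that contain at least one non-stop word."""
    stop = {word.lower() for word in stop_words}
    return [
        phrase
        for phrase in phrases
        if any(word.lower() not in stop for word in phrase)
    ]


def is_placeholder(phrases, stop_words):
    """A query with no usable phrase left is treated like an empty query."""
    return not drop_stop_word_phrases(phrases, stop_words)


if __name__ == "__main__":
    stop_words = {"the", "of"}
    print(drop_stop_word_phrases([["the", "of"], ["the", "cat"]], stop_words))
    print(is_placeholder([["the", "of"]], stop_words))
```

A phrase such as `"the cat"` survives because it still contains a searchable word. Only phrases made entirely of stop words are removed.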
Louis Dureuil	d4d4702f1b	Rephrase hint message	2023-02-27 13:46:16 +01:00
Tamo	7ae10abb6b	fix the auth tests	2023-02-23 17:27:42 +01:00
bors[bot]	89ac1015f3	Merge #3524 3524: Update the metrics route r=irevoire a=irevoire Fixes #3523 Make the metrics available by default without a feature flag. + Rename the cli-flag to `experimental-enable-metrics`. Co-authored-by: Tamo <tamo@meilisearch.com>	2023-02-23 15:11:10 +00:00
bors[bot]	ca25904c26	Merge #3331 3331: Limit the number of concurrently opened indexes r=dureuill a=dureuill # Pull Request ## Related issue Relevant to #1841, fixes #3382 ## What does this PR do? ### User standpoint - Limit the number of concurrently opened indexes (currently, the number of indexes that can be concurrently opened is computed at startup) - When too many indexes are opened, the least recently used one is closed and its virtual memory released. - This allows a user to have an arbitrary number of indexes of an arbitrary size ### Implementation standpoint - Added an LRU cache map in `index-scheduler::lru`. A more complete implementation (e.g. with helper functions not used here) is available but would better fit a dedicated crate. - Use the LRU cache map in the `IndexScheduler`. To simplify the lifecycle of indexes, they are never removed from the cache when they are in the middle of a resize or delete operation. To achieve this, an intermediate `Vec` stores the UUIDs of the indexes that are in the middle of such an operation. - Upon creating the index scheduler object, compute the total virtual memory that is addressable by using a dichotomic search on the max size of an index. Use this as a base to compute the number of indexes that can be open with 2TiB per index. If the virtual memory address space is lower than 2TiB, then only allow for 1 index of a fraction of that size. Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2023-02-23 14:20:52 +00:00
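The eviction policy described in #3331 can be illustrated with a small LRU map. This is a Python sketch of the general technique, not the Rust code in `index-scheduler::lru`, and the class and method names are assumptions. It does not model the special handling of indexes that are in the middle of a resize or delete operation.

```python
from collections import OrderedDict


class LruMap:
    """Bounded map that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest entry first

    def get(self, key):
        """Return the value for key and mark it as most recently used."""
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)
        return self._entries[key]

    def insert(self, key, value):
        """Insert or refresh an entry; return the evicted (key, value) or None."""
        if key in self._entries:
            self._entries[key] = value
            self._entries.move_to_end(key)
            return None
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            # Pop from the front: the entry unused for the longest time.
            return self._entries.popitem(last=False)
        return None


if __name__ == "__main__":
    opened = LruMap(capacity=2)
    opened.insert("movies", "handle-1")
    opened.insert("books", "handle-2")
    opened.get("movies")                       # "movies" is now the freshest
    print(opened.insert("songs", "handle-3"))  # evicts the "books" entry
```

Returning the evicted pair lets the caller close the index and release its virtual memory, which is the user-visible effect the PR describes.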
Tamo	8a1b1a95f3	comment the right of the metrics	2023-02-23 13:59:01 +01:00
Tamo	88a18677d0	rename the metrics cli flag	2023-02-23 12:26:22 +01:00
Tamo	68e30214ca	remove the feature flag and reorganize the module slightly	2023-02-23 12:26:21 +01:00
bors[bot]	b985b96e4e	Merge #3530 3530: Fix highlighter bug r=Kerollmops a=ManyTheFish # Pull Request There was a highlighting issue on CJK characters: we were highlighting too many characters, and these additional characters were duplicated after the highlight tag. ## Related issue Fixes #3517 Fixes #3526 ## What does this PR do? - add a test showcasing the bug - fix the bug by activating the char_map creation of the tokenizer during the highlighting process Co-authored-by: ManyTheFish <many@meilisearch.com>	2023-02-23 10:59:43 +00:00


145 Commits