MeiliSearch

mirror of https://github.com/meilisearch/MeiliSearch synced 2025-06-23 06:58:30 +02:00

Author	SHA1	Message	Date
meili-bors[bot]	2491db8746	Merge #4366 4366: Fix geo error message r=dureuill a=irevoire # Pull Request I backported #4337 from `main` to the current release so it could make it into the next patch version. ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/4333 Re-implement: #4337 after it was reverted in #4364 ## What does this PR do? - Add tests for the enrich pipeline on malformed documents with `null` value - Reproduce the issue when updating the settings while there are malformed documents in the DB - Fix the bug Co-authored-by: Tamo <tamo@meilisearch.com>	2024-01-29 10:37:51 +00:00
Tamo	b9f365a965	make clippy happy	2024-01-25 18:57:22 +01:00
Tamo	3f21daf2e7	add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there are malformed documents in the DB	2024-01-25 18:57:21 +01:00
meili-bors[bot]	d77df4ecdb	Merge #4352 4352: Restore highlighting when possible for hybrid search r=ManyTheFish a=dureuill # Pull Request ## Related issue Fixes #4351 ## What does this PR do? - Use `MatchingWords` from keyword search instead of the one from vector search - New: When `semanticRatio < 1.0`, all words from the query are now highlighted in all results, regardless of their source (keyword or semantic) - No change: When `semanticRatio == 1.0`, no highlighting is applied, like before this PR ## Draft status Should we merge this in a v1.6.1 version? Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-01-24 12:32:16 +00:00
meili-bors[bot]	fdac97e3c8	Merge #4353 4353: Update version for the next release (v1.6.1) in Cargo.toml r=curquiza a=meili-bot ⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging. Co-authored-by: curquiza <curquiza@users.noreply.github.com>	2024-01-23 17:50:52 +00:00
curquiza	bbdfbd8ea1	Update version for the next release (v1.6.1) in Cargo.toml	2024-01-23 15:13:04 +00:00
Louis Dureuil	da7c796be1	Add test	2024-01-23 15:09:49 +01:00
Louis Dureuil	014eaea428	Use MatchingWords from keyword search instead of the one from vector search	2024-01-23 14:47:28 +01:00
meili-bors[bot]	a6fa0b97ec	Merge #4318 4318: Hide embedders r=ManyTheFish a=dureuill Hides `embedders` when it is an empty dictionary. Manual tests: - getting settings with empty embedders: not displayed - getting settings with non-empty embedders: displayed like before - dump with empty embedders: can be imported - dump with non-empty embedders: can be imported Co-authored-by: Louis Dureuil <louis@meilisearch.com> v1.6.0	2024-01-15 09:37:31 +00:00
Louis Dureuil	38abfec611	Fix tests	2024-01-11 21:35:30 +01:00
Louis Dureuil	84a5c304fc	Don't display the embedders setting when it is an empty dict	2024-01-11 21:35:06 +01:00
meili-bors[bot]	e93d36d5b9	Merge #4313 4313: Fix document formatting performances r=Kerollmops a=ManyTheFish Reduce the formatted option list to the attributes that should be formatted, instead of all the attributes to display. The time to compute the `format` list scales with the number of fields to format; combined with `map_leaf_values`, which iterates over all the nested fields, this gives a quadratic complexity: `d*f`, where `d` is the total number of fields to display and `f` is the total number of fields to format. Co-authored-by: ManyTheFish <many@meilisearch.com> v1.6.0-rc.8	2024-01-11 14:19:44 +00:00
ManyTheFish	95f8e21533	fix typos	2024-01-11 15:07:08 +01:00
meili-bors[bot]	68f197624e	Merge #4314 4314: Fix proximity precision telemetry r=Kerollmops a=ManyTheFish The proximity precision telemetry was partially missing in the global setting route. This PR adds the missing field and returns the default value when the value is not set. Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-01-11 13:50:03 +00:00
ManyTheFish	b79b03d4e2	Fix proximity precision telemetry	2024-01-11 13:24:26 +01:00
ManyTheFish	86270e6878	Transform fields contained into _format into strings	2024-01-11 12:44:56 +01:00
ManyTheFish	81b6128b29	Update tests	2024-01-11 12:28:32 +01:00
ManyTheFish	5f5a486895	Reduce formatting time	2024-01-11 11:36:41 +01:00
ManyTheFish	5f4fc6c955	Add timer logs	2024-01-11 09:44:16 +01:00
meili-bors[bot]	1f5e8fc072	Merge #4311 4311: Limit the number of values returned by the facet search r=dureuill a=Kerollmops This PR fixes a bug where the number of values per facet returned by the `indexes/{index}/facet-search` route was not taking the `faceting.maxValuesPerFacet` setting into account. It also adds a test. Co-authored-by: Clément Renault <clement@meilisearch.com> v1.6.0-rc.7	2024-01-10 16:04:06 +00:00
Clément Renault	3f3462ab62	Limit the number of values returned by the facet search	2024-01-10 16:54:08 +01:00
meili-bors[bot]	93363b0201	Merge #4308 4308: Fix hang on `/indexes` and `/stats` routes r=Kerollmops a=dureuill # Pull Request ## Related issue Fixes #4218 ## Context - A previous fix added a field to the `IndexScheduler` to memorize the `currently_updating_index`, so that accessing it through the search would return the handle without trying to open it. This resolved a hang on the search, but #4218 reported further hangs on the `/indexes` and `/stats` routes - These routes were shunting the `IndexScheduler` and using internal `IndexMapper` logic to access the indexes, again trying to reopen the updating index. ## What does this PR do? - Moves the logic relative to the `currently_updating_index` from the `IndexScheduler` to the `IndexMapper`, so that any index request to the `IndexMapper` can benefit from it. ## Test 1. Follow reproducer from #4218 2. Before this PR, notice a hang on `/stats` and `/indexes`, but not on `/indexes/<updating_index>/search` 3. After this PR, notice no hang on either of `/stats`, `/indexes` or `/indexes/<updating_index>/search` Co-authored-by: Louis Dureuil <louis@meilisearch.com> v1.6.0-rc.6	2024-01-10 10:46:20 +00:00
Louis Dureuil	97bb1ff9e2	Move `currently_updating_index` to IndexMapper	2024-01-09 15:37:27 +01:00
meili-bors[bot]	5ee1378856	Merge #4303 4303: Display default value when proximityPrecision is not set r=dureuill a=ManyTheFish # Pull Request ## Related Issue: #4187 Spec change requests: https://github.com/meilisearch/specifications/pull/261#discussion_r1441725272 ## What does this PR do? - Display default value when proximityPrecision is not set instead of Null Co-authored-by: ManyTheFish <many@meilisearch.com> v1.6.0-rc.5	2024-01-08 14:29:57 +00:00
ManyTheFish	e27b850b09	move the default display strategy on setting getter function	2024-01-08 14:03:47 +01:00
ManyTheFish	f75f22e026	Display default value when proximityPrecision is not set	2024-01-08 11:09:37 +01:00
meili-bors[bot]	6203f4acef	Merge #4296 4296: Fix single element search r=irevoire a=dureuill # Pull Request Before this PR, indexing a single vector in a single document would result in the vector not being found by the vector search. This PR adds a test case for this condition, and resolves it by bumping arroy to a version containing the fix. # Test case Output of the test before and after this PR: ```diff diff --git a/meilisearch/tests/search/hybrid.rs b/meilisearch/tests/search/hybrid.rs index 2cd4b83e7..79819cab2 100644 --- a/meilisearch/tests/search/hybrid.rs on release-v1.6.0 +++ b/meilisearch/tests/search/hybrid.rs on fix-single-element-search @@ -171,5 +171,5 @@ async fn single_document() { .await; snapshot!(code, @"200 OK"); - snapshot!(response["hits"][0], @r###"{"title":"Shazam!","desc":"a Captain Marvel ersatz","id":"1","_vectors":{"default":[1.0,3.0]},"_rankingScore":0.0}"###); + snapshot!(response["hits"][0], @r###"{"title":"Shazam!","desc":"a Captain Marvel ersatz","id":"1","_vectors":{"default":[1.0,3.0]},"_rankingScore":1.0,"_semanticScore":1.0}"###); } ``` Co-authored-by: Louis Dureuil <louis@meilisearch.com> v1.6.0-rc.4	2024-01-03 15:01:43 +00:00
Louis Dureuil	12edc2c20a	Update arroy to a fixed version	2024-01-03 15:59:37 +01:00
Louis Dureuil	94b9f3b310	Add test	2024-01-03 15:56:20 +01:00
meili-bors[bot]	da99a04eb3	Merge #4294 4294: fix compilation warnings for release v1.6 r=curquiza a=irevoire # Pull Request ## Related issue Fixes #4292 ## What does this PR do? - Removed unused imports #4295 fixes the issue on main Co-authored-by: Tamo <tamo@meilisearch.com>	2024-01-02 15:00:40 +00:00
Tamo	54ae6951eb	fix warning	2024-01-02 15:19:30 +01:00
meili-bors[bot]	658ec6e0a4	Merge #4279 4279: Check experimental feature on setting update query rather than in the task. r=ManyTheFish a=dureuill Improve the UX by checking for the vector store feature and returning an error synchronously when sending a setting update, rather than in the indexing task. Co-authored-by: Louis Dureuil <louis@meilisearch.com> v1.6.0-rc.3	2023-12-22 11:36:12 +00:00
meili-bors[bot]	43e822e802	Merge #4238 4238: Task queue webhook r=dureuill a=irevoire # Prototype `prototype-task-queue-webhook-1` The prototype is available through Docker by using the following command: ```bash docker run -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:prototype-task-queue-webhook-1 ``` # Pull Request Implements the task queue webhook. ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/4236 ## What does this PR do? - Provide a new cli and env var for the webhook, respectively called `--task-webhook-url` and `MEILI_TASK_WEBHOOK_URL` - Also supports sending the requests with a custom `Authorization` header by specifying the optional `--task-webhook-authorization-header` CLI parameter or `MEILI_TASK_WEBHOOK_AUTHORIZATION_HEADER` env variable. - Throw an error if the specified URL is invalid - Every time a batch is processed, send all the finished tasks into the webhook with our public `TaskView` type as a JSON Line GZIPed body. - Add one test. ## PR checklist ### Before becoming ready to review - [x] Add a test - [x] Compress the data we send - [x] Chunk and stream the data we send - [x] Remove the unwrap in the index-scheduler when sending the data fails - [x] The analytics are missing ### Before merging - [x] Release a prototype Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: Clément Renault <clement@meilisearch.com>	2023-12-21 14:43:46 +00:00
Louis Dureuil	ee54d3171e	Check experimental feature at query time	2023-12-21 15:26:12 +01:00
meili-bors[bot]	a0e713c4e7	Merge #4277 4277: Update mini-dashboard to v0.2.12 r=curquiza a=mdubus # Pull Request ## Related issue Fixes #4276 ## What does this PR do? Upgrade mini-dashboard to version 0.2.12 ([see changes](https://github.com/meilisearch/mini-dashboard/releases/tag/v0.2.12)) ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>	2023-12-21 11:03:46 +00:00
meili-bors[bot]	d4cb0a885b	Merge #4275 4275: Flatten settings r=dureuill a=dureuill # Pull Request ## Related issue Initial internal feedback seems to indicate that the current shape of the `embedders` setting is undesirable: it has too much depth. This PR changes this by flattening the structure of the embedders to the following: ```json5 // NEW structure "embedders": { // still starts with the embedder name "default": { "source": "huggingFace", // now a string // properties of the source are all at the same level as the source "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2", "revision": "a9c555277f9bcf24f28fa5e092e665fc6f7c49cd", "documentTemplate": "A product titled '{{doc.title}}'" // now a string } } ``` By comparison, the old structure was: ```json5 // PREVIOUS version, no longer working with this PR "embedders": { // still starts with the embedder name "default": { "source": { "huggingFace": { "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2", "revision": "a9c555277f9bcf24f28fa5e092e665fc6f7c49cd" } }, "documentTemplate": { "template": "A product titled '{{doc.title}}'" } } } ``` The fields accepted in the new version of the `embedders` setting depend on the value of the `source` field: ```json5 // huggingFace "embedders": { "default": { "source": "huggingFace", "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2", "revision": "a9c555277f9bcf24f28fa5e092e665fc6f7c49cd", "documentTemplate": "A product titled '{{doc.title}}'" } } // openAi "embedders": { "default": { "source": "openAi", "model": "text-embedding-ada-002", "apiKey": "open_ai_api_key", "documentTemplate": "A product titled '{{doc.title}}'" } } // userProvided "embedders": { "default": { "source": "userProvided", "dimensions": 42, // mandatory } } ``` ## What does this PR do? 
- Flatten the settings structure - Validate the prompt earlier to return a synchronous error on setting change rather than in the failing task - Make it an error to pass a field for the wrong source (see above for allowed fields for each source) - Not changed: It is still an error not to pass `dimensions` to the `userProvided` embedder - If `source` was specified in the settings, validate the setting early to return a synchronous error in case of a missing mandatory field for the userProvided source (dimensions) or a forbidden field for the specified source. - If `source` was not specified in the settings, still validate the setting, but only at indexing time, by using the source stored in the DB. - Resets all values if the source changes, even if the user did not reset them explicitly. ## PR checklist Please check if your PR fulfills the following requirements: - [ ] Change the public facing guide for using the API - [ ] Change examples of use in the changelog Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2023-12-21 09:58:01 +00:00
Morgane Dubus	f52dee2b3b	Update Cargo.toml Update mini-dashboard with v0.2.12	2023-12-21 09:53:13 +01:00
Louis Dureuil	0bf879fb88	Fix warning on rust stable	2023-12-20 17:48:09 +01:00
Louis Dureuil	6ff81de401	Fix tests	2023-12-20 17:16:46 +01:00
Louis Dureuil	2e4c9651df	Validate settings in route	2023-12-20 17:16:46 +01:00
Louis Dureuil	ec9649c922	Add function to validate settings in Meilisearch, to be used in the routes	2023-12-20 17:16:46 +01:00
Louis Dureuil	9123370e90	Validate fused settings in settings task after fusing with existing setting	2023-12-20 17:16:46 +01:00
Louis Dureuil	14b396d302	Add new errors	2023-12-20 17:16:45 +01:00
Louis Dureuil	393216bf30	Flatten embedders settings	2023-12-20 17:16:43 +01:00
Louis Dureuil	e249e4db7b	Change Setting::apply function signature	2023-12-20 17:15:24 +01:00
meili-bors[bot]	de2ca7006e	Merge #4272 4272: Don't pass default revision when the model is explicitly set in config r=Kerollmops a=dureuill # Pull Request ## Related issue Fixes #4271 ## What does this PR do? - When the `model` is explicitly set in the `embedders` setting, we reset the `revision` to `None`, such that if the user doesn't specify a revision, the head of the model repository is chosen. - Not changed: If the user specifies a revision, it applies, like previously. - Not changed: If the user doesn't specify a model, the default model with the default revision applies, like previously. ## Manual testing on a fresh DB 1. Enable experimental feature: ```sh curl \ -X PATCH 'http://localhost:7700/experimental-features/' \ -H 'Content-Type: application/json' -H 'Authorization: Bearer foo' \ --data-binary '{ "vectorStore": true }' ``` 2. Send settings with a specified model but no specified revision: ```sh curl \ -X PATCH 'http://localhost:7700/indexes/products/settings' \ -H 'Content-Type: application/json' --data-binary \ '{ "embedders": { "default": { "source": { "huggingFace": { "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2" } }, "documentTemplate": { "template": "A product titled '{{doc.title}}'"} } } }' ``` 3. Check that the task was successful: ```sh curl 'http://localhost:7700/tasks/0' {"uid":0,"indexUid":"products","status":"succeeded","type":"settingsUpdate","canceledBy":null,"details":{"embedders":{"default":{"source":{"huggingFace":{"model":"sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"}},"documentTemplate":{"template":"A product titled {{doc.title}}"}}}},"error":null,"duration":"PT0.001892S","enqueuedAt":"2023-12-20T09:17:01.73789Z","startedAt":"2023-12-20T09:17:01.73854Z","finishedAt":"2023-12-20T09:17:01.740432Z"} ``` 4. 
Send documents to index: ```sh curl 'http://localhost:7700/indexes/products/documents' -H 'Content-Type: application/json' --data-binary '{"id": 0, "title": "Best product"}' ``` Co-authored-by: Louis Dureuil <louis@meilisearch.com> v1.6.0-rc.2	2023-12-20 14:27:51 +00:00
Louis Dureuil	333ce12eb2	Fixed issue where the default revision is always the one we picked for the default model	2023-12-20 10:17:49 +01:00
meili-bors[bot]	fb9db1eba6	Merge #4269 4269: Remove dependency that requires libstdc++ r=dureuill a=dureuill Removes the dependency that caused the additional runtime dependency on libstdc++ by disabling the default features of the hf tokenizer. ## Discussion - This removes a feature that is using a C++ dependency and is supposed to accelerate the tokenizer. As the tokenizer is likely to be a significant bottleneck for embedding texts using a HF model, this is an issue. - We should at least rerun the movies vector indexing and check that it still works correctly and that it has a runtime in the ballpark of what it used to be. Co-authored-by: Louis Dureuil <louis.dureuil@xinra.net> v1.6.0-rc.1	2023-12-19 12:26:48 +00:00
Clément Renault	fa2b96b9a5	Add an Authorization Header along with the webhook calls prototype-task-queue-webhook-1	2023-12-19 12:18:45 +01:00
Tamo	19736cefe8	add the analytics	2023-12-19 10:36:04 +01:00
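
The task queue webhook merged in #4238 is described above as sending finished tasks as a gzipped JSON Lines body, optionally with a custom `Authorization` header. The following is a minimal receiver-side sketch of that description, not Meilisearch code: the function names `decode_webhook_body` and `is_authorized` are hypothetical, the sample task fields are illustrative only, and the exact shape of the `TaskView` objects is an assumption.

```python
import gzip
import json


def decode_webhook_body(body: bytes) -> list[dict]:
    """Decompress a gzipped JSON Lines payload into a list of task objects.

    Assumes one JSON document per line, as the PR description suggests.
    """
    text = gzip.decompress(body).decode("utf-8")
    # Skip blank lines so a trailing newline does not cause a parse error.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def is_authorized(headers: dict[str, str], expected: str) -> bool:
    """Check the Authorization header against the value configured on the sender.

    HTTP header names are case-insensitive, so normalize before looking up.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("authorization") == expected


# Simulate a payload: two finished tasks, one per line, gzipped.
# These field values are made up for illustration.
sample_tasks = [
    {"uid": 0, "status": "succeeded", "type": "settingsUpdate"},
    {"uid": 1, "status": "failed", "type": "documentAdditionOrUpdate"},
]
sample_body = gzip.compress(
    "\n".join(json.dumps(task) for task in sample_tasks).encode("utf-8")
)
decoded = decode_webhook_body(sample_body)
```

A real receiver would wrap these helpers in an HTTP handler and would likely stream the body in chunks instead of decompressing it in one call, since the PR checklist mentions that the data is chunked and streamed.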

8888 Commits