MeiliSearch

mirror of https://github.com/meilisearch/MeiliSearch synced 2024-11-12 16:08:55 +01:00

Author	SHA1	Message	Date
Louis Dureuil	a9013ed683	Fix comment mistake Co-authored-by: Tamo <tamo@meilisearch.com>	2024-04-04 17:21:47 +02:00
Louis Dureuil	4564a38ae7	Bail earlier when the experimental feature is not enabled	2024-04-04 15:58:19 +02:00
Louis Dureuil	6ebb6b55a6	Lazily embed, don't fail hybrid search on embedding failure	2024-04-04 15:58:17 +02:00
meili-bors[bot]	fa9748cc99	Merge #4536 4536: Limit concurrent search requests r=ManyTheFish a=irevoire # Pull Request ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/4489 ## What does this PR do? - Adds a « search queue » that limits the number of search requests we can process at the same time and stores search requests to be processed - Process only one search request per core/thread (we use available_parallelism) - When the search queue is full, new search requests replace old ones randomly. The reason is that: - If we serve the oldest one first, like Typesense, we give the worst performances to everyone - If we serve the latest one, it gets too easy to DoS us (you just need to fill the queue with as many search requests as we can process simultaneously to ensure no other request will ever be processed) - By picking the search request randomly, we give a chance to recent search requests to be processed while ensuring that we can't be owned unless they fill our queue entirely and we start returning errors 5xx - Adds an experimental parameter to control the size of the queue - Adds a bunch of tests to ensure the search queue works correctly - Ensure the loop consuming the search queue is running in the health route and crashes if it’s not the case Co-authored-by: Tamo <tamo@meilisearch.com>	2024-03-28 15:01:52 +00:00
Tamo	e433fd53e6	rename the method to get a permit and use it in all search requests	2024-03-26 17:28:03 +01:00
Louis Dureuil	f649f58013	embed no longer async	2024-03-25 11:23:03 +01:00
Tamo	7bd881b9bc	adds the degraded searches to the prometheus dashboard	2024-03-19 10:35:47 +01:00
Tamo	1502382316	use debug instead of debug_span	2024-02-08 15:04:06 +01:00
Tamo	08af0e690c	Structures a bunch of logs	2024-02-08 15:04:06 +01:00
Tamo	7ff722b72e	get rids of the log dependencies everywhere	2024-02-08 15:04:05 +01:00
Louis Dureuil	87bba98bd8	Various changes - fixed seed for arroy - check vector dimensions as soon as it is provided to search - don't embed whitespace	2023-12-14 16:08:42 +01:00
Louis Dureuil	217105b7da	hybrid search uses semantic ratio, error handling	2023-12-14 16:08:42 +01:00
ManyTheFish	f3f3944469	Fix error checking	2023-12-14 16:08:42 +01:00
ManyTheFish	93dcbf598d	Deserialize semantic ratio	2023-12-14 16:08:42 +01:00
Louis Dureuil	e0cc775dc4	Various changes - DistributionShift in Search object (to be set from model in embed?) - Fix issue where embedder index wasn't computed at search time - Accept as default embedder either the "default" one, or the only embedder when there is only one	2023-12-14 16:08:41 +01:00
Louis Dureuil	12940d79a9	WIP - manual embedder - multi embedders OK - clippy + tests OK	2023-12-14 16:08:41 +01:00
Louis Dureuil	922a640188	WIP multi embedders fixed template bugs	2023-12-14 16:08:41 +01:00
Louis Dureuil	13c2c6c16b	Small commit to add hybrid search and autoembedding	2023-12-14 16:07:48 +01:00
Louis Dureuil	cf8dad1ca0	index_scheduler.features() is no longer fallible	2023-10-23 10:38:56 +02:00
Louis Dureuil	d59e969c16	Allow a comma-separated value to the `vector` argument in GET search	2023-07-10 16:16:34 +02:00
ManyTheFish	7a80c0dfb3	Fix invalid attributeToSearchOn error code to be consistent with the others search parameters error codes	2023-07-03 11:52:43 +02:00
meili-bors[bot]	d4f10800f2	Merge #3834 3834: Define searchable fields at runtime r=Kerollmops a=ManyTheFish ## Summary This feature allows the end-user to search in one or multiple attributes using the search parameter `attributesToSearchOn`: ```json { "q": "Captain Marvel", "attributesToSearchOn": ["title"] } ``` This feature act like a filter, forcing Meilisearch to only return the documents containing the requested words in the attributes-to-search-on. Note that, with the matching strategy `last`, Meilisearch will only ensure that the first word is in the attributes-to-search-on, but, the retrieved documents will be ordered taking into account the word contained in the attributes-to-search-on. ## Trying the prototype A dedicated docker image has been released for this feature: #### last prototype version: ```bash docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-1 ``` #### others prototype versions: ```bash docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-0 ``` ## Technical Detail The attributes-to-search-on list is given to the search context, then, the search context uses the `fid_word_docids` database using only the allowed field ids instead of the global `word_docids` database. This is the same for the prefix databases. The database cache is updated with the merged values, meaning that the union of the field-id-database values is only made if the requested key is missing from the cache. ### Relevancy limits Almost all ranking rules behave as expected when ordering the documents. Only `proximity` could miss-order documents if all the searched words are in the restricted attribute but a better proximity is found in an ignored attribute in a document that should be ranked lower. I put below a failing test showing it: ```rust #[actix_rt::test] async fn proximity_ranking_rule_order() { let server = Server::new().await; let index = index_with_documents( &server, &json!([ { "title": "Captain super mega cool. A Marvel story", // Perfect distance between words in an ignored attribute "desc": "Captain Marvel", "id": "1", }, { "title": "Captain America from Marvel", "desc": "a Shazam ersatz", "id": "2", }]), ) .await; // Document 2 should appear before document 1. index .search(json!({"q": "Captain Marvel", "attributesToSearchOn": ["title"], "attributesToRetrieve": ["id"]}), |response, code| { assert_eq!(code, 200, "{}", response); assert_eq!( response["hits"], json!([ {"id": "2"}, {"id": "1"}, ]) ); }) .await; } ``` Fixing this would force us to create a `fid_word_pair_proximity_docids` and a `fid_word_prefix_pair_proximity_docids` databases which may multiply the keys of `word_pair_proximity_docids` and `word_prefix_pair_proximity_docids` by the number of attributes in the searchable_attributes list. If we think we should fix this test, I'll suggest doing it in another PR. ## Related Fixes #3772 Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: ManyTheFish <many@meilisearch.com>	2023-06-28 08:19:23 +00:00
Kerollmops	eecf20f109	Introduce a new invalid_vector_store	2023-06-27 12:32:42 +02:00
Clément Renault	cad90e8cbc	Add a vector field to the search routes	2023-06-27 12:32:38 +02:00
Louis Dureuil	6196a53668	Gate score_details behind a runtime experimental feature flag	2023-06-26 16:29:43 +02:00
ManyTheFish	114f878205	Rename restrictSearchableAttributes into attributesToSearchOn	2023-06-26 14:55:57 +02:00
ManyTheFish	461b5118bd	Add API search setting	2023-06-26 14:55:14 +02:00
Louis Dureuil	da833eb095	Expose the scores and detailed scores in the API	2023-06-22 12:39:14 +02:00
Louis Dureuil	a23fbf6c7b	multi-search: Add search with an array of indexes	2023-02-22 17:04:12 +01:00
Louis Dureuil	c8c5944094	Authentication: is_index_authorized takes into account API key indexes even with a tenant token	2023-02-22 16:35:52 +01:00
Tamo	a43765d454	use the pre-defined deserr extractors	2023-02-14 20:05:30 +01:00
Tamo	8fb7b1d10f	bump deserr	2023-02-14 20:04:30 +01:00
Loïc Lecrenier	e225608337	Use invalid_index_uid error code in more places	2023-01-17 15:28:06 +01:00
Loïc Lecrenier	b781f9a0f9	cargo fmt	2023-01-17 11:07:07 +01:00
Loïc Lecrenier	9194508a0f	Refactor query parameter deserialisation logic	2023-01-17 11:07:07 +01:00
Loïc Lecrenier	766dd830ae	Update deserr to latest version + add new error codes for missing fields - missing_api_key_indexes - missing_api_key_actions - missing_api_key_expires_at - missing_swap_indexes_indexes	2023-01-17 09:43:07 +01:00
Loïc Lecrenier	436ae4e466	Improve error messages generated by deserr Split Json and Query Parameter error types	2023-01-17 09:43:07 +01:00
Loïc Lecrenier	1fc11264e8	Refactor deserr integration	2023-01-11 19:08:39 +01:00
Tamo	50ce0409bc	Integrate deserr on the most important routes	2023-01-05 20:48:29 +01:00
Colby Allen	ad2b1467da	Renames meilisearch-http to meilisearch	2022-12-08 08:22:53 -07:00

40 Commits
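
The merge message for #4536 describes a bounded search queue in which, once the queue is full, a new search request replaces a randomly chosen pending one. The sketch below illustrates that idea only; it is not the code from the pull request. `SearchQueue`, `push`, `pop`, and the small xorshift generator are hypothetical names invented for this example, and serving a random pending request in `pop` is one reading of the message rather than a confirmed detail.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crate.
/// Assumption: a real implementation would use a proper RNG.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical bounded queue of pending search requests.
struct SearchQueue<T> {
    capacity: usize,
    pending: Vec<T>,
    rng: XorShift64,
}

impl<T> SearchQueue<T> {
    fn new(capacity: usize, seed: u64) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            pending: Vec::with_capacity(capacity),
            // xorshift must not be seeded with zero.
            rng: XorShift64(seed.max(1)),
        }
    }

    /// Enqueue a request. If the queue is full, a randomly chosen pending
    /// request is replaced and returned, so the caller can reject it.
    fn push(&mut self, request: T) -> Option<T> {
        if self.pending.len() < self.capacity {
            self.pending.push(request);
            None
        } else {
            let victim = (self.rng.next_u64() % self.capacity as u64) as usize;
            Some(std::mem::replace(&mut self.pending[victim], request))
        }
    }

    /// Remove and return a randomly chosen pending request, if any.
    fn pop(&mut self) -> Option<T> {
        if self.pending.is_empty() {
            return None;
        }
        let idx = (self.rng.next_u64() % self.pending.len() as u64) as usize;
        Some(self.pending.swap_remove(idx))
    }

    /// Number of requests currently waiting.
    fn len(&self) -> usize {
        self.pending.len()
    }
}
```

With a capacity of 2, pushing a third request returns one of the first two as the evicted entry, and the queue length stays at 2. Random eviction means neither the oldest nor the newest requests are systematically favoured, which is the trade-off the merge message argues for.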