MeiliSearch

mirror of https://github.com/meilisearch/MeiliSearch synced 2024-11-09 22:48:54 +01:00

Author	SHA1	Message	Date
Tamo	ce08dc509b	add more tests and improve the location of the error	2024-06-27 11:51:45 +02:00
Tamo	1daaed163a	Make _vectors.:embedding.regenerate mandatory + tests + error messages	2024-06-27 11:04:58 +02:00
Louis Dureuil	e35ef31738	Small changes following review	2024-06-13 14:20:48 +02:00
Louis Dureuil	3bc8f81abc	user_provided => regenerate	2024-06-12 18:12:20 +02:00
Louis Dureuil	d0b05ae691	Add EmbedderAction to settings	2024-06-12 14:50:54 +02:00
Louis Dureuil	e9bf4eb100	Reformulate ParsedVectorsDiff in terms of VectorState	2024-06-12 14:11:44 +02:00
Louis Dureuil	b368105272	Add EmbedderConfigs::into_inner	2024-06-12 14:11:44 +02:00
Tamo	31a793d226	fix the regeneration of the embeddings in the search	2024-06-06 11:39:29 +02:00
Tamo	d85ab23b82	rename all occurrences of user_defined to user_provided for consistency	2024-06-06 11:39:29 +02:00
Tamo	b7349910d9	implement more review comments	2024-06-06 11:39:29 +02:00
Tamo	b867829ef1	remove useless dbg	2024-06-06 11:39:29 +02:00
Tamo	5d50850e12	always push the user defined vectors in arroy	2024-06-06 11:39:29 +02:00
Tamo	04f6523f3c	expose a new parameter to retrieve the embedders at search time	2024-06-06 11:36:11 +02:00
Tamo	84e498299b	Remove the vectors from the documents database	2024-06-06 11:36:11 +02:00
Louis Dureuil	d35278320e	Add support functions for accessing arroy writers and readers	2024-05-28 15:27:43 +02:00
Louis Dureuil	3412e7fbcf	"[]" is deserialized as 0 embedding rather than 1 embedding of dim 0	2024-05-22 12:25:21 +02:00
Louis Dureuil	16037e2169	Don't remove embedders that are not in the config from the document DB	2024-05-22 12:24:51 +02:00
Louis Dureuil	b17cb56dee	Test array of vectors	2024-05-20 14:44:10 +02:00
Louis Dureuil	52d9cb6e5a	Refactor vector indexing - use the parsed_vectors module - only parse `_vectors` once per document, instead of once per embedder per document	2024-05-20 10:36:17 +02:00
Louis Dureuil	98c811247e	Add parsed vectors module	2024-05-20 10:25:59 +02:00
Louis Dureuil	f4dd73ec8c	Destructure EmbedderOptions so we don't miss some options	2024-05-02 15:39:36 +02:00
meili-bors[bot]	c793b6ef6d	Merge #4600: Fix embedders api (r=ManyTheFish, a=ManyTheFish). Related issues: fixes #4594, fixes #4595. Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-04-25 13:16:33 +00:00
Clément Renault	d4aeff92d0	Introduce the ThreadPoolNoAbort wrapper	2024-04-24 16:40:12 +02:00
ManyTheFish	9b76501875	Display set API key for Ollama embedder	2024-04-24 12:33:07 +02:00
meili-bors[bot]	b1844b0c27	Merge #4548: v1.8 hybrid search changes (r=dureuill, a=dureuill). Implements the search changes from the [usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42#40f24df3da694428a39cc8043c9cfc64). ⚠️ Breaking changes in an experimental feature: (1) removed the `_semanticScore`, use the `_rankingScore` instead; (2) removed `vector` from the search response (the output was too big); (3) removed all the vectors from the `vectorSort` ranking score details, namely the target vector appearing in the name of the rule and the matched vector appearing in the details of the rule. Other user-facing changes: (1) added `semanticHitCount`, indicating how many hits were returned from the semantic search, which is especially useful in hybrid search; (2) embed lazily: Meilisearch no longer generates an embedding when the keyword results are "good enough"; (3) graceful embedding failure in hybrid search: when doing hybrid search (`semanticRatio in ]0.0, 1.0[`), an embedding failure no longer causes the search request to fail, and only the keyword search is performed instead. When doing a full vector search (`semanticRatio==1.0`), a failure to embed still fails that search. Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-04-04 16:00:20 +00:00
Louis Dureuil	fabc9cf14a	milli: add Embedder::embed_one	2024-04-04 15:57:29 +02:00
Louis Dureuil	00c4ed3bc2	milli: refactor getting embedder and embedder name	2024-04-04 15:57:29 +02:00
meili-bors[bot]	339a5e3431	Merge #4549: Hugging Face embedder improvements (r=dureuill, a=dureuill). Architectural changes/internal improvements: (1) Prefer safetensors weights over pytorch weights when available: safetensors weights are memory mapped, which reduces memory usage of supported models. (2) Update candle: updates candle to `0.4.1`, now targeting crates.io, and the tokenizers to `v0.15.2` (still on github). This might fix https://github.com/meilisearch/meilisearch/issues/4399 thanks to the now included https://github.com/huggingface/candle/issues/1454. Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-04-04 13:47:18 +00:00
Louis Dureuil	a1eccc762a	Prefer safetensors to pytorch when both are available	2024-04-03 11:05:59 +02:00
Louis Dureuil	572fb3a51d	Finer granularity for embedder needs reindex	2024-03-27 12:01:34 +01:00
Louis Dureuil	4ff0255783	remove unused function	2024-03-27 11:51:14 +01:00
Louis Dureuil	a25456120d	Expose distribution in settings	2024-03-27 11:51:04 +01:00
Louis Dureuil	168ded3b9d	Deserr for distribution	2024-03-27 11:50:33 +01:00
Louis Dureuil	afd1da5642	Add distribution to all embedders	2024-03-27 11:50:22 +01:00
Louis Dureuil	817ccc089a	also allow `api_key`	2024-03-25 11:50:00 +01:00
Louis Dureuil	58972f35cb	Allow `url` parameter for ollama embedder	2024-03-25 11:32:55 +01:00
Louis Dureuil	a1db342f01	Expose REST embedder to the API	2024-03-25 11:23:15 +01:00
Louis Dureuil	f87747f4d3	Remove unwraps	2024-03-25 11:23:04 +01:00
Louis Dureuil	ac52c857e8	Update ollama and openai impls to use the rest embedder internally	2024-03-25 11:23:03 +01:00
Louis Dureuil	8708cbef25	Add RestEmbedder	2024-03-25 11:23:03 +01:00
Louis Dureuil	c3d02f092d	OpenAI sync	2024-03-25 11:23:03 +01:00
Louis Dureuil	bc58e8a310	Documentation for the vector module	2024-03-25 11:23:03 +01:00
Tamo	c5322df519	Revert "Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1""	2024-03-20 10:08:28 +01:00
Tamo	567194b925	Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1" This reverts commit `bd74cce86a`, reversing changes made to `d2f77e88bd`.	2024-03-19 16:56:21 +01:00
Louis Dureuil	a302e258bd	Don't display dimensions as 0 when it is not set	2024-03-18 16:10:12 +01:00
meili-bors[bot]	5ed7b6a0b2	Merge #4456: Add Ollama as an embeddings provider (r=dureuill, a=jakobklemm). Related issue: [Related Discord Thread](https://discord.com/channels/1006923006964154428/1211977150316683305). What does this PR do? (1) Adds Ollama as a provider of embeddings besides HuggingFace and OpenAI under the name `ollama`. (2) Adds the environment variable `MEILI_OLLAMA_URL` to set the embeddings URL of an Ollama instance, with a default value of `http://localhost:11434/api/embeddings` if no variable is set. (3) Changes some of the structs and functions in `openai.rs` to be public so that they can be shared. (4) Adds more error variants for Ollama-specific errors. (5) Uses the model `nomic-embed-text` as default; any string value is allowed, but it won't automatically check if the model actually exists or is an embedding model. Tested against Ollama version `v0.1.27` and the `nomic-embed-text` model. PR checklist: [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? [x] Have you read the contributing guidelines? [x] Have you made sure that the title is accurate and descriptive of the changes? Co-authored-by: Jakob Klemm <jakob@jeykey.net> Co-authored-by: Louis Dureuil <louis.dureuil@gmail.com>	2024-03-13 08:48:47 +00:00
Louis Dureuil	ae67d5eef0	Update milli/src/vector/error.rs: Fix Meilisearch capitalization	2024-03-13 09:45:04 +01:00
Jakob Klemm	88bc9556a9	Add Ollama dimension inference and add clearer errors. Instead of the user manually specifying the model dimensions, they are now determined automatically. Just like with hf.rs, the word "test" gets embedded to determine the dimensions of the output. Add a dedicated error type for when the model doesn't exist (don't automatically pull it though) and set the fault of that error to be the user.	2024-03-12 19:59:11 +01:00
Louis Dureuil	0c216048b5	Cap timeout duration	2024-03-05 12:19:25 +01:00
Louis Dureuil	36d17110d8	openai: Handle BAD_GATEWAY, be more resilient to failure	2024-03-05 12:18:54 +01:00

77 Commits