MeiliSearch

mirror of https://github.com/meilisearch/MeiliSearch synced 2024-11-27 07:14:26 +01:00

Author	SHA1	Message	Date
shuangcui	5c95b5c933	chore: remove repetitive words Signed-off-by: shuangcui <fliter@qq.com>	2024-03-14 21:28:55 +08:00
meili-bors[bot]	0b7bebeeb6	Merge #4483 4483: Workflows: Fix reason param when benches are triggered from a comment. r=irevoire a=dureuill Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-03-13 17:05:30 +00:00
meili-bors[bot]	d2f77e88bd	Merge #4479 4479: Skip reindexing when modifying unknown faceted fields r=dureuill a=Kerollmops This PR improves Meilisearch's decision to reindex when a faceted field is added to the settings, but not a single document contains this field. It is effectively a waste of time to reindex documents when the engine needs to know a field. This is related to a conversation [we have with our biggest customer (internal link)](https://discord.com/channels/1006923006964154428/1101213808627830794/1217112918857089187). They have 170 million documents, so reindexing this amount would be problematic. --- The image is available by using the following Docker command. You can see the advancement of the image's build [on the GitHub CI page](https://github.com/meilisearch/meilisearch/actions/runs/8251688778). ``` docker pull getmeili/meilisearch:prototype-no-reindex-unknown-fields-0 ``` Here is the hand-made test that shows that when modifying unknown filterable attributes, here `lol`, it doesn't reindex. However, when modifying the known `genre` field, it does reindex. You can see all that by looking at the time spent processing the update. 
```json { "uid": 3, "indexUid": "movies", "status": "succeeded", "type": "settingsUpdate", "canceledBy": null, "details": { "filterableAttributes": [ "genres" ] }, "error": null, "duration": "PT9.237703S", "enqueuedAt": "2024-03-12T15:34:26.836083Z", "startedAt": "2024-03-12T15:34:26.836374Z", "finishedAt": "2024-03-12T15:34:36.074077Z" }, { "uid": 2, "indexUid": "movies", "status": "succeeded", "type": "settingsUpdate", "canceledBy": null, "details": { "filterableAttributes": [ "lol" ] }, "error": null, "duration": "PT0.000751S", "enqueuedAt": "2024-03-12T15:33:53.563923Z", "startedAt": "2024-03-12T15:33:53.565259Z", "finishedAt": "2024-03-12T15:33:53.56601Z" }, { "uid": 0, "indexUid": "movies", "status": "succeeded", "type": "documentAdditionOrUpdate", "canceledBy": null, "details": { "receivedDocuments": 31944, "indexedDocuments": 31944 }, "error": null, "duration": "PT3.120723S", "enqueuedAt": "2024-02-17T10:35:55.042864Z", "startedAt": "2024-02-17T10:35:55.043505Z", "finishedAt": "2024-02-17T10:35:58.164228Z" } ``` Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-03-13 16:23:32 +00:00
meili-bors[bot]	1d8c13f595	Merge #4487 4487: Update version for the next release (v1.7.1) in Cargo.toml r=Kerollmops a=meili-bot ⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging. Co-authored-by: Kerollmops <Kerollmops@users.noreply.github.com>	2024-03-13 15:41:10 +00:00
Kerollmops	7f3c495f5c	Update version for the next release (v1.7.1) in Cargo.toml	2024-03-13 14:49:21 +00:00
meili-bors[bot]	abd954755d	Merge #4476 4476: Make the `/facet-search` route use the `sortFacetValuesBy` setting r=irevoire a=Kerollmops This PR fixes #4423 by ensuring that the `/facet-search` route uses the `sortFacetValuesBy` setting. Note for the documentation team (to be moved in the tracking issue): Using the new `sortFacetValuesBy` setting can slow down the facet-search requests as Meilisearch iterates over the whole list of facet values and computes the count of documents on every entry. That is hardly or even impossible to optimize correctly. ### TODO - [x] Create a custom HashMap wrapper for the facet `OrderBy` settings. This wrapper will return the `OrderBy` setting of the facet, if not defined will use the default `*` one, and if not there either (strange) will fall back on the lexicographic one. - [x] Create a `ValuesCollection` wrapper that implements the logic for the lexicographic and count order by. - [x] Use it when there is no search query. - [x] Use it when there is a search query with and without allowed typos. - [x] Do not change the original logic, only use a wrapper. - [x] Add tests Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-03-13 14:36:14 +00:00
Clément Renault	f3fc2bd01f	Address some issues with preallocations	2024-03-13 15:22:14 +01:00
Louis Dureuil	6fa3872268	Workflows: Fix reason param when benches are triggered from a comment.	2024-03-13 13:46:43 +01:00
Clément Renault	6c9823d7bb	Add tests to sortFacetValuesBy count	2024-03-13 11:59:39 +01:00
Clément Renault	e0dac5a22f	Simplify the algorithm by using the new facet values collection wrapper	2024-03-13 11:31:34 +01:00
Clément Renault	b918b55c6b	Introduce a new facet value collection wrapper to simplify the usage	2024-03-13 11:31:34 +01:00
meili-bors[bot]	07b1d0edaf	Merge #4475 4475: Allow running benchmarks without sending results to the dashboard r=irevoire a=dureuill Adds a `--no-dashboard` option to avoid sending results to the dashboard. Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-03-13 09:59:52 +00:00
Clément Renault	306b25ad3a	Move the searchForFacetValues struct into a dedicated module	2024-03-13 10:24:21 +01:00
Clément Renault	9f7a4fbfeb	Return the facets of a placeholder facet-search sorted by count	2024-03-13 10:09:01 +01:00
meili-bors[bot]	5ed7b6a0b2	Merge #4456 4456: Add Ollama as an embeddings provider r=dureuill a=jakobklemm # Pull Request ## Related issue [Related Discord Thread](https://discord.com/channels/1006923006964154428/1211977150316683305) ## What does this PR do? - Adds Ollama as a provider of Embeddings besides HuggingFace and OpenAI under the name `ollama` - Adds the environment variable `MEILI_OLLAMA_URL` to set the embeddings URL of an Ollama instance with a default value of `http://localhost:11434/api/embeddings` if no variable is set - Changes some of the structs and functions in `openai.rs` to be public so that they can be shared. - Added more error variants for Ollama specific errors - It uses the model `nomic-embed-text` as default, but any string value is allowed, however it won't automatically check if the model actually exists or is an embedding model Tested against Ollama version `v0.1.27` and the `nomic-embed-text` model. ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Co-authored-by: Jakob Klemm <jakob@jeykey.net> Co-authored-by: Louis Dureuil <louis.dureuil@gmail.com>	2024-03-13 08:48:47 +00:00
Louis Dureuil	ae67d5eef0	Update milli/src/vector/error.rs Fix Meilisearch capitalization	2024-03-13 09:45:04 +01:00
Jakob Klemm	88bc9556a9	Add Ollama dimension inference and add clearer errors Instead of the user manually specifying the model dimensions it will now automatically get determined Just like with hf.rs the word "test" gets embedded to determine the dimensions of the output Add a dedicated error type for if the model doesn't exist (don't automatically pull it though) and set the fault of that error to be the user	2024-03-12 19:59:11 +01:00
Clément Renault	ca4876fd10	Do not reindex when modifying unknown faceted field	2024-03-12 16:18:58 +01:00
Clément Renault	d3a95ea2f6	Introduce a new OrderByMap struct to simplify the sort by usage	2024-03-12 13:56:56 +01:00
Louis Dureuil	88d27949cd	Add documentation for benchmarks	2024-03-12 10:56:16 +01:00
Clément Renault	69c118ef76	Extract the facet order before extracting the facets values	2024-03-12 10:35:39 +01:00
meili-bors[bot]	d44e20aa89	Merge #4474 4474: Update cargo version r=irevoire a=curquiza Fixes #4417 Co-authored-by: curquiza <clementine@meilisearch.com>	2024-03-12 09:27:22 +00:00
Louis Dureuil	7b670a4afa	Allow dry runs for benchmarks where reports are generated but not sent to the dashboard	2024-03-12 10:26:13 +01:00
curquiza	fde209b7b6	Update cargo version	2024-03-12 10:20:07 +01:00
meili-bors[bot]	904b82a61d	Merge #4473 4473: Bring back changes from v1.7.0 to main r=curquiza a=curquiza Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: Many the fish <many@meilisearch.com> Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>	2024-03-11 15:02:47 +00:00
Tamo	8ec3e30d2b	Merge branch 'main' into tmp-release-v1.7.0	2024-03-11 15:39:51 +01:00
meili-bors[bot]	0a59cb9734	Merge #4463 4463: Add tests when the field limit is reached r=Kerollmops a=irevoire # Pull Request ## Related issue Related to https://github.com/meilisearch/meilisearch/discussions/4429#discussioncomment-8689101 This user found out that the error message we’re supposed to return when the maximum number of attributes is reached is _not_ returned in some cases ## What does this PR do? - This PR adds four tests around the maximum number of attributes: 1. Add a document with u16::MAX + 1 fields - Meilisearch panics 2. Add two documents which together adds up to u16::MAX + 1 fields - Meilisearch returns the expected error 3. Add a document with u16::MAX + 1 nested fields - No error message but the document isn’t indexed 4. Add two documents which together add up to u16::MAX + 1 nested fields - Meilisearch doesn’t return any error but doesn’t index the document ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: Tamo <tamo@meilisearch.com>	2024-03-07 10:36:54 +00:00
Tamo	f053c280e1	add tests when the field limit is reached	2024-03-06 18:42:41 +01:00
meili-bors[bot]	ee3076d5ba	Merge #4462 4462: Divide threshold by ten r=dureuill a=ManyTheFish Change the facet incremental vs bulk indexing threshold to better fit our user needs, it might be changed in the future if we have more insights Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-03-06 13:05:38 +00:00
meili-bors[bot]	ab1224bfa7	Merge #4458 4458: Replace logging timer by spans r=Kerollmops a=dureuill - Remove logging timer dependency. - Replace last uses in search by spans Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-03-05 16:43:23 +00:00
meili-bors[bot]	eefc1c421e	Merge #4459 4459: Put a bound on OpenAI timeout r=dureuill a=dureuill # Pull Request ## Related issue Fixes #4460 ## What does this PR do? - Makes sure that the timeout of the openai embedder is limited to max 1min, rather than the prior 15min+ Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-03-05 15:18:51 +00:00
meili-bors[bot]	4d42a7af7c	Merge #4445 4445: Add subcommand to run benchmarks r=irevoire a=dureuill # Pull Request ## Related issue Not user-facing, no issue ## What does this PR do? - Adds a new `cargo xtask bench` subcommand that can run one or multiple workload files and report the results to a server - A workload file is a JSON file with a specific schema - Refactor our use of the `vergen` crate: - update to the beta `vergen-git2` crate - VERGEN_GIT_SEMVER_LIGHTWEIGHT => VERGEN_GIT_DESCRIBE - factor logic in a single `build-info` crate that is used both by meilisearch and xtask (prevents vergen variables from overriding themselves) - checked that defining the variables by hand when no git repo is available (docker build case) still works. - Add CI to run `cargo xtask bench` Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-03-05 14:03:57 +00:00
Louis Dureuil	7408db2a46	Meilisearch: fix date formatting	2024-03-05 14:56:48 +01:00
Louis Dureuil	663629a9d6	Remove unused build dependency from xtask Co-authored-by: Tamo <tamo@meilisearch.com>	2024-03-05 14:45:06 +01:00
Louis Dureuil	15c38dca78	Output RFC 3339 dates where we can Co-authored-by: Tamo <tamo@meilisearch.com>	2024-03-05 14:44:48 +01:00
Louis Dureuil	7ee20b0895	Refactor xtask bench	2024-03-05 14:42:06 +01:00
Louis Dureuil	0c216048b5	Cap timeout duration	2024-03-05 12:19:25 +01:00
Louis Dureuil	36d17110d8	openai: Handle BAD_GATEWAY, be more resilient to failure	2024-03-05 12:18:54 +01:00
meili-bors[bot]	bdd428c22e	Merge #4450 4450: Add the content type in the webhook + improve the test r=Kerollmops a=irevoire # Pull Request ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/4436 ## What does this PR do? - Specify the content type of the webhook - Ensure it’s the case in the test Co-authored-by: Tamo <tamo@meilisearch.com>	2024-03-05 10:36:53 +00:00
Tamo	b130917933	add the content type in the webhook + improve the test	2024-03-05 11:22:29 +01:00
Louis Dureuil	25f64ce7df	Replace logging timer by spans	2024-03-05 11:05:42 +01:00
Louis Dureuil	adcd848809	CI: Add bench workflows	2024-03-05 11:02:05 +01:00
meili-bors[bot]	84ae0cd456	Merge #4457 4457: Bump mio from 0.8.9 to 0.8.11 r=Kerollmops a=dependabot[bot] Bumps [mio](https://github.com/tokio-rs/mio) from 0.8.9 to 0.8.11. Changelog (sourced from [mio's changelog](https://github.com/tokio-rs/mio/blob/master/CHANGELOG.md)): 0.8.11: Fix receiving IOCP events after deregistering a Windows named pipe ([tokio-rs/mio#1760](https://redirect.github.com/tokio-rs/mio/pull/1760), backport PR: [tokio-rs/mio#1761](https://redirect.github.com/tokio-rs/mio/pull/1761)). 0.8.10: Added Solaris support ([tokio-rs/mio#1724](https://redirect.github.com/tokio-rs/mio/pull/1724)). Commits: `0328bde` Release v0.8.11; `7084498` Fix warnings; `90d4fe0` named-pipes: fix receiving IOCP events after deregister; `c710a30` Add v0.8.x to the CI; `c29e21c` Release v0.8.10; `f6a20da` Add Solaris operating system support ([#1724](https://redirect.github.com/tokio-rs/mio/issues/1724)). See full diff in [compare view](https://github.com/tokio-rs/mio/compare/v0.8.9...v0.8.11). Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-03-05 09:35:17 +00:00
Louis Dureuil	eee46b7537	Add first workloads	2024-03-05 10:13:11 +01:00
Louis Dureuil	55f60a3638	Update .gitignore - Ignore `/bench` directory for git purposes - Ignore benchmark DB	2024-03-05 10:12:52 +01:00
Louis Dureuil	c608b3f9b5	Factor vergen stuff to a build-info crate	2024-03-05 10:11:43 +01:00
Louis Dureuil	86ce843f3d	Add cargo xtask bench	2024-03-05 10:11:43 +01:00
Louis Dureuil	b11df7ec34	Meilisearch: fix some wrong spans	2024-03-05 10:11:43 +01:00
Louis Dureuil	6862caef64	Span Stats compute self-time	2024-03-05 10:11:43 +01:00
Louis Dureuil	f75c7ac979	Compile xtask in --release	2024-03-05 10:11:43 +01:00
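
The description of #4476 above outlines a lookup wrapper for the facet `OrderBy` settings: use the facet's own setting, otherwise the default `*` entry, otherwise lexicographic order. The following is a minimal sketch of that fallback chain only, not the code merged in the PR. The `OrderByMap` and `OrderBy` names are taken from the commit messages, but the type definitions, the `get` signature, and the `Lexicographic`/`Count` variant names are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// How the values of a facet are ordered (variant names are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OrderBy {
    Lexicographic,
    Count,
}

/// Hypothetical wrapper around the per-facet ordering settings.
pub struct OrderByMap(pub HashMap<String, OrderBy>);

impl OrderByMap {
    /// Resolve the ordering for `facet`:
    /// 1. the facet's own entry, if present;
    /// 2. otherwise the wildcard `*` entry, if present;
    /// 3. otherwise lexicographic order.
    pub fn get(&self, facet: &str) -> OrderBy {
        self.0
            .get(facet)
            .or_else(|| self.0.get("*"))
            .copied()
            .unwrap_or(OrderBy::Lexicographic)
    }
}
```

With a map holding only `"genres" => OrderBy::Count`, `get("genres")` resolves to `Count` and any other facet name resolves to `Lexicographic`. Adding a `"*" => OrderBy::Count` entry makes unknown facets resolve to `Count` instead.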


10086 Commits