4535: Support Negative Keywords r=ManyTheFish a=Kerollmops
This PR fixes#4422 by supporting `-` before any word in the query.
The minus symbol `-`, from the ASCII table, is not the only character that can be considered the negative operator. You can see the two other matching characters under the `Based on "-" (U+002D)` section on [this unicode reference website](https://www.compart.com/en/unicode/U+002D).
It's important to notice the strange behavior when a query includes and excludes the same word; only the derivative ( synonyms and split) will be kept:
- If you input `progamer -progamer`, the engine will still search for `pro gamer`.
- If you have the synonym `like = love` and you input `like -like`, it will still search for `love`.
## TODO
- [x] Add analytics
- [x] Add support to the `-` operator
- [x] Make sure to support spaces around `-` well
- [x] Support phrase negation
- [x] Add tests
Co-authored-by: Clément Renault <clement@meilisearch.com>
4546: Fix some typos in conments r=curquiza a=redistay
# Pull Request
## What does this PR do?
- fix some typos in conments
## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: redistay <wujunjing@outlook.com>
4543: Bring back changes from v1.7.4 into main r=Kerollmops a=dureuill
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: dureuill <dureuill@users.noreply.github.com>
4536: Limit concurrent search requests r=ManyTheFish a=irevoire
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4489
## What does this PR do?
- Adds a « search queue » that limits the number of search requests we can process at the same time and stores search requests to be processed
- Process only one search request per core/thread (we use available_parallelism)
- When the search queue is full, new search requests replace old ones **randomly**. The reason is that:
- If we serve the oldest one first, like Typesense, we give the worst performances to everyone
- If we serve the latest one, it gets too easy to DoS us (you just need to fill the queue with as many search requests as we can process simultaneously to ensure no other request will ever be processed)
- By picking the search request randomly, we give a chance to recent search requests to be processed while ensuring that we can't be owned unless they fill our queue entirely and we start returning errors 5xx
- Adds an experimental parameter to control the size of the queue
- Adds a bunch of tests to ensure the search queue works correctly
- Ensure the loop consuming the search queue is running in the health route and crashes if it’s not the case
Co-authored-by: Tamo <tamo@meilisearch.com>
4532: Add `url` and `api_key` to ollama r=ManyTheFish a=dureuill
See [Usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42#5c77ef49e78e43388c1d3d5429151357)
### Motivation
- Before this PR, the url for ollama is only read from the environment. This is a needless restriction that will be troublesome in settings where passing an environment variable is complex or impossible (e.g., the Cloud)
- Before this PR, ollama did not support an api_key. While ollama does not natively support API keys, [a common practice](https://github.com/ollama/ollama/issues/849) is to put a publicly accessible ollama server behind a proxy to support authentication.
### Skip changelog
ollama embedder was added to v1.8
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
4520: Add automation to create openAPI issue r=dureuill a=curquiza
Create automatically an issue to remind us to update open-api file when opening a milestone
Co-authored-by: curquiza <clementine@meilisearch.com>
4541: Update version for the next release (v1.7.4) in Cargo.toml r=Kerollmops a=meili-bot
⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.
Co-authored-by: dureuill <dureuill@users.noreply.github.com>
4539: Don't optimize reindexing when fields contain dots r=Kerollmops a=dureuill
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4525
## What does this PR do?
- Don't try to optimize the amount of reindexing operation when nested fields are used anywhere in:
- the field distribution (e.g. a key actually contains a `.`)
- the old faceted fields
- the new faceted fields
This is because the facet distribution is not reporting on existing nested fields.
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
4516: Update sprint_issue.md r=Kerollmops a=curquiza
Following decision made about specification
Also
- removed useless parts of the template
- add automatic labels -> better to forget to remove them rather than forgetting to add them (some mistakes happened in the past)
Co-authored-by: Clémentine U. - curqui <clementine@meilisearch.com>
4509: Rest embedder r=ManyTheFish a=dureuill
Fixes#4531
See [Usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42?pvs=25#e6f58c3b742c4effb4ddc625ce12ee16)
### Implementation changes
- Remove tokio, futures, reqwests
- Add a new `milli::vector::rest::Embedder` embedder
- Update OpenAI and Ollama embedders to use the REST embedder internally
- Make Embedder::embed a sync method
- Add the new embedder source as described in the usage
Co-authored-by: Louis Dureuil <louis@meilisearch.com>