Add a new job to the Rust workflow to run 'cargo build' and 'cargo
test' (on the cron schedule only) with the '--all-features' flag.
This will execute across all three environments: Linux, macOS,
Windows.
Autoformat the Rust workflow file via the Red Hat YAML extension for
Visual Studio Code:
https://marketplace.visualstudio.com/items?itemName=redhat.vscode-yaml
This straightens out whitespace and string quoting for safer parsing.
Fixes#3506.
3529: Add an analytics on the geo bounding box feature r=ManyTheFish a=irevoire
Fixes#3527
[The specification of the geoBoundingBox](https://github.com/meilisearch/specifications/pull/223) feature has been updated and now introduces a new analytics to follow the usage of the geoBoundingBox feature in the search requests.
Co-authored-by: Tamo <tamo@meilisearch.com>
3525: Fix phrase search containing stop words r=ManyTheFish a=ManyTheFish
# Summary
A search with a phrase containing only stop words was returning an HTTP error 500,
this PR filters the phrase containing only stop words dropping them before the search starts, a query with a phrase containing only stop words now behaves like a placeholder search.
fixes https://github.com/meilisearch/meilisearch/issues/3521
related v1.0.2 PR on milli: https://github.com/meilisearch/milli/pull/779
Co-authored-by: ManyTheFish <many@meilisearch.com>
3544: Attempt to use default vram budget for faster startup r=Kerollmops a=dureuill
# Pull Request
## Related issue
Follow-up to #3382: addresses the added startup time on Windows/macOS.
## What does this PR do?
- Attempt to skip budget calculation by using "known good values" instead
- Perform dichotomic budget calculation as fallback only when the known value is not actually good.
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
3543: config: case `experimental_enable_metrics` in snake_case r=dureuill a=dureuill
# Pull Request
Avoids "Error: unknown field `experimental-enable-metrics` at line 1 column 1" error when using the default config file.
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
3538: Improve the api key of the metrics r=dureuill a=irevoire
Related to https://github.com/meilisearch/meilisearch/pull/3524#discussion_r1115903998
Update: https://github.com/meilisearch/meilisearch/issues/3523
Right after merging the PR, we changed our minds and decided to update the way we handle the API keys on the metrics route.
Now instead of bypassing all the applied rules of the API key, we forbid the usage of the `/metrics` route if you have any restrictions on the indexes.
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
3537: fix a bug where the filestore could try to parse its own tmp file and fail (main) r=irevoire a=curquiza
Co-authored-by: Tamo <tamo@meilisearch.com>
3524: Update the metrics route r=irevoire a=irevoire
Fixes#3523
Make the metrics available by default without a feature flag.
+ Rename the cli-flag to `experimental-enable-metrics`.
Co-authored-by: Tamo <tamo@meilisearch.com>
3331: Limit the number of concurrently opened indexes r=dureuill a=dureuill
# Pull Request
## Related issue
Relevant to #1841, fixes#3382
## What does this PR do?
### User standpoint
- Limit the number of concurrently opened indexes (currently, the number of indexes that can be concurrently opened is computed at startup)
- When too many an index is opened, the least recently used one is closed and its virtual memory released.
- This allows a user to have an arbitrary number of indexes of an arbitrary size
### Implementation standpoint
- Added a LRU cache map in `index-scheduler::lru`. A more complete implementation (eg with helper functions not used here) is available but would better fit a dedicated crate.
- Use the LRU cache map in the `IndexScheduler`. To simplify the lifecycle of indexes, they are never removed from the cache when they are in the middle of a resize or delete operation. To achieve this, an intermediate `Vec` stores the UUIDs of the indexes that are in the middle of such an operation.
- Upon creating the index scheduler object, compute the total virtual memory that is adressable by using a dichotomic search on the max size of an index. Use this as a base to compute the number of indexes that can be open with 2TiB per index. If the virtual memory address space is lower than 2TiB, then only allow for 1 index of a fraction of that size.
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
3530: Fix highlighter bug r=Kerollmops a=ManyTheFish
# Pull Request
There was a highlighting issue on CJK's character, we were highlighting too many characters and these additional characters were duplicated after the highlight tag.
## Related issue
Fixes#3517Fixes#3526
## What does this PR do?
- add a test showcasing the bug
- fix the bug by activating the char_map creation of the tokenizer during the highlighting process
Co-authored-by: ManyTheFish <many@meilisearch.com>
3534: Update the csv error code from InvalidIndexCsvDelimiter to InvalidDocumentCsvDelimiter r=Kerollmops a=irevoire
Fixes#3533
Co-authored-by: Tamo <tamo@meilisearch.com>
3417: Allow multiple searches in a single request r=irevoire a=dureuill
# Pull Request
## Related issue
Fixes#3427
## What does this PR do?
### User standpoint
- Adds a new `/multi-search` entry point (not to be confused with the existing `/{index_uid}/search` entry points) that accepts a POST whose body is an object containing an array of queries.
- Each query must specify on which index it acts by providing its `indexUid`. Other parameters are identical to the one in the existing search routes (`q`, `limit`, etc.).
- The response is a JSON object containing an array of the results for each search query as if it had been performed using the `/{index_uid}/search` routes.
### Implementation standpoint
- Refactor authentication module:
- Allow tenant token to be checked even without an index in URL
- Add `meilisearch-auth` as a dependency to `index-scheduler` so as to have a working method of checking if the indexes are authorized there that takes into account both the API key and the tenant token (existing method relied on a behavior that was returning the allowed indexes from the API key as long as there weren't any tenant token)
- Make `AuthFilter` an object with invariants and so its fields are now private
- Use the methods of `AuthFilter` to know if an index is authorized rather than relying on its internal search rules.
- Make tenant token search rules optional and `None` when the `AuthFilter` was not built with a tenant token.
- Add a new `routes::index::search::multiple_search` module containing a post handler that performs the same work as the existing `routes::index::search` post handler, but in a loop.
- Add various tests
- Add authentication test suite
### Sample request
<details>
<summary>
Click to see request/response
</summary>
```json
~/datasets
❯ curl \
-X POST 'http://localhost:7700/multi-search' \
-H 'Content-Type: application/json' \
--data-binary '{"queries": [{ "indexUid": "index-0", "q": "toto", "limit": 1 }, {"indexUid": "index-1", "q": "titi", "limit": 1}]}' | jsonxf
{ "results": [
{
"indexUid": "index-0",
"hits": [
{
"id": 20480,
"title": "Toto - 25th Anniversary - Live in Amsterdam",
"overview": "Filmed in High Definition in Amsterdam on Toto's 25th Anniversary Tour in 2003, this stunning concert captures the band at their very best, reunited with original vocalist Bobby Kimball. The set combines all their hits with tracks from their latest album \"Through the Looking Glass\" and other live favorites, performed in front of a wildly enthusastic sell-out crowd. Extras include 35 minute behind-the-scenes film following the band through various stages of their world tour including footage from Japan, Thailand, South Korea, and France. Toto celebrate their 25th anniversary with this blistering live concert, filmed in Amsterdam on February 25th, 2003. Proving they've still got exactly what it takes to move a crowd, the band perform a mixture of medley's, solo spots, and huge hits. Tracks include \"Rosanna,\" \"Africa,\" \"Hold The Line,\" a cover of the Beatles' \"While My Guitar Gently Weeps,\" and many more.",
"genres": [
"Music"
],
"poster": "https://image.tmdb.org/t/p/w500/7SCbUPwoB8Z7VUIA1Rn1WWwjNiT.jpg",
"release_date": 1064275200
}
],
"query": "toto",
"processingTimeMs": 1,
"limit": 1,
"offset": 0,
"estimatedTotalHits": 17
},
{
"indexUid": "index-1",
"hits": [
{
"id": 41212,
"title": "Titicut Follies",
"overview": "The film is a stark and graphic portrayal of the conditions that existed at the State Prison for the Criminally Insane at Bridgewater, Massachusetts. TITICUT FOLLIES documents the various ways the inmates are treated by the guards, social workers and psychiatrists.",
"genres": [
"Documentary"
],
"poster": "https://image.tmdb.org/t/p/w500/2Ju5hn1ofOPeP1eRJtQWakiHuhW.jpg",
"release_date": -70934400
}
],
"query": "titi",
"processingTimeMs": 0,
"limit": 1,
"offset": 0,
"estimatedTotalHits": 7
}
]}
```
</details>
## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Louis Dureuil <louis@meilisearch.com>