3164: Improve the way we receive the documents payload r=Kerollmops a=jiangbo212
# Pull Request
## Related issue
Fixes#3037
## What does this PR do?
- writing the playload to a temporary file via BufWritter
- deserialising the json tempporary file to an array of Objects by means of a memory map
- deserialising thie csv tempporary file by means of a memory map
- Adapted some read_json tests
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: jiangbo212 <peiyaoliukuan@gmail.com>
Co-authored-by: jiangbo212 <peiyaoliukuan@126.com>
737: Fix typo initial candidates computation r=Kerollmops a=ManyTheFish
When `Typo` criterion was after a different criterion than `Words` and the previous criterion wasn't returning any candidates at the first iteration of the bucket sort, then the `initial_candidates` were lost.
Now, `Typo`ensure to keep the `initial_candidates` between iterations.
related to https://github.com/meilisearch/meilisearch/issues/3200#issuecomment-1345179578
related to https://github.com/meilisearch/meilisearch/issues/3228
Co-authored-by: ManyTheFish <many@meilisearch.com>
734: Fix bug 2945/3021 (missing key in documents database) r=Kerollmops a=loiclec
# Pull Request
## Related issue
Fixes (partially, until merged into meilisearch) https://github.com/meilisearch/meilisearch/issues/2945 (until we integrate the new milli bump into meilisearch).
**Note that a dump will not be sufficient to upgrade from meilisearch v0.30.2 to meilisearch v0.30.3 due to this fix** because the bug could have caused the `documents` database to be corrupted. Instead, a full manual reimport of the documents will be necessary.
## What does this PR do?
There was a bug happening when:
1. A few documents are added to the index
2. Some of these documents are soft-deleted
3. New documents are added, replacing existing ones and triggering a hard-deletion
The `IndexDocuments::execute` method would then perform the hard-deletion but forget to change the `external_document_ids` structure appropriately. As a result, the `external_document_ids` would contain keys corresponding to documents that do no exist anymore.
To fix this bug, I split the `DeleteDocuments::execute` method into two: `execute_inner` and `execute`.
- `execute_inner` returns a `DetailedDocumentDeletionResult` which says whether soft-deletion was used or not
- `execute` keeps the exact same signature and behaviour
Then, when deleting replaced documents inside `IndexDocuments::execute`, we call `DeleteDocuments::execute_inner` instead of `DeleteDocuments::execute`. If soft-deletion was used, nothing more is done. But if hard-deletion was used, we remove every reference to soft-deleted documents in the new `external_documents_ids` structure.
## Correctness
- Every other test still passes
- The reproduction test case now passes
- In a different branch ([`update-fuzz-test`](https://github.com/meilisearch/milli/pull/735)), I created a fuzz-test that reproduces the past two bugs. This fuzz test cannot find this bug through any combination of some hand-selected `DocumentAddition / DocumentDeletion / DocumentClear / SettingsUpdate` operations. In that test, each relevant operations can be executed with or without soft-deletion, and document additions can be done in batches, replacing or updating existing documents.
Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
3223: Bring back release-v0.30.2 changes into main r=irevoire a=curquiza
Only bring back the necessary changes from `release-v0.30.2` to `main`, following v0.30.2 release
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: curquiza <clementine@meilisearch.com>
3224: Fix update-cargo-toml-version.yml r=curquiza a=mohitsaxenaknoldus
# Pull Request
## Related issue
Fixes#3219
## What does this PR do?
- ...
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Mohit Saxena <76725454+mohitsaxenaknoldus@users.noreply.github.com>
3229: Add a nightly CI: create every day a `nightly` Docker tag based on the latest commit on `main` r=Kerollmops a=curquiza
Also, fixes#3195
Easy to follow with the commits
- In the Docker CI:
- create every day a `nightly` Docker tag based on the latest commit on `main`
- check if the release is the latest one, before creating the `latest` Docker tag. A script has been added.
- add the `worflow_dispatch` event to trigger the CI to build the `nightly` tag when we want (always on the latest commit on `main`)
- In multiple CIs: replace the `released` type by `published`, see [here](https://stackoverflow.com/questions/59319281/github-action-different-between-release-created-and-published) why. Will not impact anything, but will prevent to fail our future automation
- Remove a useless CI (code coverage, not used for 1 year)
- Remove useless lines (comments and CI logic) that don't have any impact
Co-authored-by: curquiza <clementine@meilisearch.com>
We were writing the instance-uid as bytes instead of string in the data.ms and thus we were unable to parse it later.
Also it was less practical for our user to retrieve it and send it to us.
3112: Rename meilisearch-http r=Kerollmops a=colbsmcdolbs
# Pull Request
## Related issue
Fixes#3073
## What does this PR do?
- Renames all references of `meilisearch-http` to `meilisearch`
- Might need to be rebased before the 1.0.0 release
## PR checklist
Please check if your PR fulfills the following requirements:
- [X] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [X] Have you read the contributing guidelines?
- [X] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Colby Allen <colbyjayallen@gmail.com>
3221: Update README to reference Meilisearch Cloud r=curquiza a=davelarkan
# Pull Request
## Related issue
Fixes#3220
## What does this PR do?
- Updates the README to link to the Pricing page where people can choose a Meilisearch Cloud plan
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Dave Larkan <davelarkan@gmail.com>
3216: Update version for the next release (v1.0.0) in Cargo.toml files r=curquiza a=meili-bot
⚠️ This PR is automatically generated. Check the new version is the expected one before merging.
Co-authored-by: curquiza <curquiza@users.noreply.github.com>
3215: Use nightly in cargo fmt r=curquiza a=curquiza
Discussed with `@Kerollmops,` needs this change
Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
729: Fix distincted exhaustive hits r=Kerollmops a=ManyTheFish
This PR changes the name and behavior of `bucket_candidates`:
- `bucket_candidates` become `initial_candidates` that is less confusing
- `initial_candidates` is no more a simple `RoaringBitmap` but an enum allowing us to precise if the candidates are exhaustive or not
- this enum ensures that any modification is allowed only if the candidates are not already exhaustive.
The bug occurred because `initial_candidates` are modified during the bucket sort allowing the estimation to be more and more precise along the search, and this was an issue when the `initial_candidates` were already exhaustive, now, if candidates are exhaustive, then no modifications are made.
Co-authored-by: ManyTheFish <many@meilisearch.com>
727: Fix bug in filter search r=Kerollmops a=loiclec
# Pull Request
## Related issue
Fixes (partially, until merged into meilisearch) https://github.com/meilisearch/meilisearch/issues/3178
## What does this PR do?
The most important change is this one:
```rust
// in milli/src/search/facet/facet_range_search.rs, line 239
let should_stop = {
match self.right {
Bound::Included(right) => right < previous_key.left_bound,
Bound::Excluded(right) => right <= previous_key.left_bound,
Bound::Unbounded => false,
}
};
```
where the operations `<` and `<=` between the two branches were switched. This caused (very few) documents to be missing from filter results.
The second change is a simplification of the algorithm for filters such as `field = value`, where we now perform a direct query into the "Level 0" of the facet db to retrieve the docids instead of invoking the full facet search algorithm. This change is done in `milli/src/search/facet/filter.rs`.
I have added yet more insta-snapshot tests, rechecked the content of the snapshots, and added some integration tests as well.
This is purely a fix in the search algorithms. Based on this PR alone, a dump will not be necessary to switch from v0.30.1 (where this bug is present) to v0.30.2 (where this PR is merged).
Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>