Commit Graph

9976 Commits

Author SHA1 Message Date
bors[bot]
e0a8f8cb5a
Merge #734
734: Fix bug 2945/3021 (missing key in documents database) r=Kerollmops a=loiclec

# Pull Request

## Related issue
Partially fixes https://github.com/meilisearch/meilisearch/issues/2945; the fix is complete once the new milli bump is integrated into meilisearch.

**Note that a dump will not be sufficient to upgrade from meilisearch v0.30.2 to meilisearch v0.30.3 due to this fix** because the bug could have caused the `documents` database to be corrupted. Instead, a full manual reimport of the documents will be necessary.

## What does this PR do?
There was a bug happening when:
1. A few documents are added to the index
2. Some of these documents are soft-deleted
3. New documents are added, replacing existing ones and triggering a hard-deletion

The `IndexDocuments::execute` method would then perform the hard-deletion but forget to change the `external_document_ids` structure appropriately. As a result, the `external_document_ids` would contain keys corresponding to documents that do not exist anymore.

To fix this bug, I split the `DeleteDocuments::execute` method into two: `execute_inner` and `execute`. 
- `execute_inner` returns a `DetailedDocumentDeletionResult` which says whether soft-deletion was used or not
- `execute` keeps the exact same signature and behaviour

Then, when deleting replaced documents inside `IndexDocuments::execute`, we call `DeleteDocuments::execute_inner` instead of `DeleteDocuments::execute`. If soft-deletion was used, nothing more is done. But if hard-deletion was used, we remove every reference to soft-deleted documents in the new `external_documents_ids` structure.
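
The split described above can be sketched as follows. This is a minimal, self-contained illustration and not the actual milli code: `ToyDeleter`, `delete_replaced`, the `soft_deletion_limit` threshold, the `deleted_documents` field, and the use of `BTreeMap`/`BTreeSet` as stand-ins for the real structures are all assumptions made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Result exposed by the unchanged public method (field name is an assumption).
#[derive(Debug, PartialEq)]
pub struct DocumentDeletionResult {
    pub deleted_documents: u64,
}

/// Detailed result that also says whether soft-deletion was used.
#[derive(Debug, PartialEq)]
pub struct DetailedDocumentDeletionResult {
    pub deleted_documents: u64,
    pub soft_deletion_used: bool,
}

/// Hypothetical stand-in for the deletion builder.
pub struct ToyDeleter {
    /// Internal ids that are currently soft-deleted.
    pub soft_deleted: BTreeSet<u32>,
    /// Hypothetical threshold: above it, a hard-deletion is triggered.
    pub soft_deletion_limit: usize,
}

impl ToyDeleter {
    /// Performs the deletion and reports which strategy was used.
    pub fn execute_inner(&mut self, to_delete: &BTreeSet<u32>) -> DetailedDocumentDeletionResult {
        self.soft_deleted.extend(to_delete.iter().copied());
        let soft_deletion_used = self.soft_deleted.len() <= self.soft_deletion_limit;
        if !soft_deletion_used {
            // Hard-deletion: the soft-deleted documents are physically removed.
            self.soft_deleted.clear();
        }
        DetailedDocumentDeletionResult {
            deleted_documents: to_delete.len() as u64,
            soft_deletion_used,
        }
    }

    /// Same signature and behaviour as before: it only hides the extra detail.
    pub fn execute(&mut self, to_delete: &BTreeSet<u32>) -> DocumentDeletionResult {
        let detailed = self.execute_inner(to_delete);
        DocumentDeletionResult { deleted_documents: detailed.deleted_documents }
    }
}

/// Caller side: after a hard-deletion, drop every external id that still
/// points to a document that was soft-deleted.
pub fn delete_replaced(
    deleter: &mut ToyDeleter,
    new_external_ids: &mut BTreeMap<String, u32>,
    replaced: &BTreeSet<u32>,
) {
    // Ids soft-deleted before this call, plus the ones being replaced now.
    let mut stale = deleter.soft_deleted.clone();
    stale.extend(replaced.iter().copied());

    let result = deleter.execute_inner(replaced);
    if !result.soft_deletion_used {
        new_external_ids.retain(|_, internal_id| !stale.contains(internal_id));
    }
}
```

Keeping `execute` as a thin wrapper means existing callers are unaffected, while the document import path can react to the deletion strategy that was actually used.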

## Correctness

- Every other test still passes
- The reproduction test case now passes
- In a different branch ([`update-fuzz-test`](https://github.com/meilisearch/milli/pull/735)), I created a fuzz test that reproduces the past two bugs. This fuzz test cannot find this bug through any combination of some hand-selected `DocumentAddition / DocumentDeletion / DocumentClear / SettingsUpdate` operations. In that test, each relevant operation can be executed with or without soft-deletion, and document additions can be done in batches, replacing or updating existing documents.



Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
2022-12-13 09:45:57 +00:00
Loïc Lecrenier
be3b00350c Apply review suggestions: naming and documentation 2022-12-13 10:15:22 +01:00
jiangbo212
23c1b223b3 Merge branch 'fix-3037' of github.com:jiangbo212/meilisearch into fix-3037 2022-12-13 10:41:50 +08:00
jiangbo212
87ae0032bf review change 2022-12-13 10:41:43 +08:00
jiangbo212
7c24fea9f2 Merge branch 'main' into fix-3037 2022-12-13 05:16:03 +08:00
ManyTheFish
80d34a4169 Fix typo initial candidates computation 2022-12-12 19:02:48 +01:00
jiangbo212
27d1bee0bb Merge branch 'main' into fix-3037-new 2022-12-12 22:16:22 +08:00
jiangbo212
b1c3174061 fix fmt 2022-12-12 22:06:24 +08:00
jiangbo212
fa46dfb7bb fmt fix 2022-12-12 22:02:56 +08:00
bors[bot]
40d9b73aaf
Merge #3223
3223: Bring back release-v0.30.2 changes into main r=irevoire a=curquiza

Only bring back the necessary changes from `release-v0.30.2` to `main`, following the v0.30.2 release

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: curquiza <clementine@meilisearch.com>
2022-12-12 13:49:01 +00:00
jiangbo212
169682d3ec Merge branch 'main' into fix-3037-new 2022-12-12 21:36:10 +08:00
bors[bot]
21b926cb00
Merge #3224
3224: Fix update-cargo-toml-version.yml r=curquiza a=mohitsaxenaknoldus

# Pull Request

## Related issue
Fixes #3219 

## What does this PR do?
- ...

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Mohit Saxena <76725454+mohitsaxenaknoldus@users.noreply.github.com>
2022-12-12 13:27:46 +00:00
Loïc Lecrenier
e3ee553dcc Remove soft deleted ids from ExternalDocumentIds during document import
If the document import replaces a document using hard deletion
2022-12-12 14:16:09 +01:00
bors[bot]
34a6f2598b
Merge #3229
3229: Add a nightly CI: create every day a `nightly` Docker tag based on the latest commit on `main` r=Kerollmops a=curquiza

Also, fixes #3195

Easy to follow with the commits
- In the Docker CI:
  - create every day a `nightly` Docker tag based on the latest commit on `main`
  - check if the release is the latest one, before creating the `latest` Docker tag. A script has been added.
  - add the `workflow_dispatch` event to trigger the CI to build the `nightly` tag when we want (always on the latest commit on `main`)
- In multiple CIs: replace the `released` type with `published`; see [here](https://stackoverflow.com/questions/59319281/github-action-different-between-release-created-and-published) for why. This will not impact anything now, but will prevent our future automation from failing
- Remove a useless CI (code coverage, not used for 1 year)
- Remove useless lines (comments and CI logic) that don't have any impact
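
The "is this release the latest one?" check can be sketched as below. The script added by the PR is not shown here, so this is only an illustration of the idea: `parse_version`, `is_latest`, and the assumption that tags look like `vMAJOR.MINOR.PATCH` are all hypothetical.

```rust
/// Parses a tag such as "v0.30.2" into (major, minor, patch).
/// Anything else (for example a pre-release like "v1.0.0-rc.0") yields `None`.
pub fn parse_version(tag: &str) -> Option<(u64, u64, u64)> {
    let rest = tag.strip_prefix('v')?;
    let mut parts = rest.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Returns true when `candidate` is a stable version that is at least as
/// recent as every stable version in `existing`.
pub fn is_latest(candidate: &str, existing: &[&str]) -> bool {
    match parse_version(candidate) {
        Some(candidate) => existing
            .iter()
            .filter_map(|tag| parse_version(tag))
            .all(|other| candidate >= other),
        None => false,
    }
}
```

With such a check, a patch release published on an older minor branch would not move the `latest` Docker tag.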

Co-authored-by: curquiza <clementine@meilisearch.com>
2022-12-12 10:46:33 +00:00
curquiza
14824cee86 Remove obsolete comment line 2022-12-11 21:46:48 +01:00
curquiza
796e61ec7e Remove useless CI 2022-12-11 21:29:23 +01:00
curquiza
9a3f9577b8 Remove useless line in CI 2022-12-11 21:26:05 +01:00
curquiza
2c8eb92537 Check before publish latest 2022-12-11 21:24:52 +01:00
Mohit Saxena
1bf5c0edb9
Update update-cargo-toml-version.yml 2022-12-10 23:04:26 +05:30
curquiza
b1ffbe561e Add nightly for docker CI 2022-12-09 20:06:59 +01:00
curquiza
84204b8cd5 Replace the released type by published 2022-12-09 19:27:58 +01:00
Mohit Saxena
346fca5608
Update update-cargo-toml-version.yml 2022-12-09 00:20:51 +05:30
Loïc Lecrenier
bebd050961 Add new test for bug 3021 2022-12-08 19:19:40 +01:00
curquiza
4631f4d97f Bump milli to v0.37.2 2022-12-08 18:16:48 +01:00
Tamo
6f1c30b247 Fix the instance-uid in the data.ms
We were writing the instance-uid as bytes instead of string in the data.ms and thus we were unable to parse it later.
Also, it was less practical for our users to retrieve it and send it to us.
2022-12-08 18:16:43 +01:00
bors[bot]
abba54e913
Merge #3112
3112: Rename meilisearch-http r=Kerollmops a=colbsmcdolbs

# Pull Request

## Related issue
Fixes #3073 

## What does this PR do?
- Renames all references of `meilisearch-http` to `meilisearch`
- Might need to be rebased before the 1.0.0 release

## PR checklist
Please check if your PR fulfills the following requirements:
- [X] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [X] Have you read the contributing guidelines?
- [X] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Colby Allen <colbyjayallen@gmail.com>
2022-12-08 16:32:08 +00:00
bors[bot]
c426fa1478
Merge #3212
3212: Setup COMMIT_SHA and COMMIT_DATE build args in the Docker image r=curquiza a=brunoocasali

GitHub auto-closed my PR when I synced changes with my remote 🤷‍♂️  https://github.com/meilisearch/meilisearch/pull/2550
The last PR, #3205, was closed to help `@curquiza` test the CI.

In any case, the summary of changes is quite similar:

- Fix `git` usage from my last attempt: when you use `actions/checkout`, the `git` command is available to use.
- Add the `build-args` definition from https://github.com/docker/build-push-action#inputs, which is supposed to work exactly like `docker build --build-arg`.

Fixes https://github.com/meilisearch/meilisearch/issues/2028

The result will be like this:

<img width="556" alt="image" src="https://user-images.githubusercontent.com/4116980/206019608-2713559a-1f58-4ff3-9fec-7720783993ac.png">

Co-authored-by: Bruno Casali <brunoocasali@gmail.com>
2022-12-08 16:07:50 +00:00
bors[bot]
574be942cd
Merge #3221
3221: Update README to reference Meilisearch Cloud r=curquiza a=davelarkan

# Pull Request

## Related issue
Fixes #3220

## What does this PR do?
- Updates the README to link to the Pricing page where people can choose a Meilisearch Cloud plan

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Dave Larkan <davelarkan@gmail.com>
2022-12-08 15:43:43 +00:00
Colby Allen
2262766494 chore: run fmt nightly on project 2022-12-08 08:31:15 -07:00
Colby Allen
ad2b1467da Renames meilisearch-http to meilisearch 2022-12-08 08:22:53 -07:00
Dave Larkan
ee37d5e724 Update README to reference Meilisearch Cloud 2022-12-08 15:02:34 +00:00
bors[bot]
ded2a50d14
Merge #3216
3216: Update version for the next release (v1.0.0) in Cargo.toml files r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2022-12-08 13:49:50 +00:00
Bruno Casali
58327979f1 Use correct env vars "VERGEN_*" on Dockerfile 2022-12-08 10:48:16 -03:00
Bruno Casali
50d9fe036e Setup COMMIT_SHA and COMMIT_DATE build args in the Docker image 2022-12-08 10:48:16 -03:00
curquiza
026cf223b3 Update version for the next release (v1.0.0) in Cargo.toml files 2022-12-08 12:20:17 +00:00
bors[bot]
af6f7f8462
Merge #3215
3215: Use nightly in cargo fmt r=curquiza a=curquiza

Discussed with `@Kerollmops`, needs this change

Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-12-08 10:53:24 +00:00
Clémentine Urquizar - curqui
5023d36ee7
Use nightly in cargo fmt 2022-12-08 11:51:13 +01:00
bors[bot]
1f1beae077
Merge #729
729: Fix distincted exhaustive hits r=Kerollmops a=ManyTheFish

This PR changes the name and behavior of `bucket_candidates`:
- `bucket_candidates` becomes `initial_candidates`, which is less confusing
- `initial_candidates` is no longer a simple `RoaringBitmap` but an enum that lets us specify whether the candidates are exhaustive or not
- this enum ensures that a modification is allowed only if the candidates are not already exhaustive.

The bug occurred because `initial_candidates` is modified during the bucket sort, which makes the estimation more and more precise as the search progresses. This was an issue when the `initial_candidates` were already exhaustive. Now, if the candidates are exhaustive, no modifications are made.
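
A minimal sketch of such an enum is shown below. It is not the milli implementation: the variant names `Estimated` and `Exhaustive`, the `union_with` method, and the use of `BTreeSet<u32>` in place of `RoaringBitmap` are assumptions made to keep the example self-contained.

```rust
use std::collections::BTreeSet;

/// Candidate document ids, tagged with whether the set is exhaustive.
#[derive(Debug, Clone, PartialEq)]
pub enum InitialCandidates {
    /// The set is an estimation and may still be refined during the search.
    Estimated(BTreeSet<u32>),
    /// The set is already complete and must not be modified.
    Exhaustive(BTreeSet<u32>),
}

impl InitialCandidates {
    /// Adds candidates discovered during the bucket sort.
    /// This is a no-op when the candidates are already exhaustive.
    pub fn union_with(&mut self, other: &BTreeSet<u32>) {
        if let InitialCandidates::Estimated(set) = self {
            set.extend(other.iter().copied());
        }
    }

    /// Tells whether the count derived from this set is exact.
    pub fn is_exhaustive(&self) -> bool {
        matches!(self, InitialCandidates::Exhaustive(_))
    }

    /// Number of candidates currently in the set.
    pub fn len(&self) -> usize {
        match self {
            InitialCandidates::Estimated(set) | InitialCandidates::Exhaustive(set) => set.len(),
        }
    }
}
```

Putting the guard inside the type means every call site gets the "do not touch exhaustive candidates" rule for free, instead of having to remember it.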

Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-12-08 09:26:34 +00:00
ManyTheFish
55724f2412 Introduce an initial candidates set that makes the difference between an exhaustive count and an estimation 2022-12-08 09:41:34 +01:00
ManyTheFish
6d50ea0830 add tests 2022-12-08 08:56:57 +01:00
bors[bot]
f4dc4c5d8d
Merge #3210
3210: Fix `MDB_PAGE_FULL` by bumping LMDB r=Kerollmops a=Kerollmops

This PR fixes #3062 by upgrading LMDB to the latest version.

The changes were made in https://github.com/meilisearch/lmdb/pull/1 and https://github.com/meilisearch/lmdb-rs/pull/12. As heed directly depends on the latest main commit of https://github.com/meilisearch/lmdb-rs, we can bump the `lmdb-rkv-sys` dependency in the Meilisearch _Cargo.lock_ by doing a:

```
cargo update -p lmdb-rkv-sys
```

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-12-07 16:21:23 +00:00
Loïc Lecrenier
f37c86e0b2 Add some integration tests on the sort criterion 2022-12-07 15:59:33 +01:00
jiangbo212
717dd36547 Merge branch 'fix-3037' of github.com:jiangbo212/meilisearch into fix-3037 2022-12-07 22:54:16 +08:00
jiangbo212
538030c2da change NameTempFile to tempfile() 2022-12-07 22:47:32 +08:00
bors[bot]
098c410612
Merge #727
727: Fix bug in filter search r=Kerollmops a=loiclec

# Pull Request

## Related issue
Fixes (partially, until merged into meilisearch) https://github.com/meilisearch/meilisearch/issues/3178

## What does this PR do?
The most important change is this one:
```rust
    // in milli/src/search/facet/facet_range_search.rs, line 239
    let should_stop = {
        match self.right {
            Bound::Included(right) => right < previous_key.left_bound,
            Bound::Excluded(right) => right <= previous_key.left_bound,
            Bound::Unbounded => false,
        }
    };
```
where the operators `<` and `<=` in the two branches had been switched. This caused (very few) documents to be missing from filter results.
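
To see why the operators matter, here is a small standalone version of the same condition. It assumes plain `i64` keys and a free function named `should_stop`, both chosen for illustration; the real code works on the facet database keys.

```rust
use std::ops::Bound;

/// Decides whether a range search can stop, given the right bound of the
/// query and the left bound of the group of values being examined.
pub fn should_stop(right: Bound<i64>, previous_left_bound: i64) -> bool {
    match right {
        // An inclusive bound still accepts a group that starts exactly at
        // `right`, so we only stop once the group starts strictly after it.
        Bound::Included(right) => right < previous_left_bound,
        // An exclusive bound rejects a group that starts at `right`.
        Bound::Excluded(right) => right <= previous_left_bound,
        // Without a right bound there is never a reason to stop early.
        Bound::Unbounded => false,
    }
}
```

With the operators swapped, a query with an inclusive bound of `5` would stop at a group starting at `5` and miss the documents holding that value, which matches the small number of missing documents described above.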

The second change is a simplification of the algorithm for filters such as `field = value`, where we now perform a direct query into the "Level 0" of the facet db to retrieve the docids instead of invoking the full facet search algorithm. This change is done in `milli/src/search/facet/filter.rs`.
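
The idea of a direct "Level 0" lookup can be sketched as follows. The key layout `(field_id, level, value)`, the `FacetKey` alias, the `docids_equal_to` function, and the `BTreeMap`/`BTreeSet` stand-ins for the facet database and the docids bitmap are all assumptions for this example, not the layout used in `milli/src/search/facet/filter.rs`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical facet key: (field id, level, facet value).
pub type FacetKey = (u16, u8, String);

/// Resolves `field = value` with a single lookup at level 0,
/// instead of running a full range search over the facet levels.
pub fn docids_equal_to(
    db: &BTreeMap<FacetKey, BTreeSet<u32>>,
    field_id: u16,
    value: &str,
) -> BTreeSet<u32> {
    db.get(&(field_id, 0, value.to_string()))
        .cloned()
        .unwrap_or_default()
}
```

An equality condition only ever matches one exact key, so a point lookup returns the same docids as a range search whose two bounds are equal, with less work.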

I have added yet more insta-snapshot tests, rechecked the content of the snapshots, and added some integration tests as well. 

This is purely a fix in the search algorithms. Based on this PR alone, a dump will not be necessary to switch from v0.30.1 (where this bug is present) to v0.30.2 (where this PR is merged).


Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
2022-12-07 14:34:59 +00:00
Kerollmops
1d5294d11a
Bump lmdb version 2022-12-07 15:29:56 +01:00
bors[bot]
ee10cb8c87
Merge #726
726: Update the contributing.md r=curquiza a=irevoire



Co-authored-by: Tamo <tamo@meilisearch.com>
2022-12-07 13:59:04 +00:00
Loïc Lecrenier
d38cc73630 Add one more filter "integration" test 2022-12-07 14:38:25 +01:00
Loïc Lecrenier
e688581c36 Add tests for facet range search on different field ids 2022-12-07 14:38:21 +01:00
Loïc Lecrenier
4ac8f96342 Simplify implementation of equality condition in filters 2022-12-07 14:38:18 +01:00