Commit Graph

8189 Commits

Author SHA1 Message Date
bors[bot]
1f1beae077
Merge #729
729: Fix distincted exhaustive hits r=Kerollmops a=ManyTheFish

This PR changes the name and behavior of `bucket_candidates`:
- `bucket_candidates` becomes `initial_candidates`, which is less confusing
- `initial_candidates` is no longer a simple `RoaringBitmap` but an enum that lets us specify whether the candidates are exhaustive or not
- this enum ensures that modifications are allowed only if the candidates are not already exhaustive.

The bug occurred because `initial_candidates` is modified during the bucket sort, so that the estimation becomes more and more precise as the search progresses. This was an issue when `initial_candidates` was already exhaustive. Now, if the candidates are exhaustive, no modifications are made.
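
The behavior described above can be sketched as follows. This is a minimal illustration and not the code from the PR: the variant names, the method names, and the use of `BTreeSet<u32>` in place of `RoaringBitmap` are all assumptions made to keep the example self-contained.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the real type; `BTreeSet<u32>` replaces `RoaringBitmap`.
#[derive(Debug, Clone, PartialEq)]
pub enum InitialCandidates {
    /// The set is only an estimation and may still be refined.
    Estimated(BTreeSet<u32>),
    /// The set is already complete; later modifications are ignored.
    Exhaustive(BTreeSet<u32>),
}

impl InitialCandidates {
    /// Adds candidates found during the bucket sort,
    /// unless the set is already exhaustive.
    pub fn union_with(&mut self, other: &BTreeSet<u32>) {
        if let InitialCandidates::Estimated(set) = self {
            set.extend(other.iter().copied());
        }
    }

    /// Number of candidates, whatever the variant.
    pub fn len(&self) -> usize {
        match self {
            InitialCandidates::Estimated(set) | InitialCandidates::Exhaustive(set) => set.len(),
        }
    }

    /// Whether the count can be reported as exact.
    pub fn is_exhaustive(&self) -> bool {
        matches!(self, InitialCandidates::Exhaustive(_))
    }
}
```

Putting the guard inside the type means callers cannot accidentally refine a set that is already exact.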

Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-12-08 09:26:34 +00:00
ManyTheFish
55724f2412 Introduce an initial candidates set that makes the difference between an exhaustive count and an estimation 2022-12-08 09:41:34 +01:00
ManyTheFish
6d50ea0830 add tests 2022-12-08 08:56:57 +01:00
bors[bot]
f4dc4c5d8d
Merge #3210
3210: Fix `MDB_PAGE_FULL` by bumping LMDB r=Kerollmops a=Kerollmops

This PR fixes #3062 by upgrading LMDB to the latest version.

The changes were made in https://github.com/meilisearch/lmdb/pull/1 and https://github.com/meilisearch/lmdb-rs/pull/12. As heed directly depends on the latest main commit of https://github.com/meilisearch/lmdb-rs, we can bump the `lmdb-rkv-sys` dependency in the Meilisearch _Cargo.lock_ by running:

```
cargo update -p lmdb-rkv-sys
```

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-12-07 16:21:23 +00:00
Loïc Lecrenier
f37c86e0b2 Add some integration tests on the sort criterion 2022-12-07 15:59:33 +01:00
jiangbo212
717dd36547 Merge branch 'fix-3037' of github.com:jiangbo212/meilisearch into fix-3037 2022-12-07 22:54:16 +08:00
jiangbo212
538030c2da change NameTempFile to tempfile() 2022-12-07 22:47:32 +08:00
bors[bot]
098c410612
Merge #727
727: Fix bug in filter search r=Kerollmops a=loiclec

# Pull Request

## Related issue
Fixes (partially, until merged into meilisearch) https://github.com/meilisearch/meilisearch/issues/3178

## What does this PR do?
The most important change is this one:
```rust
    // in milli/src/search/facet/facet_range_search.rs, line 239
    let should_stop = {
        match self.right {
            Bound::Included(right) => right < previous_key.left_bound,
            Bound::Excluded(right) => right <= previous_key.left_bound,
            Bound::Unbounded => false,
        }
    };
```
where the operations `<` and `<=` between the two branches were switched. This caused (very few) documents to be missing from filter results.
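
To see why the choice of operator matters, here is a small standalone version of the same check. The function name and the use of `u32` keys are assumptions for illustration; the real code compares encoded facet values.

```rust
use std::ops::Bound;

/// Returns `true` when a facet group starting at `group_left_bound`
/// lies entirely past the right end of the requested range.
fn should_stop(right: Bound<u32>, group_left_bound: u32) -> bool {
    match right {
        // An inclusive bound still accepts a group that starts exactly on it.
        Bound::Included(right) => right < group_left_bound,
        // An exclusive bound rejects a group that starts exactly on it.
        Bound::Excluded(right) => right <= group_left_bound,
        Bound::Unbounded => false,
    }
}
```

With the operators switched, a range ending at `Included(10)` would stop at a group starting at `10` and skip documents that should match.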

The second change is a simplification of the algorithm for filters such as `field = value`, where we now perform a direct query into the "Level 0" of the facet db to retrieve the docids instead of invoking the full facet search algorithm. This change is done in `milli/src/search/facet/filter.rs`.
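
As a rough sketch of that idea, an equality filter can be answered with a single keyed lookup. The key layout `(field id, level, value)`, the type aliases, and the function name below are assumptions for illustration, with an in-memory `BTreeMap` standing in for the facet database.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical key layout for this sketch: (field id, level, facet value).
type FacetKey = (u16, u8, String);
/// Hypothetical in-memory stand-in for the facet database.
type FacetDb = BTreeMap<FacetKey, BTreeSet<u32>>;

/// Answers `field = value` with one lookup at level 0,
/// without walking the higher levels of the facet tree.
fn docids_for_equality(db: &FacetDb, field_id: u16, value: &str) -> BTreeSet<u32> {
    db.get(&(field_id, 0, value.to_string()))
        .cloned()
        .unwrap_or_default()
}
```

A missing key simply yields an empty set of document ids.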

I have added yet more insta-snapshot tests, rechecked the content of the snapshots, and added some integration tests as well. 

This is purely a fix in the search algorithms. Based on this PR alone, a dump will not be necessary to switch from v0.30.1 (where this bug is present) to v0.30.2 (where this PR is merged).


Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
2022-12-07 14:34:59 +00:00
Kerollmops
1d5294d11a
Bump lmdb version 2022-12-07 15:29:56 +01:00
bors[bot]
ee10cb8c87
Merge #726
726: Update the contributing.md r=curquiza a=irevoire



Co-authored-by: Tamo <tamo@meilisearch.com>
2022-12-07 13:59:04 +00:00
Loïc Lecrenier
d38cc73630 Add one more filter "integration" test 2022-12-07 14:38:25 +01:00
Loïc Lecrenier
e688581c36 Add tests for facet range search on different field ids 2022-12-07 14:38:21 +01:00
Loïc Lecrenier
4ac8f96342 Simplify implementation of equality condition in filters 2022-12-07 14:38:18 +01:00
Loïc Lecrenier
1c9555566e Fix bug in facet range search 2022-12-07 14:38:14 +01:00
Loïc Lecrenier
303d740245 Prepare fix within facet range search
By creating snapshots and updating the format of the existing
snapshots. The next commit will apply the fix, which will show
its effects cleanly on the old and new snapshot tests
2022-12-07 14:38:10 +01:00
bors[bot]
34c3e5ec5e
Merge #3208
3208: Stop snapshotting the version of meilisearch in the dump r=Kerollmops a=irevoire

It might change, and we don't want to update this test every time we make a new release.


Co-authored-by: Tamo <tamo@meilisearch.com>
2022-12-07 12:54:55 +00:00
Tamo
1c3a326199
stop snapshotting the version of meilisearch in the dump
It might change, and we don't want to update this test every time we make a new release.
2022-12-07 13:26:02 +01:00
Tamo
250743885d
add a sentence about installing rust-nightly 2022-12-07 12:31:43 +01:00
bors[bot]
34c0f11c26
Merge #3207
3207: Add release check when starting latest CI r=Kerollmops a=curquiza

Adding this to have the same kind of check before starting to move the latest tag

<img width="737" alt="Capture d’écran 2022-12-07 à 12 18 33" src="https://user-images.githubusercontent.com/20380692/206165868-18a2be7c-78ec-48c9-acb9-d7f60797c2e3.png">

Also, removing an unused script

Co-authored-by: curquiza <clementine@meilisearch.com>
2022-12-07 11:27:47 +00:00
Tamo
5eecb8489d
Update CONTRIBUTING.md
Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-12-07 12:23:12 +01:00
Tamo
0e5c3b1f64
Update CONTRIBUTING.md
Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-12-07 12:23:06 +01:00
curquiza
be300138e4 Add release check when starting latest CI 2022-12-07 12:22:44 +01:00
bors[bot]
2ed6017603
Merge #3204
3204: Bring back v0.30.1 changes to `main` r=curquiza a=curquiza

I was not able to just import `release-v0.30.1` to `main`, see:
<img width="1371" alt="Capture d’écran 2022-12-06 à 20 03 50" src="https://user-images.githubusercontent.com/20380692/206000844-b39b3063-7da2-475f-b3e4-1791c39a7c2f.png">

So I cherry-picked the commits.

⚠️ ⚠️ ⚠️ I had a git conflict here

<img width="730" alt="Capture d’écran 2022-12-06 à 20 09 04" src="https://user-images.githubusercontent.com/20380692/206001007-f56bc28f-c0b1-46a0-bb60-cce4e73b9584.png">


⚠️ ⚠️ ⚠️ Check out carefully how I fixed it


Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-12-07 11:08:37 +00:00
Kerollmops
c1337f9e08
Update dump snap to new version 2022-12-07 11:48:29 +01:00
bors[bot]
9acac28574
Merge #3128
3128: Bumps cargo_toml version to most up to date r=curquiza a=colbsmcdolbs

# Pull Request

## Related issue
Fixes #3127

## What does this PR do?
- The README of this repository states that one package is not up to date. As a matter of due diligence, I have bumped the version number of the package. No tests fail when running on Windows.

## PR checklist
Please check if your PR fulfills the following requirements:
- [X] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [X] Have you read the contributing guidelines?
- [X] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Colby Allen <colbyjayallen@gmail.com>
2022-12-07 10:31:25 +00:00
jiangbo212
cb1d184904 fmt fix 2022-12-07 17:04:24 +08:00
jiangbo212
2841b09789
Merge branch 'meilisearch:main' into fix-3037 2022-12-07 16:30:21 +08:00
jiangbo212
35f3dd68b6 error change and tokio file use change 2022-12-07 16:20:36 +08:00
Kerollmops
f1de3aa75a Make the tests use MB to trigger page size issues 2022-12-06 20:10:10 +01:00
Kerollmops
e4e4370a3c Clamp the databases size to the page size 2022-12-06 20:09:49 +01:00
Kerollmops
24c79b79f9 Bump milli to v0.37.1 2022-12-06 20:05:52 +01:00
curquiza
5db7c4057c Update version for the next release (v0.30.1) in Cargo.toml files 2022-12-06 20:05:46 +01:00
Tamo
f53bdc4320
update the contributing.md 2022-12-06 17:41:05 +01:00
bors[bot]
0a301b5f88
Merge #723
723: Fix bug in handling of soft deleted documents when updating settings r=Kerollmops a=loiclec

# Pull Request

## Related issue
Fixes (partially, until merged into meilisearch) https://github.com/meilisearch/meilisearch/issues/3021

## What does this PR do?
This PR fixes the bug where a `missing key in documents database` internal error message could appear when indexing documents.

When updating the settings, before clearing the database and before creating the transform output, we now modify the `ExternalDocumentsIds` structure to get rid of all references to soft deleted document ids in its FSTs.

It used to be that updating the settings would clear the soft-deleted document ids, but keep the original `ExternalDocumentsIds` structure. As a consequence of this, when processing a future document addition, we could wrongly believe that a document was being replaced when, in fact, it was a completely new document. See the tests `bug_3021_first`, `bug_3021_second`, and `bug_3021` for a minimal test case that would have reproduced the issue.
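
The cleanup step can be pictured with the simplified model below. It is only a sketch: the real `ExternalDocumentsIds` stores its mapping in FSTs, while this version uses a `HashMap`, and the field and method names are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical simplification of the external-to-internal id mapping.
pub struct ExternalDocumentsIds {
    pub external_to_internal: HashMap<String, u32>,
}

impl ExternalDocumentsIds {
    /// Drops every entry that points at a soft-deleted internal id, so that
    /// a later addition with the same external id is treated as a new document.
    pub fn purge_soft_deleted(&mut self, soft_deleted: &BTreeSet<u32>) {
        self.external_to_internal
            .retain(|_, internal| !soft_deleted.contains(internal));
    }

    /// Looks up the internal id of a document, if it is still referenced.
    pub fn get(&self, external_id: &str) -> Option<u32> {
        self.external_to_internal.get(external_id).copied()
    }
}
```

If the purge is skipped while the soft-deleted set is cleared, `get` keeps returning an internal id whose document no longer exists, which is the situation described above.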
 
We need to take special care to:
- evaluate how users should update to v0.30.1 (containing this fix): dump? reimporting all documents from scratch?
- understand IF/HOW this bug could have caused duplicate documents to be returned 
- and evaluate the correctness of the fix, of course :)


Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
2022-12-06 14:37:38 +00:00
Loïc Lecrenier
a993b68684 Cargo fmt >:-( 2022-12-06 15:22:10 +01:00
Loïc Lecrenier
80c7a00567 Fix compilation error in tests of settings update 2022-12-06 15:19:26 +01:00
Loïc Lecrenier
67d8cec209 Fix bug in handling of soft deleted documents when updating settings 2022-12-06 15:09:19 +01:00
bors[bot]
2867d2e91a
Merge #3190
3190: Fix the dump date-import of the dumpv4 r=irevoire a=irevoire

# Pull Request
After merging https://github.com/meilisearch/meilisearch/pull/3012, I realized that the tests on the date of the dump-v4 were still ignored. I fixed them and then noticed that #3012 wasn't working properly.

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/2987 a second time


`@funilrys` since you wrote most of the code you might be interested, but don't feel obligated to review this code. 
Someone from the team will double-check it works 😁 

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-12-06 10:47:00 +00:00
bors[bot]
2a846aaae7
Merge #719
719: Add more members of `filter_parser` to `milli::` & `From<&str>` implementation for `Token` r=Kerollmops a=GregoryConrad

## What does this PR do?
The current `milli::Filter` and `milli::FilterCondition` APIs require working with some members of `filter_parser` directly that `milli::` does *not* re-export to its users (at least when not parsing input using `parse`). Also, using `filter_parser` does not make sense when using milli from an embedded context where there is no query to parse.

Instead of reworking `milli::Filter` and `milli::FilterCondition`, this PR adds two non-breaking changes that ease the use of milli:
- Re-exports more members of the dependent version of `filter_parser` in `milli`
- Implements `From<&str>` for `filter_parser::Token`
  - This will also allow some basic tests that need to create a `Token` from a string to avoid some boilerplate.

In conjunction, both of these will allow milli users to easily create a `Token` from a `&str` without needing to add `filter_parser` as an extra dependency.

Note: I wanted to use `FromStr` for the `From` implementation; however, it requires returning a `Result` which is not needed for the conversion. Thus, I just left it as `From<&str>`.
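
The shape of such an implementation is sketched below. The `Token` type here is a hypothetical, simplified stand-in: the real `filter_parser::Token` carries more information than a bare string slice.

```rust
/// Hypothetical, simplified token type for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Token<'a> {
    value: &'a str,
}

impl<'a> Token<'a> {
    /// The text this token was built from.
    pub fn value(&self) -> &'a str {
        self.value
    }
}

/// Infallible conversion: no `Result` is needed, unlike
/// `FromStr::from_str`, which returns `Result<Self, Self::Err>`.
impl<'a> From<&'a str> for Token<'a> {
    fn from(value: &'a str) -> Self {
        Token { value }
    }
}
```

With this in place, a caller can write `let token: Token = "channel".into();` instead of building the token by hand.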

Co-authored-by: Gregory Conrad <gregorysconrad@gmail.com>
2022-12-06 10:36:00 +00:00
bors[bot]
1458a12531
Merge #3197
3197: Revert "Upgrade alpine 3.16 to 3.17" r=irevoire a=curquiza

Reverts meilisearch/meilisearch#3189

This is because `rust:alpine3.17` does not exist, so our scheduled CI failed: https://github.com/meilisearch/meilisearch/actions/runs/3626327181

`@ivanionut` for your information. I'm sorry, I should have checked more carefully before accepting the PR; this is my mistake.


Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-12-06 10:25:11 +00:00
Clémentine Urquizar - curqui
cbb8d0f97b
Revert "Upgrade alpine 3.16 to 3.17" 2022-12-06 11:09:57 +01:00
Tamo
bef81065f9 return the same time in case we didn't find a created or updated at 2022-12-06 11:03:23 +01:00
Tamo
180511795b
Update dump/src/reader/v4/mod.rs fix typo
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2022-12-06 10:53:43 +01:00
bors[bot]
3bef6e6690
Merge #3175
3175: Rename dump command from --dumps-dir to --dump-dir r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #3132 

## What does this PR do?
- Rename the dump commands, env variables and default config

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2022-12-06 09:49:42 +00:00
bors[bot]
d6eacb2aac
Merge #722
722: Geosearch for zero radius r=irevoire a=amab8901

# Pull Request

## Related issue
Fixes #3167 (https://github.com/meilisearch/meilisearch/issues/3167)

## What does this PR do?
- allows Geosearch with zero radius to return the specified location when the coordinates match perfectly (instead of returning nothing). See link for more details.
- new attempt on https://github.com/meilisearch/milli/pull/713
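
One way to get this behavior is to make the radius comparison inclusive. The sketch below assumes a haversine distance and hypothetical function names; it illustrates the idea and is not the code changed by the PR.

```rust
/// Mean Earth radius in meters, used by the haversine formula.
const EARTH_RADIUS_M: f64 = 6_371_000.0;

/// Great-circle distance in meters between two `[lat, lng]` points in degrees.
fn haversine_m(a: [f64; 2], b: [f64; 2]) -> f64 {
    let (lat1, lat2) = (a[0].to_radians(), b[0].to_radians());
    let dlat = lat2 - lat1;
    let dlng = (b[1] - a[1]).to_radians();
    let h = (dlat / 2.0).sin().powi(2) + lat1.cos() * lat2.cos() * (dlng / 2.0).sin().powi(2);
    2.0 * EARTH_RADIUS_M * h.sqrt().asin()
}

/// Inclusive check: a point exactly on the center matches a radius of zero.
/// A strict `<` would reject it, because the distance is then 0 and `0 < 0` is false.
fn within_radius(center: [f64; 2], point: [f64; 2], radius_m: f64) -> bool {
    haversine_m(center, point) <= radius_m
}
```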

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: amab8901 <amab8901@protonmail.com>
Co-authored-by: Tamo <irevoire@protonmail.ch>
2022-12-05 19:57:08 +00:00
Tamo
212dbfa3b5
Update milli/src/search/facet/filter.rs 2022-12-05 20:56:21 +01:00
amab8901
456da5de9c Geosearch for zero radius 2022-12-05 20:11:46 +01:00
bors[bot]
46e26ab550
Merge #720
720: Make soft deletion optional in document addition and deletion + add lots of tests r=irevoire a=loiclec

# Pull Request

## What does this PR do?
When debugging recent issues, I created a few unit tests in the hope of reproducing the bugs I was looking for. In the end, I didn't find any, but I thought it would still be good to keep those tests.

More importantly, I added a field to the `DeleteDocuments` and `IndexDocuments` builders, called `disable_soft_deletion`. If set to `true`, the indexing/deletion will never add documents to the `soft_deleted_documents_ids` and instead perform a real deletion of the documents from the databases.
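
The effect of the flag can be sketched with a toy in-memory index. Everything here apart from the names `DeleteDocuments`, `disable_soft_deletion`, and `soft_deleted_documents_ids` is an assumption for illustration: the `Index` struct, its fields, and the method signatures do not mirror the real milli API.

```rust
use std::collections::BTreeSet;

/// Toy index: ids of stored documents plus the ids marked as soft deleted.
#[derive(Default)]
pub struct Index {
    pub documents: BTreeSet<u32>,
    pub soft_deleted_documents_ids: BTreeSet<u32>,
}

/// Hypothetical, simplified deletion builder.
pub struct DeleteDocuments<'i> {
    index: &'i mut Index,
    to_delete: BTreeSet<u32>,
    disable_soft_deletion: bool,
}

impl<'i> DeleteDocuments<'i> {
    pub fn new(index: &'i mut Index) -> Self {
        DeleteDocuments { index, to_delete: BTreeSet::new(), disable_soft_deletion: false }
    }

    pub fn disable_soft_deletion(&mut self, disable: bool) {
        self.disable_soft_deletion = disable;
    }

    pub fn delete_document(&mut self, id: u32) {
        self.to_delete.insert(id);
    }

    /// Applies the deletion and returns the number of documents affected.
    pub fn execute(self) -> usize {
        let DeleteDocuments { index, to_delete, disable_soft_deletion } = self;
        let mut deleted = 0;
        for id in to_delete {
            // Skip ids that are unknown or already soft deleted.
            if !index.documents.contains(&id) || index.soft_deleted_documents_ids.contains(&id) {
                continue;
            }
            if disable_soft_deletion {
                // Hard deletion: the document is removed right away.
                index.documents.remove(&id);
            } else {
                // Soft deletion: the document stays stored and is only marked.
                index.soft_deleted_documents_ids.insert(id);
            }
            deleted += 1;
        }
        deleted
    }
}
```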

For the new tests, I have:
- Improved the insta-snapshot format of the `external_documents_ids` structure
- Added more tests for the facet DB indexing, deletion, and search algorithms, making sure to test them when the facet DB contains strings (instead of numbers) as well.
- Added more tests for the incremental indexing of the prefix proximity databases. For example, to see if documents are replaced correctly and if common prefixes are deleted correctly.
- Added tests that mix soft deletion and hard deletion, including when processing batches of document updates. 


Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
2022-12-05 18:26:01 +00:00
bors[bot]
9b23885e85
Merge #3188
3188: re-enable the dump test on the dates r=irevoire a=irevoire

I just noticed that we have the real date in the dump-v1, unlike in the dump-v2/3/4/5, so we can ensure it doesn't change unexpectedly 👍 

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-12-05 18:10:22 +00:00
bors[bot]
8b46093117
Merge #3189
3189: Upgrade alpine 3.16 to 3.17 r=curquiza a=ivanionut

Upgrade alpine 3.16 to 3.17

Co-authored-by: Ivan Ionut <ivan.ionut@gmail.com>
2022-12-05 17:39:10 +00:00