2245: Add test to validate cli r=irevoire a=MarinPostma
Follow-up on #2242 and #2243.
Add a test to make sure the CLI is valid, and add a CI task to run the tests in debug mode so we hit debug assertions.
FYI `@curquiza`, because of the CI changes.
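For context, a minimal sketch of what such a validity test could look like with clap's derive API (the `Opt` struct and its single flag are illustrative stand-ins, not the real Meilisearch definitions):

```rs
use clap::Parser;

/// Illustrative stand-in for the real CLI options struct.
#[derive(Debug, Parser)]
struct Opt {
    /// Address the HTTP server should listen on.
    #[clap(long, default_value = "127.0.0.1:7700")]
    http_addr: String,
}

#[test]
fn cli_is_valid() {
    // Building and running the parser exercises clap's internal debug
    // assertions, which only fire in debug builds -- hence the CI task
    // that runs the tests in debug.
    assert!(Opt::try_parse_from(["meilisearch"]).is_ok());
}
```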
Co-authored-by: ad hoc <postma.marin@protonmail.com>
2246: Release v0.26.1: bring "fix panic at start" commit (#2243) into stable r=MarinPostma a=curquiza
2 commits:
- cherry-pick `32843f30d973349122ec5a37469ee09e6002b6f3` (merged in #2243)
- change the version in the Cargo.toml files (from v0.26.0 to v0.26.1)
Co-authored-by: ad hoc <postma.marin@protonmail.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2244: chore(all): bump milli r=curquiza a=MarinPostma
Continues the work initiated by `@psvnlsaikumar` in #2228.
Co-authored-by: Sai Kumar <psvnlsaikumar@gmail.com>
2243: bug(http): fix panic on startup r=MarinPostma a=MarinPostma
This seems to fix #2242.
I am not sure why this doesn't reproduce on v0.26.0, so we should remain vigilant.
`@curquiza` FYI
Co-authored-by: ad hoc <postma.marin@protonmail.com>
472: Remove useless variables in proximity r=Kerollmops a=ManyTheFish
I was going through the plane sweep algorithm looking for some inspiration, and I discovered that we have useless variables that were not detected because of the recursive function.
Co-authored-by: ManyTheFish <many@meilisearch.com>
468: Add a new error message when the filterableAttributes are empty r=Kerollmops a=brunoocasali
Fixes https://github.com/meilisearch/meilisearch/issues/2140
Is there a good way to reduce the duplication here? Maybe by adding a shared function? I don't know the most idiomatic way to do that, so I'd appreciate any tips!
Another doubt is related to the duplicated call:
```rs
// filter.rs:373
FilterError::AttributeNotFilterable {
    attribute,
    filterable: filterable_fields.into_iter().collect::<Vec<_>>().join(" "),
},
```
and
```rs
// filter.rs:424
return Err(point[0].as_external_error(FilterError::AttributeNotFilterable {
    attribute: "_geo",
    filterable: filterable_fields.into_iter().collect::<Vec<_>>().join(" "),
}))?;
```
I think we could move the `filterable_fields.into_iter().collect::<Vec<_>>().join(" ")` directly into the error handling, like the sortable error does. I did that in the last commit; if this is something to avoid, let me know and I can remove it :)
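For illustration, a minimal sketch of what moving the join into the error handling could look like (a trimmed-down, hypothetical version of the types; the real `FilterError` and message wording in milli differ):

```rs
use std::collections::HashSet;
use std::fmt;

// Hypothetical, simplified variant: it stores the raw set of filterable fields...
enum FilterError {
    AttributeNotFilterable { attribute: String, filterable_fields: HashSet<String> },
}

// ...and the joining happens once, in the Display impl, instead of at every
// construction site.
impl fmt::Display for FilterError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FilterError::AttributeNotFilterable { attribute, filterable_fields } => {
                let filterable = filterable_fields
                    .iter()
                    .map(String::as_str)
                    .collect::<Vec<_>>()
                    .join(" ");
                write!(
                    f,
                    "`{}` is not filterable, available filterable attributes are: {}",
                    attribute, filterable
                )
            }
        }
    }
}
```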
Co-authored-by: Bruno Casali <brunoocasali@gmail.com>
466: Bump version to 0.23.1 r=curquiza a=Kerollmops
This PR bumps the crate versions to 0.23.1. Nothing seems to be breaking in the next release.
Co-authored-by: Kerollmops <clement@meilisearch.com>
467: optimize prefix database r=Kerollmops a=MarinPostma
This PR introduces two optimizations that greatly improve the speed of computing the prefix databases.
- The time that it takes to create the prefix FST has been divided by 5 by inverting the way we iterated over the words FST.
- We unconditionally and needlessly checked for documents to remove in `word_prefix_pair`, which caused an iteration over the whole database.
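A minimal sketch of the second point, with hypothetical names (the real update code works on heed/grenad structures that are not shown here):

```rs
use roaring::RoaringBitmap;

/// Sketch: bail out early when there are no documents to remove, instead of
/// unconditionally iterating over the whole `word_prefix_pair` database.
fn remove_from_word_prefix_pairs(to_remove: &RoaringBitmap) {
    if to_remove.is_empty() {
        // Nothing to delete: skip the costly full-database iteration.
        return;
    }
    // ...otherwise iterate over the database and strip `to_remove` from
    // each posting list.
}
```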
Co-authored-by: ad hoc <postma.marin@protonmail.com>
2238: cargo: use resolver 2 r=MarinPostma a=happysalada
# Pull Request
## What does this PR do?
Use resolver 2 from Cargo.
This mainly enables propagating the `--no-default-features` flag to workspace crates.
I previously thought, mistakenly, that it was enough to have edition 2021 enabled. However, it turns out that for virtual workspaces this needs to be defined explicitly.
https://doc.rust-lang.org/edition-guide/rust-2021/default-cargo-resolver.html
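Concretely, for a virtual workspace this is a one-line addition to the root manifest (a sketch, with the member list elided):

```toml
# Root Cargo.toml of the virtual workspace
[workspace]
resolver = "2"
members = [
    # ... workspace crates ...
]
```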
This will also slightly change how your dependencies are compiled. See https://blog.rust-lang.org/2021/03/25/Rust-1.51.0.html#cargos-new-feature-resolver for more details.
Just to give a bit more context, this is for usage in NixOS. I tried to do the upgrade today with the latest version, and the `--no-default-features` flag is simply ignored.
Let me know if you need more details of course.
## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: happysalada <raphael@megzari.com>
465: Update dependencies r=ManyTheFish a=Kerollmops
This PR upgrades and updates this crate's dependencies, but first it removes three dependencies that we don't use anymore. I used [cargo udeps](https://github.com/est31/cargo-udeps) to find them ⬆️
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
464: exporting heed to avoid having different versions of Heed in Meilisearch r=curquiza a=psvnlsaikumar
# Pull Request
## What does this PR do?
Fixes the issue in meilisearch https://github.com/meilisearch/meilisearch/issues/2210
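For illustration, the re-export itself is a one-liner at the library root (a sketch; the exact placement in milli's lib.rs may differ):

```rs
// Re-export heed so that downstream crates (e.g. Meilisearch) use
// `milli::heed` and always get the exact heed version milli was built with.
pub use heed;
```

Downstream code can then refer to types as `milli::heed::EnvOpenOptions` instead of depending on heed directly.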
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: psvnl sai kumar <psvnlsaikumar@gmail.com>
2233: Change CI name for publishing binaries r=Kerollmops a=curquiza
Minor change regarding the CI job names. Should not impact the usage.
Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
457: Avoid iterating on big databases when useless r=Kerollmops a=Kerollmops
This PR makes the prefix database updates avoid iterating on big grenad files when it is unnecessary. We introduced this regression in #436 but it went unnoticed.
---
According to the following benchmark results, we take more time when we index documents in one run than before #436. This is probably because, instead of computing the prefix databases by iterating on LMDB, we now iterate directly on the grenad file, which could be slower to iterate on and could be the cause of the slowdown.
I just pushed a commit that tests this branch with the new unreleased version of grenad, where some work was done to speed up iteration on grenad files. [The benchmarks for this last commit](https://github.com/meilisearch/milli/actions/runs/1927187408) are currently running. You can [see the diff](https://github.com/meilisearch/grenad/compare/v0.4.1...main) between the v0.4 and the unreleased v0.5 versions of grenad.
```diff
group indexing_benchmark-multi-batch-indexing-before-speed-up_45f52620 indexing_stop-iterating-on-big-grenad-files_ac8b85c4
----- ---------------------------------------------------------------- ----------------------------------------------------
+ indexing/Indexing songs in three batches with default settings 1.12 57.7±2.14s ? ?/sec 1.00 51.3±2.76s ? ?/sec
- indexing/Indexing wiki 1.00 917.3±30.01s ? ?/sec 1.10 1008.4±38.27s ? ?/sec
+ indexing/Indexing wiki in three batches 1.10 1091.2±32.73s ? ?/sec 1.00 995.5±24.33s ? ?/sec
```
Co-authored-by: Kerollmops <clement@meilisearch.com>
461: Add a new error message when the `valid_fields` is empty r=curquiza a=brunoocasali
I've created a test case to handle the new error formatting behavior, but I'm not sure if:
- this is the right place to add the test?
- this is the best way to test this behavior?
I'm also not sure about the `match` implementation: is it required, or would a plain `if` statement be fine as well?
I left the two messages written out literally, without "reusing the prefix" in the implementation, because I think this could help the "searchability" of the errors in the future.
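For what it's worth, both shapes are equivalent; here is a sketch of them on a hypothetical message builder (names and wording are illustrative, not the actual milli code):

```rs
use std::collections::BTreeSet;

// `if` version: branch on emptiness and keep the two messages literal.
fn sort_error_message(field: &str, valid_fields: &BTreeSet<String>) -> String {
    if valid_fields.is_empty() {
        format!("Attribute `{}` is not sortable. This index has no configured sortable attributes.", field)
    } else {
        format!(
            "Attribute `{}` is not sortable. Available sortable attributes are: {}.",
            field,
            valid_fields.iter().map(String::as_str).collect::<Vec<_>>().join(" ")
        )
    }
}

// `match` version: same behavior, the emptiness check just reads as two arms.
fn sort_error_message_match(field: &str, valid_fields: &BTreeSet<String>) -> String {
    match valid_fields.is_empty() {
        true => format!("Attribute `{}` is not sortable. This index has no configured sortable attributes.", field),
        false => format!(
            "Attribute `{}` is not sortable. Available sortable attributes are: {}.",
            field,
            valid_fields.iter().map(String::as_str).collect::<Vec<_>>().join(" ")
        ),
    }
}
```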
# Pull Request
## What does this PR do?
Fixes https://github.com/meilisearch/meilisearch/issues/2140
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [ ] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Bruno Casali <brunoocasali@gmail.com>