127 Commits

Author SHA1 Message Date
Jakub Jirutka
13f1277637 Allow to disable specialized tokenizations (again)
In PR #2773, I added the `chinese`, `hebrew`, `japanese` and `thai`
feature flags to allow meilisearch to be built without huge specialized
tokenizations that took up 90% of the meilisearch binary size.
Unfortunately, due to some recent changes, this doesn't work anymore.
The problem lies in excessive use of the `default` feature flag, which
infects the dependency graph.

Instead of adding `default-features = false` here and there, it's easier
and more future-proof to not declare `default` in `milli` and
`meilisearch-types`. I've renamed it to `all-tokenizers`, which also
makes it a bit clearer what it's about.
2023-05-04 15:45:40 +02:00
meili-bors[bot]
e0537c3870
Merge #3720
3720: Change links of docs everywhere r=curquiza a=curquiza

Completely fixes #3668 

Co-authored-by: curquiza <clementine@meilisearch.com>
2023-05-04 10:07:41 +00:00
curquiza
30edba3497 Update links of the docs 2023-05-03 19:14:57 +02:00
Louis Dureuil
90bc230820
Merge remote-tracking branch 'origin/main' into search-refactor
Conflicts | resolution
----------|-----------
Cargo.lock | added mimalloc
Cargo.toml | took origin/main version
milli/src/search/criteria/exactness.rs | deleted after checking it was only clippy changes
milli/src/search/query_tree.rs | deleted after checking it was only clippy changes
2023-05-03 12:19:06 +02:00
Kerollmops
47b66e49b8
Upgrade the compatible versions of the dependencies 2023-04-24 17:50:52 +02:00
ManyTheFish
1ba8a40d61 Remove formatting benchmarks because they can't be isolated easily anymore 2023-04-06 15:10:16 +02:00
ManyTheFish
37489fd495 Return an internal error when a matching word is invalid 2023-03-01 19:05:16 +01:00
Tamo
74d1a67a99 Use the workspace inheritance feature of rust 1.64 2023-02-15 13:51:07 +01:00
Clément Renault
1d507c84b2
Fix the formatting 2023-01-17 18:25:55 +01:00
Clément Renault
1b78231e18
Make clippy happy 2023-01-17 18:25:54 +01:00
Kerollmops
97005dd505
Bump the milli-imported crates to v1.0.0 2023-01-16 16:29:12 +01:00
curquiza
9e32ac7cb2 Update version for the next release (v0.39.0) in Cargo.toml files 2023-01-11 15:05:06 +00:00
Loïc Lecrenier
02fd06ea0b Integrate deserr 2023-01-11 13:56:47 +01:00
curquiza
c72535531b Update version for the next release (v0.38.0) in Cargo.toml files 2022-12-19 16:35:38 +00:00
Loïc Lecrenier
80588daae5 Fix compilation error in formatting benches 2022-11-28 10:27:15 +01:00
curquiza
cd5aaa3a9f Update version for the next release (v0.37.0) in Cargo.toml files 2022-11-17 12:50:07 +00:00
Kerollmops
d00d2aab3f Update version for the next release (v0.36.0) in Cargo.toml files 2022-11-09 11:03:09 +00:00
Kerollmops
bd12989610 Update version for the next release (v0.35.1) in Cargo.toml files 2022-11-08 14:31:39 +00:00
bors[bot]
d3f95e6c69
Merge #671
671: Update version for the next release (v0.35.0) in Cargo.toml files r=Kerollmops a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2022-10-26 11:58:05 +00:00
curquiza
e883bccc76 Update version for the next release (v0.35.0) in Cargo.toml files 2022-10-26 11:43:54 +00:00
bors[bot]
c8f16530d5
Merge #616
616: Introduce an indexation abortion function when indexing documents r=Kerollmops a=Kerollmops



Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-10-26 11:41:18 +00:00
curquiza
f3874d58b9 Update version for the next release (v0.34.0) in Cargo.toml files 2022-10-24 10:13:25 +00:00
Kerollmops
6603437cb1
Introduce an indexation abortion function when indexing documents 2022-10-17 17:28:03 +02:00
Loïc Lecrenier
4c481a8947 Upgrade all dependencies 2022-10-17 13:05:56 +02:00
Loïc Lecrenier
53503f09ca Make milli's default features optional in other executable targets 2022-10-12 09:22:05 +02:00
curquiza
753e76d451 Update version for the next release (v0.33.4) in Cargo.toml files 2022-09-13 13:55:50 +00:00
curquiza
077dcd2002 Update version for the next release (v0.33.3) in Cargo.toml files 2022-09-07 15:48:53 +00:00
ManyTheFish
97a04887a3 Update version for next release (v0.33.2) in Cargo.toml 2022-09-01 11:47:23 +02:00
Clémentine Urquizar
c3363706c5
Update version for next release (v0.33.1) in Cargo.toml 2022-08-31 11:37:27 +02:00
Clémentine Urquizar
9ed7324995
Update version for next release (v0.33.0) 2022-08-23 11:47:48 +02:00
bors[bot]
18886dc6b7
Merge #598
598: Matching query terms policy r=Kerollmops a=ManyTheFish

## Summary

Implement several strategies for handling optional words.

## Content

Replace `optional_words` boolean with an enum containing several term matching strategies:
```rust
pub enum TermsMatchingStrategy {
    // remove last word first
    Last,
    // remove first word first
    First,
    // remove more frequent word first
    Frequency,
    // remove smallest word first
    Size,
    // only one of the words is mandatory
    Any,
    // all words are mandatory
    All,
}
```

All strategies implemented during the prototype are kept, but only `Last` and `All` will be published by Meilisearch in the `v0.29.0` release.
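
To make the two published strategies concrete, here is a minimal sketch of how a query could be relaxed step by step. It is one plausible reading of the enum comments above, not the code from the PR: the `relaxed_queries` helper and its return shape are hypothetical, and the strategies that need extra data (`Frequency`, `Size`, `Any`) are deliberately left unmodeled.

```rust
/// Copy of the enum from the PR description, with derives added for the sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TermsMatchingStrategy {
    Last,
    First,
    Frequency,
    Size,
    Any,
    All,
}

/// Hypothetical helper (not part of milli): lists the queries to try, from the
/// strictest (all words) to the most relaxed, for the strategies that depend
/// only on word position. Returns `None` for strategies not modeled here.
pub fn relaxed_queries<'a>(
    words: &[&'a str],
    strategy: TermsMatchingStrategy,
) -> Option<Vec<Vec<&'a str>>> {
    // Every modeled strategy starts with the full query.
    let mut attempts = vec![words.to_vec()];
    match strategy {
        // All words are mandatory: nothing is ever dropped.
        TermsMatchingStrategy::All => {}
        // Drop words from the end, one at a time, keeping at least one word.
        TermsMatchingStrategy::Last => {
            for end in (1..words.len()).rev() {
                attempts.push(words[..end].to_vec());
            }
        }
        // Drop words from the start, one at a time, keeping at least one word.
        TermsMatchingStrategy::First => {
            for start in 1..words.len() {
                attempts.push(words[start..].to_vec());
            }
        }
        // Frequency, Size and Any need index statistics or different
        // semantics, so this sketch does not attempt them.
        _ => return None,
    }
    Some(attempts)
}
```

Under these assumptions, `Last` on the words `the quick fox` would try `the quick fox`, then `the quick`, then `the`, while `All` would only ever try the full query.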

## Related

spec: https://github.com/meilisearch/specifications/pull/173
prototype discussion: https://github.com/meilisearch/meilisearch/discussions/2639#discussioncomment-3447699


Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-08-22 15:51:37 +00:00
ManyTheFish
5391e3842c replace optional_words by term_matching_strategy 2022-08-22 17:47:19 +02:00
ManyTheFish
f9029727e0 Fix benchmarks 2022-08-22 14:55:53 +02:00
Irevoire
e7624abe63
share heed between all sub-crates 2022-08-19 11:23:41 +02:00
bors[bot]
60a7221827
Merge #609
609: Retry downloading the benchmarks datasets r=Kerollmops a=irevoire

Downloading the benchmarks datasets is failing [more and more](https://github.com/meilisearch/milli/pull/607#pullrequestreview-1076023074) often; thus, instead of fixing the issue, I thought we could retry multiple times.
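
The retry idea can be sketched as a small generic helper. This is an illustration only, assuming a fixed delay between attempts; the `retry` function, its parameters, and the fixed-delay policy are hypothetical and do not reflect how the PR itself implements the retries.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: runs `op` until it succeeds or `max_attempts` is
/// reached, sleeping `delay` between failed attempts. `op` receives the
/// 1-based attempt number and always runs at least once.
pub fn retry<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            // Otherwise wait a little and try again.
            Err(_) => {
                sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

A download step would pass a closure that performs the fetch and returns its `Result`; the helper itself is independent of how the dataset is retrieved.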


Co-authored-by: Irevoire <tamo@meilisearch.com>
2022-08-18 11:47:09 +00:00
Irevoire
84a784834e
retry downloading the benchmarks datasets 2022-08-17 19:25:05 +02:00
Irevoire
4aae07d5f5
expose the size methods 2022-08-17 17:07:38 +02:00
Loïc Lecrenier
5d59bfde8a Sort Cargo.toml dependencies 2022-08-17 11:46:56 +02:00
Loïc Lecrenier
fb2b6c0c28 Use mimalloc for benchmarks on all platforms 2022-08-10 16:56:42 +02:00
Loïc Lecrenier
8f73251012 Use mimalloc for benchmarks on macOS 2022-08-10 13:30:56 +02:00
Clémentine Urquizar
d5e9b7305b
Update version for next release (v0.32.0) 2022-07-21 13:20:02 +04:00
bors[bot]
941af58239
Merge #561
561: Enriched documents batch reader r=curquiza a=Kerollmops

~This PR is based on #555 and must be rebased on main after it has been merged to ease the review.~
This PR contains the work in #555 and can be merged on main as soon as reviewed and approved.

- [x] Create an `EnrichedDocumentsBatchReader` that contains the external documents id.
- [x] Extract the primary key name and make it accessible in the `EnrichedDocumentsBatchReader`.
- [x] Use the external id from the `EnrichedDocumentsBatchReader` in the `Transform::read_documents`.
- [x] Remove the `update_primary_key` from the _transform.rs_ file.
- [x] Really generate the auto-generated documents ids.
- [x] Insert the (auto-generated) document ids in the document while processing it in `Transform::read_documents`.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-07-21 07:08:50 +00:00
Loïc Lecrenier
8270e2b768 Fix name of "release_date" facet in movies benchmarks 2022-07-18 10:34:12 +02:00
Kerollmops
448114cc1c
Fix the benchmarks with the new indexation API 2022-07-12 15:22:09 +02:00
Kerollmops
a892a4a79c
Introduce a function to extend from a JSON array of objects 2022-07-12 15:14:06 +02:00
Kerollmops
ea852200bb
Fix the format used for a geo deleting benchmark 2022-07-12 14:55:52 +02:00
Kerollmops
399eec5c01
Fix the indexation tests 2022-07-12 14:55:51 +02:00
Kerollmops
fcfc4caf8c
Move the Object type in the lib.rs file and use it everywhere 2022-07-12 14:55:51 +02:00
Kerollmops
a97d4d63b9
Fix the benchmarks 2022-07-12 14:55:50 +02:00
Loïc Lecrenier
aae03356cb Use BufReader to read datasets in benchmarks 2022-07-06 18:20:15 +02:00