181 Commits

Author SHA1 Message Date
ManyTheFish
28d6a4466d Make the tokenizer creating a char map during highlighting 2023-02-22 17:43:10 +01:00
ManyTheFish
ad35edfa32 Add test 2023-02-22 15:47:15 +01:00
bors[bot]
39407885c2
Merge #3347
3347: Enhance language detection r=irevoire a=ManyTheFish

## Summary

Some completely unrelated Languages can share the same characters, in Meilisearch we detect the Languages using `whatlang`, which works well on large texts but fails on small search queries leading to a bad segmentation and normalization of the query.

This PR now stores the Languages detected during the indexing in order to reduce the Languages list that can be detected during the search.

## Detail

- Create a 19th database mapping the scripts and the Languages detected with the documents where the Language is detected
- Fill the newly created database during indexing
- Create an allow-list with this database and pass it to Charabia
- Add a test ensuring that a Japanese request containing kanjis only is detected as Japanese and not Chinese

## Related issues
Fixes #2403
Fixes #3513

Co-authored-by: f3r10 <frledesma@outlook.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2023-02-21 10:52:13 +00:00
bors[bot]
a3e41ba33e
Merge #3496
3496: Fix metrics feature r=irevoire a=james-2001

# Pull Request

## Related issue

Resolves: #3469
See also: #2763

## What does this PR do?
As reported the metrics feature was broken by still using and old reference to `meilisearch_auth::actions`. This commit switches to the new location, `meilisearch_types::keys::actions`.

The original issue was not *that* clear as to exactly what was broken, and the build logs have disappeared, but it seemed to just be this one line fix. If this is not the case and I've missed the mark let me know, and i'll head back to the drawing board.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Co-authored-by: James <james.a.may.2001@gmail.com>
2023-02-21 10:13:11 +00:00
James
ce807d760b Fix formatting issue on Opt struct
tab in enable_metrics_route to fix cargo fmt issues

Resolves: #3469
See also: #2763
2023-02-21 09:45:18 +00:00
James
5cff435bf6 Add feature flags to Opt structure
Resolves: #3469
See also: #2763
2023-02-21 07:41:41 +00:00
ManyTheFish
8aa808d51b Merge branch 'main' into enhance-language-detection 2023-02-20 18:14:34 +01:00
bors[bot]
1e9ac00800
Merge #3505
3505: Csv delimiter r=irevoire a=irevoire

Fixes https://github.com/meilisearch/meilisearch/issues/3442
Closes https://github.com/meilisearch/meilisearch/pull/2803
Specified in https://github.com/meilisearch/specifications/pull/221

This PR is a reimplementation of https://github.com/meilisearch/meilisearch/pull/2803, on the new engine. Thanks for your idea and initial PR `@MixusMinimax;` sorry I couldn’t update/merge your PR. Way too many changes happened on the engine in the meantime.

**Attention to reviewer**; I had to update deserr to implement the support of deserializing `char`s

-------

It introduces four new error messages;
- Invalid value in parameter csvDelimiter: expected a string of one character, but found an empty string
- Invalid value in parameter csvDelimiter: expected a string of one character, but found the following string of 5 characters: doggo
- csv delimiter must be an ascii character. Found: 🍰 
- The Content-Type application/json does not support the use of a csv delimiter. The csv delimiter can only be used with the Content-Type text/csv.

And one error code;
- `invalid_index_csv_delimiter`

The `invalid_content_type` error code is now also used when we encounter the `csvDelimiter` query parameter with a non-csv content type.

Co-authored-by: Tamo <tamo@meilisearch.com>
2023-02-20 17:01:36 +00:00
ManyTheFish
23f4e82b53 Add test ensuring that Meilisearch works on kanji only requests 2023-02-20 15:43:29 +01:00
bors[bot]
a8f6f108e0
Merge #3515
3515: Consider null as a valid geo field r=irevoire a=irevoire

Fix #3497
Associated spec; https://github.com/meilisearch/specifications/pull/222

Co-authored-by: Tamo <tamo@meilisearch.com>
2023-02-20 14:12:55 +00:00
Tamo
1479050f7a apply review suggestions 2023-02-20 14:53:37 +01:00
Charlotte Vermandel
dd120e0e16 Bump version of mini-dashboard to v0.2.6 2023-02-20 13:45:57 +01:00
Tamo
18796d6e6a Consider null as a valid geo object 2023-02-20 13:45:51 +01:00
bors[bot]
c91bfeaf15
Merge #3467
3467: Identify builds git tagged with `prototype-...` in CLI and analytics r=curquiza a=dureuill

# Pull Request

## What does this PR do?

- Parses the last git tag to extract a prototype name if:
  - Current build uses the prototype tag (not after the tag) precisely
  - The prototype tag name respects the following conditions:
    1. starts with `prototype-`
    2. ends with a number
    3. the hyphen-separated segment right before the number is not a number (required to reject commits after the tag).
- Display the prototype name in the launch summary in the CLI
- Send the prototype name to analytics if any
- Update prototypes instructions in CONTRIBUTING.md

|`VERGEN_GIT_SEMVER_LIGHTWEIGHT` value | Prototype |
|---|---|
| `Some("prototype-geo-bounding-box-0-139-gcde89018")` | `None` (does not end with a number) |
| `Some("prototype-geo-bounding-box-0-139-89018")` | `None` (before the last segment is a number) |
| `Some("prototype-geo-bounding-box-0")` | `Some("prototype-geo-bounding-box-0")` |
| `Some("prototype-geo-bounding-box")` | `None` (does not end with a number") |
| `Some("geo-bounding-box-0")` | `None` (does not start with "prototype") |
| `None` | `None` | 

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-02-20 09:27:51 +00:00
James
91048d209d Fix metrics feature
Metrics feature was relying on old references. Refactored with inspiration from the `get_stats` method in `meilisearch/src/routes/lib.rs`. `enable_metrics_routes` added to options in `segment_analytics`.

Resolves: #3469
See also: #2763
2023-02-17 20:11:57 +00:00
Tamo
f11c7d4b62 cargo run execute meilisearch by default 2023-02-16 18:03:45 +01:00
Tamo
e79f6f87f6 make cargo fmt&clippy happy 2023-02-16 18:00:40 +01:00
Tamo
5367d8f05a add two tests on the indexing of csvs 2023-02-16 17:37:11 +01:00
Tamo
52686da028 test various error on the document ressource 2023-02-16 17:37:10 +01:00
Tamo
8c074f5028 implements the csv delimiter without tests
Co-authored-by: Maxi Barmetler <maxi.barmetler@gmail.com>
2023-02-16 17:35:36 +01:00
Louis Dureuil
54240db495
Add note in code so one does not forget next time 2023-02-16 10:53:14 +01:00
Louis Dureuil
9bd1cfb3a3
Ignore -dirty flag 2023-02-16 10:53:14 +01:00
Louis Dureuil
f46cf46b8c
Add prototype to analytics if any 2023-02-16 10:53:14 +01:00
Louis Dureuil
c3a30a5a91
If using a prototype, display its name at Meilisearch startup 2023-02-16 10:53:14 +01:00
Tamo
74d1a67a99 Use the workspace inheritance feature of rust 1.64 2023-02-15 13:51:07 +01:00
Tamo
42a3cdca66
get rids of the unwrap_any function in favor of take_cf_content 2023-02-14 20:06:31 +01:00
Tamo
a43765d454
use the pre-defined deserr extractors 2023-02-14 20:05:30 +01:00
Tamo
769576fd94
get rids of the whole error_message module since it has been integrated into the last version of deserr 2023-02-14 20:05:27 +01:00
Tamo
8fb7b1d10f
bump deserr 2023-02-14 20:04:30 +01:00
James
62358bd31c Fix metrics feature
As reported the metrics feature was broken by still using and old reference to `meilisearch_auth::actions`. This commit switches to the new location, `meilisearch_types::keys::actions`.

Resolves: #3469
See also: #2763
2023-02-14 17:29:38 +00:00
Clément Renault
47748395dc
Update an authentication comment
Co-authored-by: Many the fish <many@meilisearch.com>
2023-02-13 17:20:08 +01:00
Clément Renault
764df24b7d
Make clippy happy (again) 2023-02-09 13:21:20 +01:00
Clément Renault
4570d5bf3a
Merge remote-tracking branch 'origin/main' into temp-wildcard 2023-02-09 13:14:05 +01:00
Kerollmops
c690c4fec4
Added and modified the current API Key and Tenant Token tests 2023-02-09 11:17:30 +01:00
Kerollmops
7b4b57ecc8
Fix the current tests 2023-02-08 14:54:05 +01:00
dependabot[bot]
5f56e6dd58
Bump tokio from 1.24.1 to 1.24.2
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.1 to 1.24.2.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/commits)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-02-07 12:14:05 +00:00
bors[bot]
c88c3637b4
Merge #3461
3461: Bring v1 changes into main r=curquiza a=Kerollmops

Also bring back changes in milli (the remote repository) into main done during the pre-release

Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Philipp Ahlner <philipp@ahlner.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-02-07 11:27:27 +00:00
Tamo
7a38fe624f
throw an error if the top left corner is found below the bottom right corner 2023-02-06 17:50:47 +01:00
Kerollmops
a015e232ab
Merge remote-tracking branch 'origin/release-v1.0.0' into bring-v1-changes 2023-02-06 16:41:10 +01:00
Guillaume Mourier
426d63b01b
Update insta test suite 2023-02-02 12:27:56 +01:00
Kerollmops
a36b1dbd70
Fix the tasks with the new patterns 2023-02-01 18:21:45 +01:00
Kerollmops
d563ed8a39
Making it work with index uid patterns 2023-02-01 17:51:30 +01:00
bors[bot]
20f8184c06
Merge #3441
3441: Fix import of dump v2 r=dureuill a=irevoire

# Pull Request
This bug was introduced because of a mistake we did earlier: We said the last version to export dump v2 was the v0.21.0 while it was the v0.22.0.
To fix the bug I updated our whole v2 reader to use the code from meilisearch v0.22.0.
Also:
- Import the bugged dump in the tests
- Test the import of this dump in the v2 reader and current reader

## Related issue
Fixes #3435


Co-authored-by: Tamo <tamo@meilisearch.com>
2023-01-31 13:23:57 +00:00
Tamo
4b7b2d6a90 fix the import of dump v2 generated by meilisearch v0.22.0 2023-01-31 13:03:28 +01:00
Louis Dureuil
924d5d4c11
clippy: remove needless lifetimes 2023-01-31 10:40:48 +01:00
Louis Dureuil
3296cf7ae6
clippy: remove needless lifetimes 2023-01-31 09:32:40 +01:00
Tamo
cac93f149e fix the tests after rebasing 2023-01-25 16:52:54 +01:00
Tamo
481df7a8b6 Update meilisearch/tests/documents/add_documents.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-01-25 16:45:11 +01:00
Tamo
8356f109c1 bump milli to fix the last test 2023-01-25 16:45:11 +01:00
Tamo
934f2b3cb5 exhaustively test all the errors that can arise from a bad geo field 2023-01-25 16:45:11 +01:00