Commit Graph

462 Commits

Author SHA1 Message Date
ManyTheFish
c30a14cb97 Add telemetry 2023-07-10 13:12:12 +02:00
Louis Dureuil
106f98aa72
Add "scoring.*" analytics to multi search route 2023-07-10 11:45:43 +02:00
Louis Dureuil
bb40ce6e35
Experimental features analytics match the spec 2023-07-10 08:57:53 +02:00
meili-bors[bot]
0c8dbf6fa6
Merge #3897
3897: Add automated tests for `/experimental-features` route r=Kerollmops a=dureuill

# Pull Request

## What does this PR do?
- Make `RuntimeTogglableFeatures` `Eq`
- Add various tests for the `/experimental-features` route
  - Integration tests for the route itself
  - Integration tests for the effect of enabling `scoreDetails` and `vectorStore` through this route.
  - Dump integration tests


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-07-06 13:37:56 +00:00
meili-bors[bot]
ff192bb480
Merge #3889
3889: Display the total number of tasks matching a filter/query r=dureuill a=Kerollmops

This PR returns a new field on the `/tasks` routes. The `total` field exposes the total number of tasks that matches the given filter/query. It is useful to display information on a user interface and can help understand when progress is made in processing tasks, i.e., the total number of tasks on `/tasks?statuses=succeeded` will increase over time.

Fixes #3888.

- [ ] Update the specs fo the `/tasks` route.

## How have I implemented it?

I found it much easier to run two times the task filtering system. Once with the original `from` and `limit` parameters and a second time without. The second call will return the total number of tasks that match the query, not only the number of tasks on the current page.

So far, in terms of performance, there doesn't seem to be any issue. I tried different filters with something like 250k tasks. Note that there is a limit of 1M tasks in the queue.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-07-06 10:23:09 +00:00
Clément Renault
22762808ab
Fix the tests 2023-07-06 12:13:29 +02:00
Clément Renault
86b834c9e4
Display the total number of tasks in the tasks route 2023-07-06 10:05:18 +02:00
Louis Dureuil
2d3cec11a7
Search integration test to check score details and vector store 2023-07-06 09:02:02 +02:00
Louis Dureuil
76e1ee9988
integration test on "/experimental-features" route 2023-07-06 09:01:28 +02:00
Louis Dureuil
222615d3df
Allow to get/set features in integration test server 2023-07-06 09:01:05 +02:00
Louis Dureuil
11d024c613
Authentication tests 2023-07-06 09:00:51 +02:00
meili-bors[bot]
886c8bb647
Merge #3891
3891: Fix the way we compute the 99th percentile r=dureuill a=Kerollmops

This PR fixes how we compute the 99th percentile by avoiding using float and doing the multiplication and divisions in the correct order avoiding going out of the buffer of timings. You can see the issue on [this rust playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021).

When there are a very small number of successful requests, the number is so tiny that the 99th percentile calculus sometimes gives an index out of the buffer. In this example, the `1`/`1.0` represent the number of timings you collected (one). As you can see, the float computation gives us the index `1.0`, with is out of a vector of only one value. This makes the engine generate a `null` value.

```rust
1 * 99 / 100 = 0 // with integers
0.99_f64 * (1.0 - 1.0) + 1.0 = 1.0 // with floats
```

Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-07-06 06:04:08 +00:00
Clément Renault
d727ebee05
Fix the way we compute the 99th percentile 2023-07-05 17:53:09 +02:00
Clément Renault
da39a7b29e
Return the right analytics 2023-07-05 17:27:51 +02:00
meili-bors[bot]
82650eaae1
Merge #3877
3877: update the total_received properties of multiple events r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #3814 

## What does this PR do?
-fix name of `total_received` for several events


Co-authored-by: Tamo <tamo@meilisearch.com>
2023-07-03 19:49:53 +00:00
Kerollmops
d1ff631df8
Replace the atty dependency with the is-terminal one 2023-07-03 18:51:42 +02:00
Tamo
202183adf8
update the total_received properties of multiple events 2023-07-03 15:57:09 +02:00
meili-bors[bot]
aae099e330
Merge #3851
3851: Expose lastUpdate and isIndexing in /stats endpoint r=dureuill a=gentcys

# Pull Request

## Related issue
Fixes #3843

## What does this PR do?
- expose lastUpdate in `/stats` endpoint
- expose isIndex in `stats` endpoint
- add a method `is_task_processing` in index-scheduler/src/lib.rs.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Cong Chen <cong.chen@ocrlabs.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-07-03 13:41:04 +00:00
Louis Dureuil
5387cf1718
Don't unwrap in case of error/missing last_update field 2023-07-03 15:32:11 +02:00
meili-bors[bot]
a0df4becf4
Merge #3867
3867: Add a new link to the cloud pricing page r=curquiza a=Kerollmops

This PR promotes the Cloud by adding a link to the Pricing page to the startup message!

<img width="1002" alt="Capture d’écran 2023-06-29 à 17 40 22" src="https://github.com/meilisearch/meilisearch/assets/3610253/b0528c24-fcc2-43ff-a6a1-3ed91716663b">

Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-07-03 11:25:26 +00:00
ManyTheFish
7a80c0dfb3 Fix invalid attributeToSearchOn error code to be consistent with the others search parameters error codes 2023-07-03 11:52:43 +02:00
Cong Chen
9859e65d2f fix tests 2023-07-01 09:32:50 +08:00
Cong Chen
3bdf01bc1c Fix failed test 2023-06-30 17:39:23 +08:00
Clément Renault
cab4c4d7c9
Add a UTMs to the Cloud link 2023-06-29 17:59:59 +02:00
Clément Renault
4ec08e9430
Add a new link to the cloud pricing page 2023-06-29 17:38:10 +02:00
meili-bors[bot]
661d1f90dc
Merge #3866
3866: Update charabia v0.8.0 r=dureuill a=ManyTheFish

# Pull Request

Update Charabia:
- enhance Japanese segmentation
- enhance Latin Tokenization
  - words containing `_` are now properly segmented into several words
  - brackets `{([])}` are no more considered as context separators so word separated by brackets are now considered near together for the proximity ranking rule
- fixes #3815
- fixes #3778
- fixes [product#151](https://github.com/meilisearch/product/discussions/151)

> Important note: now the float numbers are segmented around the `.` so `3.22` is segmented as [`3`, `.`, `22`] but the middle dot isn't considered as a hard separator, which means that if we search `3.22` we find documents containing `3.22`

Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-06-29 15:24:36 +00:00
ManyTheFish
a82c49ab08 Update test 2023-06-29 15:56:36 +02:00
ManyTheFish
84845de9ef Update Charabia 2023-06-29 15:56:32 +02:00
Clément Renault
09c5edf242
Cargo fmt 2023-06-29 14:37:18 +02:00
Clément Renault
9a13b72f25
Fix the tests 2023-06-29 14:33:32 +02:00
Clément Renault
1d8dfafd25
Add analytics when all facets are sorted by count and the number of modified ones 2023-06-29 14:33:31 +02:00
Kerollmops
b132e859f7
Make clippy happy 2023-06-29 14:33:31 +02:00
Kerollmops
9917bf046a
Move the sortFacetValuesBy in the faceting settings 2023-06-29 14:33:31 +02:00
Kerollmops
d9fea0143f
Make Clippy happy 2023-06-29 14:33:31 +02:00
Kerollmops
a385642ec3
Replace the BTreeMap by an IndexMap to return values in order 2023-06-29 14:33:31 +02:00
Kerollmops
34b2e98fe9
Expose a sortFacetValuesBy parameter to the user 2023-06-29 14:33:00 +02:00
meili-bors[bot]
34a07110de
Merge #3864
3864: Remove `/experimental-features` verbs that weren't in the PRD r=dureuill a=dureuill

Removes:

- POST `/experimental-features`
- DELETE `/experimental-features`

keeping only:

- PATCH `/experimental-features`
- GET `/experimental-features`

The two routes that are described in the PRD.

Following `@guimachiavelli's` [question](https://github.com/meilisearch/documentation/issues/2482#issuecomment-1611845372) about the POST route.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-06-29 09:43:14 +00:00
Clément Renault
44b5b9e1a7
Improve the documentation of the FacetSearchQuery struct 2023-06-29 10:28:23 +02:00
Louis Dureuil
68356869c0
Remove /experimental-features verbs that weren't in the PRD 2023-06-29 10:02:55 +02:00
Louis Dureuil
605c1dd54a
Fix analytics 2023-06-28 16:41:56 +02:00
Clément Renault
3e3f73ba1e
Fix the analytics 2023-06-28 15:45:09 +02:00
Louis Dureuil
82e1f59f1e
Add attributes_to_search_on 2023-06-28 15:28:24 +02:00
Clément Renault
362e9ff845
Add more tests 2023-06-28 15:28:24 +02:00
Clément Renault
32f2556d22
Move the additional_search_parameters_provided analytic inside facets 2023-06-28 15:06:09 +02:00
Kerollmops
63fd10aaa5
Fix the invalid facet name field error code 2023-06-28 15:06:09 +02:00
Kerollmops
29b40295b8
Ignore unknown facet search query parameters 2023-06-28 15:06:09 +02:00
Kerollmops
904f6574bf
Make rustfmt happy 2023-06-28 15:06:08 +02:00
Kerollmops
6fb8af423c
Rename the hits and query output into facetHits and facetQuery respectively 2023-06-28 15:06:08 +02:00
Kerollmops
cb0bb399fa
Fix the error code returned when the facetName field is missing 2023-06-28 15:06:08 +02:00
Clément Renault
87e22e436a
Fix compilation issues 2023-06-28 15:01:51 +02:00
Clément Renault
55c17aa38b
Rename the SearchForFacetValues struct 2023-06-28 15:01:50 +02:00
Clément Renault
f36de2115f
Make clippy happy 2023-06-28 15:01:50 +02:00
Clément Renault
702041b7e1
Improve the returned errors from the facet-search route 2023-06-28 15:01:48 +02:00
Clément Renault
93f30e65a9
Return the correct response JSON object from the facet-search route 2023-06-28 14:58:42 +02:00
Clément Renault
893592c5e9
Send analytics about the facet-search route 2023-06-28 14:58:42 +02:00
Clément Renault
e81809aae7
Make the search for facet work 2023-06-28 14:58:41 +02:00
Kerollmops
ce7e7f12c8
Introduce the facet search route 2023-06-28 14:58:41 +02:00
meili-bors[bot]
9deeec88e0
Merge #3861
3861: Add "meilisearch" prefix to last metrics that were missing it r=Kerollmops a=dureuill

# Pull Request

## Related issue
Related to #3790 

## What does this PR do?
- change implementation to follow the spec on metrics name
- regenerate grafana dashboard from the code

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-06-28 09:28:31 +00:00
Louis Dureuil
ea68ccd034
prefix http_* metrics by meilisearch 2023-06-28 11:21:50 +02:00
meili-bors[bot]
d4f10800f2
Merge #3834
3834: Define searchable fields at runtime r=Kerollmops a=ManyTheFish

## Summary
This feature allows the end-user to search in one or multiple attributes using the search parameter `attributesToSearchOn`:

```json
{
  "q": "Captain Marvel",
  "attributesToSearchOn": ["title"]
}
```

This feature act like a filter, forcing Meilisearch to only return the documents containing the requested words in the attributes-to-search-on. Note that, with the matching strategy `last`, Meilisearch will only ensure that the first word is in the attributes-to-search-on, but, the retrieved documents will be ordered taking into account the word contained in the attributes-to-search-on. 

## Trying the prototype

A dedicated docker image has been released for this feature:

#### last prototype version:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-1
```

#### others prototype versions:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-0
```

## Technical Detail

The attributes-to-search-on list is given to the search context, then, the search context uses the `fid_word_docids`database using only the allowed field ids instead of the global `word_docids` database. This is the same for the prefix databases.
The database cache is updated with the merged values, meaning that the union of the field-id-database values is only made if the requested key is missing from the cache.

### Relevancy limits

Almost all ranking rules behave as expected when ordering the documents.
Only `proximity` could miss-order documents if all the searched words are in the restricted attribute but a better proximity is found in an ignored attribute in a document that should be ranked lower. I put below a failing test showing it:
```rust
#[actix_rt::test]
async fn proximity_ranking_rule_order() {
    let server = Server::new().await;
    let index = index_with_documents(
        &server,
        &json!([
        {
            "title": "Captain super mega cool. A Marvel story",
            // Perfect distance between words in an ignored attribute
            "desc": "Captain Marvel",
            "id": "1",
        },
        {
            "title": "Captain America from Marvel",
            "desc": "a Shazam ersatz",
            "id": "2",
        }]),
    )
    .await;

    // Document 2 should appear before document 1.
    index
        .search(json!({"q": "Captain Marvel", "attributesToSearchOn": ["title"], "attributesToRetrieve": ["id"]}), |response, code| {
            assert_eq!(code, 200, "{}", response);
            assert_eq!(
                response["hits"],
                json!([
                    {"id": "2"},
                    {"id": "1"},
                ])
            );
        })
        .await;
}
```

Fixing this would force us to create a `fid_word_pair_proximity_docids` and a `fid_word_prefix_pair_proximity_docids` databases which may multiply the keys of `word_pair_proximity_docids` and `word_prefix_pair_proximity_docids` by the number of attributes in the searchable_attributes list. If we think we should fix this test, I'll suggest doing it in another PR.

## Related

Fixes #3772

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-06-28 08:19:23 +00:00
meili-bors[bot]
dc293911ad
Merge #3745
3745: tests: add unit test for `PayloadTooLarge` error r=curquiza a=cymruu

# Pull Request
Add a unit test for the `Payload`, which verifies that a request with a payload that is too large is rejected with the appropriate message.
This was requested in this PR https://github.com/meilisearch/meilisearch/pull/3739

## Related issue
https://github.com/meilisearch/meilisearch/pull/3739

## What does this PR do?
- Adds requested test

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Filip Bachul <filipbachul@gmail.com>
2023-06-27 14:58:23 +00:00
Louis Dureuil
b4b686d253
Merge all analytics events pertaining to updating the experimental features 2023-06-27 15:16:23 +02:00
Clément Renault
e69be93e42
Log warn about using both q and vector field parameters 2023-06-27 12:32:44 +02:00
Clément Renault
b2b413db12
Return all the _semanticScore values in the documents 2023-06-27 12:32:43 +02:00
Clément Renault
f3e4d70638
Send analytics about the query vector length 2023-06-27 12:32:43 +02:00
Kerollmops
eecf20f109
Introduce a new invalid_vector_store 2023-06-27 12:32:42 +02:00
Louis Dureuil
864ad2a23c
Check that vector store feature is enabled 2023-06-27 12:32:42 +02:00
Kerollmops
66fb5c150c
Rename _semanticSimilarity into _semanticScore 2023-06-27 12:32:42 +02:00
Kerollmops
7aa1275337
Display the _semanticSimilarity even if the _vectors field is not displayed 2023-06-27 12:32:41 +02:00
Kerollmops
737aec1705
Expose an _semanticSimilarity as a dot product in the documents 2023-06-27 12:32:41 +02:00
Kerollmops
1b2923f7c0
Return the vector in the output of the search routes 2023-06-27 12:32:40 +02:00
Clément Renault
642b0f3a1b
Expose a new vector field on the search route 2023-06-27 12:32:38 +02:00
Clément Renault
cad90e8cbc
Add a vector field to the search routes 2023-06-27 12:32:38 +02:00
Louis Dureuil
13e9b4c2e5
Add dump support 2023-06-26 16:29:43 +02:00
Louis Dureuil
5a83cecb0f
fix tests 2023-06-26 16:29:43 +02:00
Louis Dureuil
cca6e47ec1
Errors when GETting metrics without the feature gate 2023-06-26 16:29:43 +02:00
Louis Dureuil
6196a53668
Gate score_details behind a runtime experimental feature flag 2023-06-26 16:29:43 +02:00
Louis Dureuil
bb6448dc2e
Compute instance features from CLI options 2023-06-26 16:29:43 +02:00
Louis Dureuil
eef9293630
New route to set some experimental features 2023-06-26 16:29:43 +02:00
ManyTheFish
9d2a12821d Use insta snapshot 2023-06-26 14:56:19 +02:00
ManyTheFish
59f64a5256 Return an error when an attribute is not searchable 2023-06-26 14:56:19 +02:00
ManyTheFish
dc391deca0 Reverse assert comparison to have a consistent error message 2023-06-26 14:55:57 +02:00
ManyTheFish
114f878205 Rename restrictSearchableAttributes into attributesToSearchOn 2023-06-26 14:55:57 +02:00
ManyTheFish
993b0d012c Remove proximity_ranking_rule_order test, fixing this test would force us to create a fid_word_pair_proximity_docids and a fid_word_prefix_pair_proximity_docids databases which may multiply the keys of word_pair_proximity_docids and word_prefix_pair_proximity_docids by the number of attributes in the searchable_attributes list 2023-06-26 14:55:57 +02:00
ManyTheFish
a61ca4066e Add tests 2023-06-26 14:55:14 +02:00
ManyTheFish
461b5118bd Add API search setting 2023-06-26 14:55:14 +02:00
Cong Chen
6d4981ec25 Expose lastUpdate and isIndexing in /stats endpoint 2023-06-23 07:24:25 +08:00
Louis Dureuil
11d32ad192
Add very light analytics for scoring 2023-06-22 12:39:14 +02:00
Louis Dureuil
49c8bc4de6
Fix tests 2023-06-22 12:39:14 +02:00
Louis Dureuil
da833eb095
Expose the scores and detailed scores in the API 2023-06-22 12:39:14 +02:00
Filip Bachul
9015a8e8d9 Merge branch 'main' into cymruu/payload-unit-test 2023-06-21 09:26:50 +02:00
meili-bors[bot]
c1e3cc04b0
Merge #3811
3811: Bring back changes from `release-v1.2.0` to `main` r=Kerollmops a=curquiza



Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Filip Bachul <filipbachul@gmail.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-06-06 13:10:24 +00:00
Tamo
2acc3ec5ee
fix the type of the document deletion by filter tasks 2023-05-30 15:18:52 +02:00
Tamo
c9b65677bf
return the on disk size actually used by meilisearch 2023-05-25 18:30:30 +02:00
Tamo
35d5556f1f
prefix all the metrics by meilisearch_ 2023-05-25 17:41:53 +02:00
Tamo
c433bdd1cd add a view for the task queue in the metrics 2023-05-25 12:58:13 +02:00
Tamo
1b601f70c6 increase the bucketing of requests 2023-05-25 11:08:16 +02:00
Tamo
9111f5176f get rid of the invalid document delete filter in favor of the invalid document filter 2023-05-24 11:53:16 +02:00
Tamo
b9dd092a62 make the details return null in the originalFilter field if no filter was provided + add a big test on the details 2023-05-24 11:48:22 +02:00
Tamo
ca99bc3188 implement the missing document filter error code when deleting documents 2023-05-24 11:29:20 +02:00