8928 Commits

Author SHA1 Message Date
Kerollmops
bd3c026406
First to-test version of the algorithm 2023-06-29 14:31:17 +02:00
Kerollmops
84f8938f33
Rename facet distribution to be explicit on the order to find them 2023-06-29 14:31:15 +02:00
meili-bors[bot]
34a07110de
Merge #3864
3864: Remove `/experimental-features` verbs that weren't in the PRD r=dureuill a=dureuill

Removes:

- POST `/experimental-features`
- DELETE `/experimental-features`

keeping only:

- PATCH `/experimental-features`
- GET `/experimental-features`

The two routes that are described in the PRD.

Following `@guimachiavelli's` [question](https://github.com/meilisearch/documentation/issues/2482#issuecomment-1611845372) about the POST route.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-06-29 09:43:14 +00:00
meili-bors[bot]
73bb080a26
Merge #3699
3699: Search for Facet Values r=Kerollmops a=Kerollmops

This PR introduces the first version of [the _Search for Facet Values_ feature](https://github.com/meilisearch/product/discussions/515) that allows a user to search for facets, by optionally using a prefix string and optionally specifying the `q` and `filter` original search parameters to restrict the candidates to search in.

The steps to merge it into Meilisearch will first start by providing prototype Docker images. This way users will be able to test the prototypes before using them.

The current route to use the _Search for Facet Values_ feature is the `POST /indexes/{index}/facet-search` where the body is a JSON object that looks like the following:
```json5
{
  "q": "spiderman", // optional
  "filter": "rating > 10", // optional
  "facetName": "genres",
  "facetQuery": "a" // optional
}
```

## What is missing?

 - [x] Send some analytics.
 - [x] Support the `matchingStrategy` parameter.
 - [x] Make sure that the errors are the right ones.
 - [x] Use the [Index typo tolerance settings](https://www.meilisearch.com/docs/learn/configuration/typo_tolerance#minwordsizefortypos) when matching facet values.
    - [x] minWordSizeForTypos.oneTypo
    - [x] minWordSizeForTypos.twoTypo
 - [x] Add tests
 - [x] Log the time it took to compute the results.
 - [x] Fix the compilation warnings.
 - [x] [Create an issue to fix potential performance issues when indexing](https://github.com/meilisearch/meilisearch/issues/3862).


Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-06-29 09:08:55 +00:00
Clément Renault
44b5b9e1a7
Improve the documentation of the FacetSearchQuery struct 2023-06-29 10:28:23 +02:00
Louis Dureuil
68356869c0
Remove /experimental-features verbs that weren't in the PRD 2023-06-29 10:02:55 +02:00
Cong Chen
e3fc7112bc use RoaringBitmap::is_empty instead 2023-06-29 11:46:47 +08:00
Louis Dureuil
605c1dd54a
Fix analytics 2023-06-28 16:41:56 +02:00
Clément Renault
3e3f73ba1e
Fix the analytics 2023-06-28 15:45:09 +02:00
Clément Renault
efbe7ce78b
Clean the facet string FSTs when we clear the documents 2023-06-28 15:36:32 +02:00
Louis Dureuil
82e1f59f1e
Add attributes_to_search_on 2023-06-28 15:28:24 +02:00
Clément Renault
362e9ff845
Add more tests 2023-06-28 15:28:24 +02:00
Clément Renault
32f2556d22
Move the additional_search_parameters_provided analytic inside facets 2023-06-28 15:06:09 +02:00
Kerollmops
63fd10aaa5
Fix the invalid facet name field error code 2023-06-28 15:06:09 +02:00
Kerollmops
29b40295b8
Ignore unknown facet search query parameters 2023-06-28 15:06:09 +02:00
Kerollmops
26f0fa678d
Change the error message when a facet is not searchable 2023-06-28 15:06:09 +02:00
Kerollmops
60ddd53439
Return one of the original facet values when doing a facet search 2023-06-28 15:06:09 +02:00
Kerollmops
2bcd8d2983
Make sure the facet queries are normalized 2023-06-28 15:06:09 +02:00
Kerollmops
09079a4e88
Remove useless InvalidSearchFacet error 2023-06-28 15:06:09 +02:00
Kerollmops
904f6574bf
Make rustfmt happy 2023-06-28 15:06:08 +02:00
Kerollmops
6fb8af423c
Rename the hits and query output into facetHits and facetQuery respectively 2023-06-28 15:06:08 +02:00
Kerollmops
cb0bb399fa
Fix the error code returned when the facetName field is missing 2023-06-28 15:06:08 +02:00
Kerollmops
41760a9306
Introduce a new invalid_facet_search_facet_name error code 2023-06-28 15:06:07 +02:00
Kerollmops
e9a3029c30
Use the right field id to write the string facet values FST 2023-06-28 15:01:51 +02:00
Kerollmops
ed0ff47551
Return an empty list of results if attribute is set as filterable 2023-06-28 15:01:51 +02:00
Clément Renault
e1b8fb48ee
Use the minWordSizeForTypos index settings 2023-06-28 15:01:51 +02:00
Clément Renault
87e22e436a
Fix compilation issues 2023-06-28 15:01:51 +02:00
Clément Renault
0252cfe8b6
Simplify the placeholder search of the facet-search route 2023-06-28 15:01:50 +02:00
Clément Renault
f35ad96afa
Use the disableOnAttributes parameter on the facet-search route 2023-06-28 15:01:50 +02:00
Clément Renault
2ceb781c73
Use the disableOnWords parameter on the facet-search route 2023-06-28 15:01:50 +02:00
Clément Renault
7bd67543dd
Support the typoTolerant.enabled parameter 2023-06-28 15:01:50 +02:00
Clément Renault
8e86eb91bb
Log an error when a facet value is missing from the database 2023-06-28 15:01:50 +02:00
Clément Renault
55c17aa38b
Rename the SearchForFacetValues struct 2023-06-28 15:01:50 +02:00
Clément Renault
aadbe88048
Return an internal error when a field id is missing 2023-06-28 15:01:50 +02:00
Clément Renault
f36de2115f
Make clippy happy 2023-06-28 15:01:50 +02:00
Clément Renault
702041b7e1
Improve the returned errors from the facet-search route 2023-06-28 15:01:48 +02:00
Clément Renault
a05074e675
Fix the max number of facets to be returned to 100 2023-06-28 14:58:42 +02:00
Clément Renault
93f30e65a9
Return the correct response JSON object from the facet-search route 2023-06-28 14:58:42 +02:00
Clément Renault
893592c5e9
Send analytics about the facet-search route 2023-06-28 14:58:42 +02:00
Clément Renault
e81809aae7
Make the search for facet work 2023-06-28 14:58:41 +02:00
Kerollmops
ce7e7f12c8
Introduce the facet search route 2023-06-28 14:58:41 +02:00
Kerollmops
addb21f110
Restrict the number of facet search results to 1000 2023-06-28 14:58:41 +02:00
Kerollmops
c34de05106
Introduce the SearchForFacetValue struct 2023-06-28 14:58:41 +02:00
Clément Renault
15a4c05379
Store the facet string values in multiple FSTs 2023-06-28 14:58:41 +02:00
meili-bors[bot]
9deeec88e0
Merge #3861
3861: Add "meilisearch" prefix to last metrics that were missing it r=Kerollmops a=dureuill

# Pull Request

## Related issue
Related to #3790 

## What does this PR do?
- change implementation to follow the spec on metrics name
- regenerate grafana dashboard from the code

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-06-28 09:28:31 +00:00
Louis Dureuil
167ac55a2d
Update dashboard generated from grafana 2023-06-28 11:22:16 +02:00
Louis Dureuil
ea68ccd034
prefix http_* metrics by meilisearch 2023-06-28 11:21:50 +02:00
meili-bors[bot]
d4f10800f2
Merge #3834
3834: Define searchable fields at runtime r=Kerollmops a=ManyTheFish

## Summary
This feature allows the end-user to search in one or multiple attributes using the search parameter `attributesToSearchOn`:

```json
{
  "q": "Captain Marvel",
  "attributesToSearchOn": ["title"]
}
```

This feature act like a filter, forcing Meilisearch to only return the documents containing the requested words in the attributes-to-search-on. Note that, with the matching strategy `last`, Meilisearch will only ensure that the first word is in the attributes-to-search-on, but, the retrieved documents will be ordered taking into account the word contained in the attributes-to-search-on. 

## Trying the prototype

A dedicated docker image has been released for this feature:

#### last prototype version:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-1
```

#### others prototype versions:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-0
```

## Technical Detail

The attributes-to-search-on list is given to the search context, then, the search context uses the `fid_word_docids`database using only the allowed field ids instead of the global `word_docids` database. This is the same for the prefix databases.
The database cache is updated with the merged values, meaning that the union of the field-id-database values is only made if the requested key is missing from the cache.

### Relevancy limits

Almost all ranking rules behave as expected when ordering the documents.
Only `proximity` could miss-order documents if all the searched words are in the restricted attribute but a better proximity is found in an ignored attribute in a document that should be ranked lower. I put below a failing test showing it:
```rust
#[actix_rt::test]
async fn proximity_ranking_rule_order() {
    let server = Server::new().await;
    let index = index_with_documents(
        &server,
        &json!([
        {
            "title": "Captain super mega cool. A Marvel story",
            // Perfect distance between words in an ignored attribute
            "desc": "Captain Marvel",
            "id": "1",
        },
        {
            "title": "Captain America from Marvel",
            "desc": "a Shazam ersatz",
            "id": "2",
        }]),
    )
    .await;

    // Document 2 should appear before document 1.
    index
        .search(json!({"q": "Captain Marvel", "attributesToSearchOn": ["title"], "attributesToRetrieve": ["id"]}), |response, code| {
            assert_eq!(code, 200, "{}", response);
            assert_eq!(
                response["hits"],
                json!([
                    {"id": "2"},
                    {"id": "1"},
                ])
            );
        })
        .await;
}
```

Fixing this would force us to create a `fid_word_pair_proximity_docids` and a `fid_word_prefix_pair_proximity_docids` databases which may multiply the keys of `word_pair_proximity_docids` and `word_prefix_pair_proximity_docids` by the number of attributes in the searchable_attributes list. If we think we should fix this test, I'll suggest doing it in another PR.

## Related

Fixes #3772

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-06-28 08:19:23 +00:00
meili-bors[bot]
dc293911ad
Merge #3745
3745: tests: add unit test for `PayloadTooLarge` error r=curquiza a=cymruu

# Pull Request
Add a unit test for the `Payload`, which verifies that a request with a payload that is too large is rejected with the appropriate message.
This was requested in this PR https://github.com/meilisearch/meilisearch/pull/3739

## Related issue
https://github.com/meilisearch/meilisearch/pull/3739

## What does this PR do?
- Adds requested test

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Filip Bachul <filipbachul@gmail.com>
2023-06-27 14:58:23 +00:00
meili-bors[bot]
9d68e6969e
Merge #3859
3859: Merge all analytics events pertaining to updating the experimental features r=Kerollmops a=dureuill

Follow-up to #3850 

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-06-27 13:26:01 +00:00