Commit Graph

19 Commits

Author SHA1 Message Date
meili-bors[bot]
d4f10800f2
Merge #3834
3834: Define searchable fields at runtime r=Kerollmops a=ManyTheFish

## Summary
This feature allows the end-user to search in one or multiple attributes using the search parameter `attributesToSearchOn`:

```json
{
  "q": "Captain Marvel",
  "attributesToSearchOn": ["title"]
}
```

This feature acts like a filter, forcing Meilisearch to return only the documents that contain the requested words in the attributes-to-search-on. Note that with the `last` matching strategy, Meilisearch only ensures that the first word is in the attributes-to-search-on; the retrieved documents are still ordered by taking into account the words found in the attributes-to-search-on.

## Trying the prototype

A dedicated Docker image has been released for this feature:

#### latest prototype version:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-1
```

#### other prototype versions:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-0
```

## Technical Detail

The attributes-to-search-on list is given to the search context, which then reads the `fid_word_docids` database for the allowed field ids only, instead of the global `word_docids` database. The same applies to the prefix databases.
The database cache is updated with the merged values, meaning that the union of the field-id-database values is computed only if the requested key is missing from the cache.
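
To make the caching behaviour concrete, here is a minimal sketch of the idea, not the actual milli code. Everything in it is a hypothetical stand-in: `FidWordDocids` is an in-memory map playing the role of the `fid_word_docids` database, `DocIds` is a `BTreeSet<u32>` rather than the real bitmap type, and `SearchContext`, `word_docids` and `unions_computed` are names chosen for illustration only.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical stand-ins for the real types.
type FieldId = u16;
type DocIds = BTreeSet<u32>;

/// Toy replacement for the `fid_word_docids` database:
/// (field id, word) -> ids of the documents containing that word in that field.
struct FidWordDocids {
    entries: HashMap<(FieldId, String), DocIds>,
}

/// Toy search context restricted to a list of allowed field ids.
struct SearchContext<'a> {
    db: &'a FidWordDocids,
    allowed_fids: Vec<FieldId>,
    /// Cache of the merged (unioned) docids, keyed by word.
    cache: HashMap<String, DocIds>,
    /// Counts how many unions were actually computed (illustration only).
    unions_computed: usize,
}

impl<'a> SearchContext<'a> {
    fn new(db: &'a FidWordDocids, allowed_fids: Vec<FieldId>) -> Self {
        SearchContext { db, allowed_fids, cache: HashMap::new(), unions_computed: 0 }
    }

    /// Returns the docids for `word` in the allowed fields only.
    /// The union over the field ids is computed on a cache miss, then reused.
    fn word_docids(&mut self, word: &str) -> &DocIds {
        if !self.cache.contains_key(word) {
            let mut merged = DocIds::new();
            for fid in &self.allowed_fids {
                if let Some(docids) = self.db.entries.get(&(*fid, word.to_string())) {
                    merged.extend(docids.iter().copied());
                }
            }
            self.unions_computed += 1;
            self.cache.insert(word.to_string(), merged);
        }
        &self.cache[word]
    }
}
```

In this sketch, calling `word_docids` twice with the same word performs the union once; the second call is served from the cache.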

### Relevancy limits

Almost all ranking rules behave as expected when ordering the documents.
Only `proximity` could misorder documents: if all the searched words are in the restricted attribute but a better proximity is found in an ignored attribute, a document that should be ranked lower can come first. The failing test below shows it:
```rust
#[actix_rt::test]
async fn proximity_ranking_rule_order() {
    let server = Server::new().await;
    let index = index_with_documents(
        &server,
        &json!([
        {
            "title": "Captain super mega cool. A Marvel story",
            // Perfect distance between words in an ignored attribute
            "desc": "Captain Marvel",
            "id": "1",
        },
        {
            "title": "Captain America from Marvel",
            "desc": "a Shazam ersatz",
            "id": "2",
        }]),
    )
    .await;

    // Document 2 should appear before document 1.
    index
        .search(json!({"q": "Captain Marvel", "attributesToSearchOn": ["title"], "attributesToRetrieve": ["id"]}), |response, code| {
            assert_eq!(code, 200, "{}", response);
            assert_eq!(
                response["hits"],
                json!([
                    {"id": "2"},
                    {"id": "1"},
                ])
            );
        })
        .await;
}
```

Fixing this would force us to create `fid_word_pair_proximity_docids` and `fid_word_prefix_pair_proximity_docids` databases, which may multiply the number of keys in `word_pair_proximity_docids` and `word_prefix_pair_proximity_docids` by the number of attributes in the searchable_attributes list. If we think we should fix this test, I suggest doing it in another PR.
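
The misordering can be reproduced with a deliberately simplified model. The sketch below is an assumption-laden illustration, not Meilisearch's proximity computation: it measures proximity as the plain distance between whitespace-separated token positions, ignores sentence separators and any distance cap, and the helper names `min_distance` and `best_proximity` are hypothetical. It only shows that a proximity taken over all attributes can differ from one taken over the allowed attributes.

```rust
/// Smallest token-position distance between `a` and `b` in `text`,
/// or `None` if one of the words is missing. Simplified tokenization:
/// split on whitespace, trim punctuation, lowercase.
fn min_distance(text: &str, a: &str, b: &str) -> Option<usize> {
    let tokens: Vec<String> = text
        .split_whitespace()
        .map(|t| t.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase())
        .collect();
    let mut best: Option<usize> = None;
    for (i, ta) in tokens.iter().enumerate() {
        if ta.as_str() != a {
            continue;
        }
        for (j, tb) in tokens.iter().enumerate() {
            if tb.as_str() == b {
                let d = i.abs_diff(j);
                best = Some(best.map_or(d, |cur| cur.min(d)));
            }
        }
    }
    best
}

/// Best (smallest) distance over the fields of one document.
/// `allowed = None` mimics a global pair-proximity lookup that ignores the
/// attribute; `Some(names)` mimics a hypothetical per-field lookup.
fn best_proximity(
    fields: &[(&str, &str)],
    allowed: Option<&[&str]>,
    a: &str,
    b: &str,
) -> Option<usize> {
    fields
        .iter()
        .filter(|(name, _)| allowed.map_or(true, |names| names.contains(name)))
        .filter_map(|(_, text)| min_distance(text, a, b))
        .min()
}
```

Applied to the two documents of the failing test with the words `captain` and `marvel`, this model gives document 1 a smaller distance than document 2 when every attribute is considered (because of `desc`), and a larger one when only `title` is allowed, which is the order the test expects.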

## Related

Fixes #3772

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-06-28 08:19:23 +00:00
Kerollmops
eecf20f109
Introduce a new invalid_vector_store 2023-06-27 12:32:42 +02:00
Clément Renault
cad90e8cbc
Add a vector field to the search routes 2023-06-27 12:32:38 +02:00
Louis Dureuil
6196a53668
Gate score_details behind a runtime experimental feature flag 2023-06-26 16:29:43 +02:00
ManyTheFish
114f878205 Rename restrictSearchableAttributes into attributesToSearchOn 2023-06-26 14:55:57 +02:00
ManyTheFish
461b5118bd Add API search setting 2023-06-26 14:55:14 +02:00
Louis Dureuil
da833eb095
Expose the scores and detailed scores in the API 2023-06-22 12:39:14 +02:00
Louis Dureuil
a23fbf6c7b
multi-search: Add search with an array of indexes 2023-02-22 17:04:12 +01:00
Louis Dureuil
c8c5944094
Authentication: is_index_authorized takes into account API key indexes even with a tenant token 2023-02-22 16:35:52 +01:00
Tamo
a43765d454
use the pre-defined deserr extractors 2023-02-14 20:05:30 +01:00
Tamo
8fb7b1d10f
bump deserr 2023-02-14 20:04:30 +01:00
Loïc Lecrenier
e225608337 Use invalid_index_uid error code in more places 2023-01-17 15:28:06 +01:00
Loïc Lecrenier
b781f9a0f9 cargo fmt 2023-01-17 11:07:07 +01:00
Loïc Lecrenier
9194508a0f Refactor query parameter deserialisation logic 2023-01-17 11:07:07 +01:00
Loïc Lecrenier
766dd830ae Update deserr to latest version + add new error codes for missing fields
- missing_api_key_indexes
- missing_api_key_actions
- missing_api_key_expires_at

- missing_swap_indexes_indexes
2023-01-17 09:43:07 +01:00
Loïc Lecrenier
436ae4e466 Improve error messages generated by deserr
Split Json and Query Parameter error types
2023-01-17 09:43:07 +01:00
Loïc Lecrenier
1fc11264e8
Refactor deserr integration 2023-01-11 19:08:39 +01:00
Tamo
50ce0409bc
Integrate deserr on the most important routes 2023-01-05 20:48:29 +01:00
Colby Allen
ad2b1467da Renames meilisearch-http to meilisearch 2022-12-08 08:22:53 -07:00