984 Commits

Author SHA1 Message Date
Clément Renault
87e22e436a
Fix compilation issues 2023-06-28 15:01:51 +02:00
Clément Renault
0252cfe8b6
Simplify the placeholder search of the facet-search route 2023-06-28 15:01:50 +02:00
Clément Renault
f35ad96afa
Use the disableOnAttributes parameter on the facet-search route 2023-06-28 15:01:50 +02:00
Clément Renault
2ceb781c73
Use the disableOnWords parameter on the facet-search route 2023-06-28 15:01:50 +02:00
Clément Renault
7bd67543dd
Support the typoTolerant.enabled parameter 2023-06-28 15:01:50 +02:00
Clément Renault
8e86eb91bb
Log an error when a facet value is missing from the database 2023-06-28 15:01:50 +02:00
Clément Renault
55c17aa38b
Rename the SearchForFacetValues struct 2023-06-28 15:01:50 +02:00
Clément Renault
aadbe88048
Return an internal error when a field id is missing 2023-06-28 15:01:50 +02:00
Clément Renault
702041b7e1
Improve the returned errors from the facet-search route 2023-06-28 15:01:48 +02:00
Clément Renault
a05074e675
Fix the max number of facets to be returned to 100 2023-06-28 14:58:42 +02:00
Clément Renault
93f30e65a9
Return the correct response JSON object from the facet-search route 2023-06-28 14:58:42 +02:00
Clément Renault
e81809aae7
Make the search for facet work 2023-06-28 14:58:41 +02:00
Kerollmops
ce7e7f12c8
Introduce the facet search route 2023-06-28 14:58:41 +02:00
Kerollmops
addb21f110
Restrict the number of facet search results to 1000 2023-06-28 14:58:41 +02:00
Kerollmops
c34de05106
Introduce the SearchForFacetValue struct 2023-06-28 14:58:41 +02:00
meili-bors[bot]
d4f10800f2
Merge #3834
3834: Define searchable fields at runtime r=Kerollmops a=ManyTheFish

## Summary
This feature allows the end-user to search in one or multiple attributes using the search parameter `attributesToSearchOn`:

```json
{
  "q": "Captain Marvel",
  "attributesToSearchOn": ["title"]
}
```

This feature act like a filter, forcing Meilisearch to only return the documents containing the requested words in the attributes-to-search-on. Note that, with the matching strategy `last`, Meilisearch will only ensure that the first word is in the attributes-to-search-on, but, the retrieved documents will be ordered taking into account the word contained in the attributes-to-search-on. 

## Trying the prototype

A dedicated docker image has been released for this feature:

#### last prototype version:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-1
```

#### others prototype versions:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-0
```

## Technical Detail

The attributes-to-search-on list is given to the search context, then, the search context uses the `fid_word_docids`database using only the allowed field ids instead of the global `word_docids` database. This is the same for the prefix databases.
The database cache is updated with the merged values, meaning that the union of the field-id-database values is only made if the requested key is missing from the cache.

### Relevancy limits

Almost all ranking rules behave as expected when ordering the documents.
Only `proximity` could miss-order documents if all the searched words are in the restricted attribute but a better proximity is found in an ignored attribute in a document that should be ranked lower. I put below a failing test showing it:
```rust
#[actix_rt::test]
async fn proximity_ranking_rule_order() {
    let server = Server::new().await;
    let index = index_with_documents(
        &server,
        &json!([
        {
            "title": "Captain super mega cool. A Marvel story",
            // Perfect distance between words in an ignored attribute
            "desc": "Captain Marvel",
            "id": "1",
        },
        {
            "title": "Captain America from Marvel",
            "desc": "a Shazam ersatz",
            "id": "2",
        }]),
    )
    .await;

    // Document 2 should appear before document 1.
    index
        .search(json!({"q": "Captain Marvel", "attributesToSearchOn": ["title"], "attributesToRetrieve": ["id"]}), |response, code| {
            assert_eq!(code, 200, "{}", response);
            assert_eq!(
                response["hits"],
                json!([
                    {"id": "2"},
                    {"id": "1"},
                ])
            );
        })
        .await;
}
```

Fixing this would force us to create a `fid_word_pair_proximity_docids` and a `fid_word_prefix_pair_proximity_docids` databases which may multiply the keys of `word_pair_proximity_docids` and `word_prefix_pair_proximity_docids` by the number of attributes in the searchable_attributes list. If we think we should fix this test, I'll suggest doing it in another PR.

## Related

Fixes #3772

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-06-28 08:19:23 +00:00
Clément Renault
29d8268c94
Fix the vector query part by using the correct universe 2023-06-27 12:32:43 +02:00
Kerollmops
ab9f2269aa
Normalize the vectors during indexation and search 2023-06-27 12:32:41 +02:00
Kerollmops
3b560ef7d0
Make clippy happy 2023-06-27 12:32:40 +02:00
Kerollmops
3c31e1cdd1
Support more pages but in an ugly way 2023-06-27 12:32:39 +02:00
Kerollmops
c79e82c62a
Move back to the hnsw crate
This reverts commit 7a4b6c065482f988b01298642f4c18775503f92f.
2023-06-27 12:32:39 +02:00
Kerollmops
268a9ef416
Move to the hgg crate 2023-06-27 12:32:38 +02:00
Clément Renault
642b0f3a1b
Expose a new vector field on the search route 2023-06-27 12:32:38 +02:00
ManyTheFish
63ca25290b Take into account small Review requests 2023-06-26 14:56:19 +02:00
ManyTheFish
59f64a5256 Return an error when an attribute is not searchable 2023-06-26 14:56:19 +02:00
ManyTheFish
42709ea9a5 Fix clippy warnings 2023-06-26 14:55:57 +02:00
ManyTheFish
fb8fa07169 Restrict field ids in search context 2023-06-26 14:55:57 +02:00
ManyTheFish
0ccf1e2e40 Allow the search cache to store owned values 2023-06-26 14:55:57 +02:00
ManyTheFish
461b5118bd Add API search setting 2023-06-26 14:55:14 +02:00
Tamo
a3716c5678 add the new parameter to the search builder of milli 2023-06-26 14:55:14 +02:00
Louis Dureuil
d26e9a96ec
Add score details to new search tests 2023-06-22 12:39:14 +02:00
Louis Dureuil
49c8bc4de6
Fix tests 2023-06-22 12:39:14 +02:00
Louis Dureuil
da833eb095
Expose the scores and detailed scores in the API 2023-06-22 12:39:14 +02:00
Louis Dureuil
701d44bd91
Store the scores for each bucket
Remove optimization where ranking rules are not executed on buckets of a single document
when the score needs to be computed
2023-06-22 12:39:14 +02:00
Louis Dureuil
c621a250a7
Score for graph based ranking rules
Count phrases in matchingWords and maxMatchingWords
2023-06-22 12:39:14 +02:00
Louis Dureuil
8939e85f60
Add rank_to_score for graph based ranking rules 2023-06-22 12:39:14 +02:00
Louis Dureuil
fa41d2489e
Score for sort 2023-06-22 12:39:14 +02:00
Louis Dureuil
59c5b992c2
Score for geosort 2023-06-22 12:39:14 +02:00
Louis Dureuil
2ea8194c18
Score for exact_attributes 2023-06-22 12:39:14 +02:00
Louis Dureuil
421df64602
RankingRuleOutput now contains a Score 2023-06-22 12:39:14 +02:00
Louis Dureuil
f050634b1e
add virtual conditions to fid and position to always have the max cost 2023-06-20 10:07:18 +02:00
Louis Dureuil
becf1f066a
Change how the cost of removing words is computed 2023-06-20 09:45:43 +02:00
Louis Dureuil
701d299369
Remove out-of-date comment 2023-06-20 09:45:42 +02:00
Louis Dureuil
a20e4d447c
Position now takes into account the distance to the position of the word in the query
it used to be based on the distance to the position 0
2023-06-20 09:45:42 +02:00
Louis Dureuil
af57c3c577
Proximity costs 0 for documents that are perfectly matching 2023-06-20 09:45:42 +02:00
Louis Dureuil
0c40ef6911
Fix sort id 2023-06-20 09:45:42 +02:00
Loïc Lecrenier
2da86b31a6 Remove comments and add documentation 2023-06-14 12:39:42 +02:00
Louis Dureuil
a2a3b8c973
Fix offset difference between query and indexing for hard separators 2023-06-08 12:07:12 +02:00
Louis Dureuil
1dfc4038ab
Add test that fails before PR and passes now 2023-05-29 11:58:26 +02:00
Louis Dureuil
73198179f1
Consistently use wrapping add to avoid overflow in debug when query starts with a separator 2023-05-29 11:54:12 +02:00