Commit Graph

107 Commits

Author SHA1 Message Date
Louis Dureuil
24240934f9
Improve errors when indexing documents with a user provided embedder 2024-07-16 13:39:01 +02:00
Clément Renault
aace587dd1
Create errors for the internal processing ones 2024-07-10 16:29:18 +02:00
Tamo
1daaed163a Make _vectors.:embedding.regenerate mandatory + tests + error messages 2024-06-27 11:04:58 +02:00
Tamo
1693332cab Update arroy and always build the tree that need to be built 2024-06-24 10:14:03 +02:00
Clément Renault
ee39309aae
Improve errors and introduce a new InvalidSearchDistinct error code 2024-06-11 16:03:39 -04:00
Louis Dureuil
8f7c8ca7f0
Remove now unused error variant 2024-05-22 12:23:43 +02:00
Louis Dureuil
52d9cb6e5a
Refactor vector indexing
- use the parsed_vectors module
- only parse `_vectors` once per document, instead of once per embedder per document
2024-05-20 10:36:17 +02:00
Tamo
897d25780e update milli to latest version 2024-05-16 18:31:32 +02:00
Tamo
7ec4e2a3fb apply all style review comments 2024-05-15 15:02:26 +02:00
Clément Renault
d4aeff92d0
Introduce the ThreadPoolNoAbort wrapper 2024-04-24 16:40:12 +02:00
Clément Renault
b3173d0423
Remove useless dots in the error messages 2024-04-22 18:09:33 +02:00
Clément Renault
96cc5319c8
Introduce a new internal error type to categorize panics 2024-04-22 18:09:33 +02:00
Louis Dureuil
3c6e9851a4
Correct error formatting 2024-04-04 15:58:19 +02:00
Louis Dureuil
dfa5e41ea6
Check validity of the URL setting 2024-03-25 11:23:16 +01:00
Louis Dureuil
88d03c56ab
Don't accept dimensions of 0 (ever) or dimensions greater than the default dimensions of the model 2024-02-07 11:52:09 +01:00
Louis Dureuil
517f5332d6
Allow actually passing dimensions for OpenAI source
-> make sure the settings change is rejected or the settings task fails when the specified model doesn't support
overriding `dimensions` and the passed `dimensions` differs from the model's default dimensions.
2024-02-07 11:51:44 +01:00
Clément Renault
01e2c3d6bb
Bump arroy to v0.2.0 2024-01-16 16:45:55 +01:00
Clément Renault
9f9ad4cc05
Fix Clippy warnings 2024-01-16 15:27:24 +01:00
Louis Dureuil
14b396d302
Add new errors 2023-12-20 17:16:45 +01:00
Louis Dureuil
12940d79a9
WIP
- manual embedder
- multi embedders OK
- clippy + tests OK
2023-12-14 16:08:41 +01:00
Louis Dureuil
922a640188
WIP multi embedders
fixed template bugs
2023-12-14 16:08:41 +01:00
Louis Dureuil
cb4ebe163e
WIP 2023-12-14 16:07:49 +01:00
Louis Dureuil
13c2c6c16b
Small commit to add hybrid search and autoembedding 2023-12-14 16:07:48 +01:00
Clément Renault
0d4482625a
Make the changes to use heed v0.20-alpha.6 2023-11-23 11:43:58 +01:00
Louis Dureuil
113527f466
Remove soft-deleted related methods from Index 2023-10-30 11:41:22 +01:00
Vivek Kumar
dd57873f8e
hide fields not in the displayedAttributes list from errors 2023-08-05 16:03:10 +05:30
Kerollmops
26f0fa678d
Change the error message when a facet is not searchable 2023-06-28 15:06:09 +02:00
Kerollmops
41760a9306
Introduce a new invalid_facet_search_facet_name error code 2023-06-28 15:06:07 +02:00
Clément Renault
702041b7e1
Improve the returned errors from the facet-search route 2023-06-28 15:01:48 +02:00
meili-bors[bot]
d4f10800f2
Merge #3834
3834: Define searchable fields at runtime r=Kerollmops a=ManyTheFish

## Summary
This feature allows the end-user to search in one or multiple attributes using the search parameter `attributesToSearchOn`:

```json
{
  "q": "Captain Marvel",
  "attributesToSearchOn": ["title"]
}
```

This feature act like a filter, forcing Meilisearch to only return the documents containing the requested words in the attributes-to-search-on. Note that, with the matching strategy `last`, Meilisearch will only ensure that the first word is in the attributes-to-search-on, but, the retrieved documents will be ordered taking into account the word contained in the attributes-to-search-on. 

## Trying the prototype

A dedicated docker image has been released for this feature:

#### last prototype version:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-1
```

#### others prototype versions:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-0
```

## Technical Detail

The attributes-to-search-on list is given to the search context, then, the search context uses the `fid_word_docids`database using only the allowed field ids instead of the global `word_docids` database. This is the same for the prefix databases.
The database cache is updated with the merged values, meaning that the union of the field-id-database values is only made if the requested key is missing from the cache.

### Relevancy limits

Almost all ranking rules behave as expected when ordering the documents.
Only `proximity` could miss-order documents if all the searched words are in the restricted attribute but a better proximity is found in an ignored attribute in a document that should be ranked lower. I put below a failing test showing it:
```rust
#[actix_rt::test]
async fn proximity_ranking_rule_order() {
    let server = Server::new().await;
    let index = index_with_documents(
        &server,
        &json!([
        {
            "title": "Captain super mega cool. A Marvel story",
            // Perfect distance between words in an ignored attribute
            "desc": "Captain Marvel",
            "id": "1",
        },
        {
            "title": "Captain America from Marvel",
            "desc": "a Shazam ersatz",
            "id": "2",
        }]),
    )
    .await;

    // Document 2 should appear before document 1.
    index
        .search(json!({"q": "Captain Marvel", "attributesToSearchOn": ["title"], "attributesToRetrieve": ["id"]}), |response, code| {
            assert_eq!(code, 200, "{}", response);
            assert_eq!(
                response["hits"],
                json!([
                    {"id": "2"},
                    {"id": "1"},
                ])
            );
        })
        .await;
}
```

Fixing this would force us to create a `fid_word_pair_proximity_docids` and a `fid_word_prefix_pair_proximity_docids` databases which may multiply the keys of `word_pair_proximity_docids` and `word_prefix_pair_proximity_docids` by the number of attributes in the searchable_attributes list. If we think we should fix this test, I'll suggest doing it in another PR.

## Related

Fixes #3772

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-06-28 08:19:23 +00:00
Kerollmops
531748c536
Return a user error when the _vectors type is invalid 2023-06-27 12:32:41 +02:00
Kerollmops
a7e0f0de89
Introduce a new error message for invalid vector dimensions 2023-06-27 12:32:40 +02:00
ManyTheFish
59f64a5256 Return an error when an attribute is not searchable 2023-06-26 14:56:19 +02:00
ManyTheFish
42650f82e8 Re-add final dot 2023-05-16 10:57:26 +02:00
ManyTheFish
4d691d071a Change double-quotes by back-ticks in sort error message 2023-05-15 11:10:36 +02:00
ManyTheFish
23d1c86825 Re-introduce the sort error message fix 2023-05-15 11:07:23 +02:00
Louis Dureuil
732c52093d
Processing time without autobatching implementation 2023-05-03 17:41:48 +02:00
ManyTheFish
37489fd495 Return an internal error in the case of matching word is invalid 2023-03-01 19:05:16 +01:00
bors[bot]
c88c3637b4
Merge #3461
3461: Bring v1 changes into main r=curquiza a=Kerollmops

Also bring back changes in milli (the remote repository) into main done during the pre-release

Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Philipp Ahlner <philipp@ahlner.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-02-07 11:27:27 +00:00
Guillaume Mourier
65a3086cf1
fix test 2023-02-02 12:27:58 +01:00
Tamo
de3c4f1986 throw an error on unknown fields specified in the _geo field 2023-01-24 12:23:24 +01:00
Louis Dureuil
4fd6fd9bef
Indicate filterable attributes when the user set a non filterable attribute in facet distributions 2023-01-19 12:25:18 +01:00
Louis Dureuil
be9786bed9
Change primary key inference error messages 2023-01-05 10:40:09 +01:00
Louis Dureuil
402dcd6b2f
Simplify primary key inference 2022-12-21 15:13:38 +01:00
Kerollmops
6603437cb1
Introduce an indexation abortion function when indexing documents 2022-10-17 17:28:03 +02:00
Irevoire
e96b852107
bump heed 2022-08-17 17:05:50 +02:00
Loïc Lecrenier
dea00311b6 Add type annotations to remove compiler error 2022-08-16 09:19:30 +02:00
Kerollmops
2eec290424
Check the validity of the latitute and longitude numbers 2022-07-12 15:14:06 +02:00
Kerollmops
0bbcc7b180
Expose the DocumentId struct to be sure to inject the generated ids 2022-07-12 15:14:06 +02:00
Kerollmops
6a0a0ae94f
Make the Transform read from an EnrichedDocumentsBatchReader 2022-07-12 14:55:52 +02:00