Commit Graph

2104 Commits

Author SHA1 Message Date
Louis Dureuil
65d0c32aa7
Allow overriding OpenAI's url 2024-07-16 13:39:00 +02:00
Louis Dureuil
82647bcded
When retrieveVectors is true, retrieve _vectors.embedder even if there are no vector for that embedder 2024-07-16 13:39:00 +02:00
Louis Dureuil
e83da00446
Milli changes to match to allow for more flexible lifetimes 2024-07-11 16:29:35 +02:00
Louis Dureuil
7fb3e378ff
Do not fail sort comparisons when the field name or target point are different 2024-07-11 16:28:14 +02:00
meili-bors[bot]
29b44e5541
Merge #4626
4626: Edit Documents with Rhai r=ManyTheFish a=Kerollmops

This PR introduces a first version of [the _Update Documents with Function_ (internal)](https://www.notion.so/meilisearch/Update-Documents-by-Function-45f87b13e61c4435b73943768a490808). It uses [the Rhai programming language](https://rhai.rs/) to let users express the modifications they want apply.

You can read more about the way to use this functions on [the Usage PRD Page](https://meilisearch.notion.site/Edit-Documents-with-Rhai-0cff8fea7655436592e7c8a6de932062?pvs=25). The [prototype is available](https://github.com/meilisearch/meilisearch/actions/runs/9038384483) through Docker by using the following command:

```
docker run -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:prototype-edit-documents-with-rhai-3
```

## TODO
 - [x] Support the `DocumentEdition` task in dumps.
 - [x] Remove the unwraps and panics.
 - [x] Improve error codes for the `function` parameter.
 - [x] [Update Rhai to v1.19.0](https://github.com/rhaiscript/rhai/releases/tag/v1.19.0) 🚀
 - [x] Make it an experimental feature (only restrict the HTTP calls).
 - [x] It must be possible not to send a context.
 - [x] Rebase on main.
 - [x] Check that the script cannot do any io.
 - [x] ~Introduce a `Documents.edit` action or~ require the `Documents.all` action.
 - [x] Change the `editionCode` to the clearer `function` field name in the tasks.
 - [x] Support a user provided context and maybe more (but keep function execution isolated for reproducibility).
 - [x] Support deleting documents when the `doc` is `()` (nil, null).
 - [x] Support canceling document edition.
 - [x] Multithread document edition by using rayon (and [rayon-par-bridge](https://docs.rs/rayon-par-bridge/latest/rayon_par_bridge/)).
 - [x] Limit the number of instruction by function execution.
 - [ ] ~Expose the limit of instructions in the settings.~ Not sure, in fact.
 - [x] Ignore unmodified documents in the tasks count.
 - [x] Make the `filter` field optional (not forced to be `null`).

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-11 09:02:55 +00:00
Clément Renault
6e80364c50
Apply review comments 2024-07-11 11:00:27 +02:00
Clément Renault
3bac22fd87
We do not do intersections with the universe when it is related to cache 2024-07-10 16:49:36 +02:00
Clément Renault
ce61cb7fe6
Simplify and speedup an intersection pass 2024-07-10 16:49:36 +02:00
Clément Renault
1693d1a311
Simplify the check to decide to stop a loop 2024-07-10 16:49:36 +02:00
Clément Renault
febea735ca
Remove the unused universe parameter from resolve_negative_phrases 2024-07-10 16:49:36 +02:00
Clément Renault
93ba051094
Remove the invalid get_phrases_docids universe parameter 2024-07-10 16:49:35 +02:00
Clément Renault
cd7a20fa32
Make it work by avoid storing invalid stuff in the cache 2024-07-10 16:49:35 +02:00
Clément Renault
41f51adbec
Do less useless intersections 2024-07-10 16:49:35 +02:00
Clément Renault
0ca1a4e805
Always do the intersections with the universe 2024-07-10 16:49:34 +02:00
Clément Renault
50a7393c55
Modify the compute_query_term_subset_docids function to accept the universe 2024-07-10 16:49:34 +02:00
Clément Renault
837274f853
Restrict even more the Rhai engine 2024-07-10 16:30:18 +02:00
Clément Renault
aace587dd1
Create errors for the internal processing ones 2024-07-10 16:29:18 +02:00
Clément Renault
81ec0abad1
Use the new rayon-par-bridge library 2024-07-10 16:29:04 +02:00
Clément Renault
b67d385cf0
Parallelize the edition functions 2024-07-10 16:28:54 +02:00
Clément Renault
2eae2015d7
Support aborting documents edition by function 2024-07-10 16:28:15 +02:00
Clément Renault
33fa17bf12
Support deleting documents with functions 2024-07-10 16:28:15 +02:00
Clément Renault
400e6b93ce
Support user-provided context for documents edition 2024-07-10 16:28:15 +02:00
Clément Renault
f4add93043
Limit the number of script operations 2024-07-10 16:28:14 +02:00
Clément Renault
2fae96ac14
Show the actual number of actually edited documents 2024-07-10 16:28:14 +02:00
Clément Renault
45af18ae9c
Check the Rhai syntax before accepting the script 2024-07-10 16:28:13 +02:00
Clément Renault
2d97164d9f
It works perfectly with some Rhai 2024-07-10 16:28:13 +02:00
Clément Renault
efc156a4a4
Executing Lua works correctly 2024-07-10 16:27:36 +02:00
meili-bors[bot]
2099b4f0dd
Merge #4786
4786: Update dependencies r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes #4753

## What does this PR do?
- Update all dependencies except rustls
- [x] Release charabia
- [x] Update charabia
- [x] Double check that the docker build works after updating charabia



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-10 13:23:54 +00:00
Tamo
4d5005b01a make clippy happy 2024-07-10 10:06:59 +02:00
hanbings
0a40a98bb6
Make milli use edition 2021 (#4770)
* Make milli use edition 2021

* Add lifetime annotations to milli.

* Run cargo fmt
2024-07-09 17:25:39 +02:00
Tamo
cd46ebd6b5 remove insta deprecating 2024-07-08 18:38:05 +02:00
Louis Dureuil
128e6c7502
Search: spans with a finer granularity 2024-07-02 16:13:53 +02:00
ManyTheFish
015d90a962 merge main 2024-07-01 11:50:36 +02:00
Louis Dureuil
e53de15b8e
Fix behavior of limit and offset for hybrid search when keyword results are returned early
The test is fixed
2024-06-27 14:25:33 +02:00
Tamo
ce08dc509b add more tests and improve the location of the error 2024-06-27 11:51:45 +02:00
Tamo
1daaed163a Make _vectors.:embedding.regenerate mandatory + tests + error messages 2024-06-27 11:04:58 +02:00
meili-bors[bot]
7e3c306c54
Merge #4725
4725: Store primary key as String when Number exceeds i64 range r=irevoire a=JWSong

# Pull Request

## Related issue
Fixes #4696 

## What does this PR do?
- When a Number value exceeding the range of i64 is received as a primary key, it will be stored as a String.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: JWSong <thdwjddn123@gmail.com>
2024-06-26 07:06:04 +00:00
JWSong
dcdc83946f accept large number as string 2024-06-25 21:41:47 +09:00
meili-bors[bot]
3c4c46377b
Merge #4665
4665: Add missing Korean support r=ManyTheFish a=junhochoi

Some configuration is missing `korean` features and add a test case in `milli/src/search/mod.rs`.

# Pull Request

## Related issue

#3443 #3882 

## What does this PR do?
- Improvement on enabling Korean support

Inspired by the work (#3882) I tried to enable Korean features but have found some missing configurations.
This PR is add those missing configs (mostly Cargo.toml) and added one test case.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Junho Choi <jh.choi@catenoid.net>
2024-06-25 11:51:21 +00:00
Louis Dureuil
d75e0098c7
Fixes for Rust v1.79 2024-06-25 11:16:06 +02:00
Junho Choi
2e0ff56f3f Add missing Korean support
Some configuration is missing `korean` features and
add a test case in `milli/src/search/mod.rs`.
2024-06-25 12:45:21 +09:00
Tamo
1693332cab Update arroy and always build the tree that need to be built 2024-06-24 10:14:03 +02:00
meili-bors[bot]
ddd564665b
Merge #4713
4713: Speed up facet distribution r=ManyTheFish a=Kerollmops

This PR is akin to #4682, but this time, the same logic is applied to the facets. Bitmaps are not decoded, and we do an intersection on the bytes with the search candidates instead of materializing the RoaringBitmap to destroy it just after the operation.

A prospect raised some slow requests when performing facet searches, and I found out that the disk optimization intersection wasn't performed on the facets.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-06-24 05:23:46 +00:00
Clément Renault
9736e16a88
Make clippy happy 2024-06-20 13:02:44 +02:00
Clément Renault
6fa4da8ae7
Improve facet distribution speed in count mode 2024-06-20 12:58:51 +02:00
Clément Renault
19d7cdc20d
Improve facet distribution speed in lexico mode 2024-06-20 12:57:08 +02:00
Louis Dureuil
a04041c8f2
Only spawn the pool once 2024-06-19 16:25:33 +02:00
meili-bors[bot]
e580d6b98f
Merge #4693
4693: Introduce distinct attributes at search time r=irevoire a=Kerollmops

This PR fixes #4611.

### To Do
- [x] Remove the `distinguishableAttributes` settings (not even a commit about that).
- [x] Use the `filterableAttributes` to be able to use the `distinct` parameter at search.
- [x] Work on the errors and make tests.

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-18 07:45:03 +00:00
Tamo
43875e6758 fix bug around nested fields 2024-06-17 15:59:30 +02:00
meili-bors[bot]
e9bf4c43a4
Merge #4649
4649: Don't store the vectors in the documents database r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4607

## What does this PR do?
- Ensure that anything falling under `_vectors` is NOT searchable, filterable or sortable
- [x] per embedder, add a roaring bitmap of documents that provide "userProvided" embeddings
- [x] in the indexing process in extract_vector_points, set the bit corresponding to the document depending on the "userProvided" subfield in the _vectors field.
- [x] in the document DB in typed chunks, when writing the _vectors field, remove all keys corresponding to an embedder

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-17 12:32:03 +00:00