ManyTheFish
bb7a503e5d
Compute prefix databases
...
We now compute the prefix FST and a prefix delta in the Merger thread.
After all the databases are written, the main thread recomputes the prefix databases based on the prefix delta, without needing any grenad temporary file.
2024-10-01 09:57:06 +02:00
Clément Renault
c1557734dc
Use the GlobalFieldsIdsMap everywhere and write it to disk
...
Co-authored-by: Dureuill <louis@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-03 12:01:01 +02:00
Clément Renault
794ebcd582
Replace grenad with the new grenad various-improvement branch
2024-08-30 11:53:59 +02:00
Clément Renault
0c57cf7565
Replace obkv with the temporary new version of it
2024-08-30 11:53:58 +02:00
ManyTheFish
70d71581ee
fix clippy
2024-07-25 10:52:56 +02:00
ManyTheFish
04fa44e7eb
Implement localized attributes settings
2024-07-25 10:51:27 +02:00
Clément Renault
45af18ae9c
Check the Rhai syntax before accepting the script
2024-07-10 16:28:13 +02:00
hanbings
0a40a98bb6
Make milli use edition 2021 (#4770)
...
* Make milli use edition 2021
* Add lifetime annotations to milli.
* Run cargo fmt
2024-07-09 17:25:39 +02:00
Louis Dureuil
ca6cc4654b
Add similar route
2024-05-28 15:28:19 +02:00
meili-bors[bot]
19acc65ad2
Merge #4646
...
4646: Reduce `Transform`'s disk usage r=Kerollmops a=Kerollmops
This PR implements what is described in #4485. It reduces the number of disk writes and disk usage.
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-05-23 16:06:50 +00:00
Clément Renault
fe17c0f52e
Construct the minimal OBKVs according to the settings diff
2024-05-23 11:23:57 +02:00
Louis Dureuil
52d9cb6e5a
Refactor vector indexing
...
- use the parsed_vectors module
- only parse `_vectors` once per document, instead of once per embedder per document
2024-05-20 10:36:17 +02:00
Tamo
c22460045c
Stops returning an option in the internal searchable fields
2024-05-14 17:00:02 +02:00
Clément Renault
d4aeff92d0
Introduce the ThreadPoolNoAbort wrapper
2024-04-24 16:40:12 +02:00
Tamo
19137be0ea
increase the default search time budget from 150ms to 1.5s
2024-04-16 18:09:49 +02:00
Louis Dureuil
6ebb6b55a6
Lazily embed, don't fail hybrid search on embedding failure
2024-04-04 15:58:17 +02:00
Tamo
b8cda6c300
fix the search cutoff and add a test
2024-03-19 10:35:47 +01:00
Tamo
d1db495119
add a setting for the search cutoff
2024-03-19 10:28:23 +01:00
Tamo
4a467739cd
implements a first version of the cutoff without settings
2024-03-19 10:28:21 +01:00
Clément Renault
306b25ad3a
Move the searchForFacetValues struct into a dedicated module
2024-03-13 10:24:21 +01:00
Clément Renault
d3a95ea2f6
Introduce a new OrderByMap struct to simplify the sort by usage
2024-03-12 13:56:56 +01:00
Louis Dureuil
217105b7da
hybrid search uses semantic ratio, error handling
2023-12-14 16:08:42 +01:00
Louis Dureuil
65e49b7092
Remove stuff, add distribution shift (WIP)
2023-12-14 16:08:38 +01:00
Louis Dureuil
dde3a04679
WIP arroy integration
2023-12-14 16:07:49 +01:00
Louis Dureuil
13c2c6c16b
Small commit to add hybrid search and autoembedding
2023-12-14 16:07:48 +01:00
Clément Renault
0d4482625a
Make the changes to use heed v0.20-alpha.6
2023-11-23 11:43:58 +01:00
Kerollmops
c53841e166
Accept the null JSON value as the value of _vectors
2023-08-14 16:03:55 +02:00
ManyTheFish
35758db9ec
Truncate the normalized long facets used in search for facet value
2023-08-08 16:38:30 +02:00
Clément Renault
df528b41d8
Normalize the facet values for the search
2023-07-20 17:57:07 +02:00
Kerollmops
34b2e98fe9
Expose a sortFacetValuesBy parameter to the user
2023-06-29 14:33:00 +02:00
Clément Renault
55c17aa38b
Rename the SearchForFacetValues struct
2023-06-28 15:01:50 +02:00
Clément Renault
93f30e65a9
Return the correct response JSON object from the facet-search route
2023-06-28 14:58:42 +02:00
Kerollmops
c34de05106
Introduce the SearchForFacetValue struct
2023-06-28 14:58:41 +02:00
Clément Renault
ebad1f396f
Remove the useless euclidean distance implementation
2023-06-27 12:32:43 +02:00
Kerollmops
66b8cfd8c8
Introduce a way to store the HNSW on multiple LMDB entries
2023-06-27 12:32:42 +02:00
Kerollmops
7aa1275337
Display the _semanticSimilarity even if the _vectors field is not displayed
2023-06-27 12:32:41 +02:00
Kerollmops
737aec1705
Expose a _semanticSimilarity as a dot product in the documents
2023-06-27 12:32:41 +02:00
Kerollmops
5c5a4e075d
Make clippy happy
2023-06-27 12:32:41 +02:00
Kerollmops
ab9f2269aa
Normalize the vectors during indexation and search
2023-06-27 12:32:41 +02:00
Kerollmops
23eaaf1001
Change the name of the distance module
2023-06-27 12:32:39 +02:00
Kerollmops
c79e82c62a
Move back to the hnsw crate
...
This reverts commit 7a4b6c065482f988b01298642f4c18775503f92f.
2023-06-27 12:32:39 +02:00
Kerollmops
268a9ef416
Move to the hgg crate
2023-06-27 12:32:38 +02:00
Clément Renault
4571e512d2
Store the vectors in an HNSW in LMDB
2023-06-27 12:32:38 +02:00
Louis Dureuil
c0fca6f884
Add score_details
2023-06-22 12:39:14 +02:00
Loïc Lecrenier
8628a0c856
Remove docid_word_positions_db + fix deletion bug
...
That would happen when a word was deleted from all exact attributes
but not all regular attributes.
2023-06-07 10:52:50 +02:00
Loïc Lecrenier
48f5bb1693
Implements the geo-sort ranking rule
2023-04-29 11:02:16 +02:00
Loïc Lecrenier
1f813a6f3b
Simplify implementation of the detailed (=visual) logger
2023-04-12 16:32:53 +02:00
Loïc Lecrenier
e7bb8c940f
Merge branch 'search-refactor-highlighter' into search-refactor-highlighter-merged
2023-04-11 12:22:34 +02:00
Loïc Lecrenier
a81165f0d8
Merge remote-tracking branch 'origin/main' into search-refactor
2023-04-07 10:15:55 +02:00
ManyTheFish
9c5f64769a
Integrate the new Highlighter in the search
2023-04-06 13:58:56 +02:00