122 Commits

Author SHA1 Message Date
Tamo c5322df519
Revert "Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1"" 2024-03-20 10:08:28 +01:00
Tamo 567194b925 Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1"
This reverts commit bd74cce86a, reversing
changes made to d2f77e88bd.
2024-03-19 16:56:21 +01:00
Clément Renault 306b25ad3a
Move the searchForFacetValues struct into a dedicated module 2024-03-13 10:24:21 +01:00
Clément Renault d3a95ea2f6
Introduce a new OrderByMap struct to simplify the sort by usage 2024-03-12 13:56:56 +01:00
Louis Dureuil 217105b7da
hybrid search uses semantic ratio, error handling 2023-12-14 16:08:42 +01:00
Louis Dureuil 65e49b7092
Remove stuff, add distribution shift (WIP) 2023-12-14 16:08:38 +01:00
Louis Dureuil dde3a04679
WIP arroy integration 2023-12-14 16:07:49 +01:00
Louis Dureuil 13c2c6c16b
Small commit to add hybrid search and autoembedding 2023-12-14 16:07:48 +01:00
Clément Renault 0d4482625a
Make the changes to use heed v0.20-alpha.6 2023-11-23 11:43:58 +01:00
Kerollmops c53841e166
Accept the null JSON value as the value of _vectors 2023-08-14 16:03:55 +02:00
ManyTheFish 35758db9ec Truncate the normalized long facets used in search for facet value 2023-08-08 16:38:30 +02:00
Clément Renault df528b41d8
Normalize for the search the facets values 2023-07-20 17:57:07 +02:00
Kerollmops 34b2e98fe9
Expose a sortFacetValuesBy parameter to the user 2023-06-29 14:33:00 +02:00
Clément Renault 55c17aa38b
Rename the SearchForFacetValues struct 2023-06-28 15:01:50 +02:00
Clément Renault 93f30e65a9
Return the correct response JSON object from the facet-search route 2023-06-28 14:58:42 +02:00
Kerollmops c34de05106
Introduce the SearchForFacetValue struct 2023-06-28 14:58:41 +02:00
Clément Renault ebad1f396f
Remove the useless euclidean distance implementation 2023-06-27 12:32:43 +02:00
Kerollmops 66b8cfd8c8
Introduce a way to store the HNSW on multiple LMDB entries 2023-06-27 12:32:42 +02:00
Kerollmops 7aa1275337
Display the _semanticSimilarity even if the `_vectors` field is not displayed 2023-06-27 12:32:41 +02:00
Kerollmops 737aec1705
Expose an _semanticSimilarity as a dot product in the documents 2023-06-27 12:32:41 +02:00
Kerollmops 5c5a4e075d
Make clippy happy 2023-06-27 12:32:41 +02:00
Kerollmops ab9f2269aa
Normalize the vectors during indexation and search 2023-06-27 12:32:41 +02:00
Kerollmops 23eaaf1001
Change the name of the distance module 2023-06-27 12:32:39 +02:00
Kerollmops c79e82c62a
Move back to the hnsw crate
This reverts commit 7a4b6c065482f988b01298642f4c18775503f92f.
2023-06-27 12:32:39 +02:00
Kerollmops 268a9ef416
Move to the hgg crate 2023-06-27 12:32:38 +02:00
Clément Renault 4571e512d2
Store the vectors in an HNSW in LMDB 2023-06-27 12:32:38 +02:00
Louis Dureuil c0fca6f884
Add score_details 2023-06-22 12:39:14 +02:00
Loïc Lecrenier 8628a0c856 Remove docid_word_positions_db + fix deletion bug
That would happen when a word was deleted from all exact attributes
but not all regular attributes.
2023-06-07 10:52:50 +02:00
Loïc Lecrenier 48f5bb1693 Implements the geo-sort ranking rule 2023-04-29 11:02:16 +02:00
Loïc Lecrenier 1f813a6f3b Simplify implementation of the detailed (=visual) logger 2023-04-12 16:32:53 +02:00
Loïc Lecrenier e7bb8c940f Merge branch 'search-refactor-highlighter' into search-refactor-highlighter-merged 2023-04-11 12:22:34 +02:00
Loïc Lecrenier a81165f0d8 Merge remote-tracking branch 'origin/main' into search-refactor 2023-04-07 10:15:55 +02:00
ManyTheFish 9c5f64769a Integrate the new Highlighter in the search 2023-04-06 13:58:56 +02:00
ManyTheFish efea1e5837 Fix facet normalization 2023-03-29 12:02:24 +02:00
Louis Dureuil 9b83b1deb0
Expose SearchLogger trait 2023-03-27 17:49:18 +02:00
Loïc Lecrenier 862714a18b Remove criterion_implementation_strategy param of Search 2023-03-23 09:44:12 +01:00
Loïc Lecrenier 9b2653427d Split position DB into fid and relative position DB 2023-03-23 09:22:01 +01:00
Loïc Lecrenier fbb1ba3de0 Cargo fmt 2023-03-20 09:41:56 +01:00
Loïc Lecrenier 8b4e07e1a3 WIP 2023-03-20 09:41:56 +01:00
Loïc Lecrenier 4e266211bf Small code reorganisation 2023-03-20 09:41:56 +01:00
Loïc Lecrenier 57fa689131 Cargo fmt 2023-03-20 09:41:56 +01:00
Loïc Lecrenier c27ea2677f Rewrite cheapest path algorithm and empty path cache
It is now much simpler and has much better performance.
2023-03-20 09:41:56 +01:00
Loïc Lecrenier 600e3dd1c5 Remove warnings 2023-03-20 09:41:56 +01:00
Loïc Lecrenier 6c659dc12f Use MiMalloc in milli tests 2023-03-20 09:41:37 +01:00
Loïc Lecrenier 229405aeb9 Choose implementation strategy of criterion at runtime 2022-12-21 09:29:39 +01:00
Gregory Conrad 50954d31fa feat: Re-export Span and Token to milli:: 2022-12-03 13:37:33 -05:00
bors[bot] 5e754b3ee0
Merge #708
708: Reduce memory usage of the MatchingWords structure r=ManyTheFish a=loiclec

# Pull Request

## Related issue
Fixes (partially) https://github.com/meilisearch/meilisearch/issues/3115 

## What does this PR do?
1. Reduces the memory usage caused by the creation of a 10-word query tree by 20x. 
   This is done by deduplicating the `MatchingWord` values, which are heavy because of their inner DFA. The deduplication works by wrapping each `MatchingWord` in a reference-counted box and using a hash map to determine whether a `MatchingWord` DFA already exists for a certain signature, or whether a new one needs to be built.
 
2. Avoids the worst-case scenario of creating a `MatchingWord` for extremely long words that cannot be indexed by milli.
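
The two points above can be illustrated with a minimal sketch. This is not the code from the PR: the `MatchingWord` fields, the `Signature` key, the `MatchingWordCache` type, and the `ASSUMED_MAX_WORD_LENGTH` value are all hypothetical stand-ins, and the real `MatchingWord` holds a DFA rather than plain fields. The sketch only shows the general idea of sharing one reference-counted value per signature through a hash map, and of refusing to build a value for an over-long word.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical stand-in for milli's `MatchingWord`.
/// The real type is expensive to build because it contains a DFA.
#[allow(dead_code)]
#[derive(Debug)]
struct MatchingWord {
    word: String,
    typo: u8,
    prefix: bool,
}

/// Hypothetical signature used as the deduplication key.
type Signature = (String, u8, bool);

/// Illustrative limit only; the actual limit used by milli is not stated here.
const ASSUMED_MAX_WORD_LENGTH: usize = 250;

/// Cache that hands out shared, reference-counted `MatchingWord` values.
#[derive(Default)]
struct MatchingWordCache {
    all: HashMap<Signature, Rc<MatchingWord>>,
}

impl MatchingWordCache {
    /// Returns the shared value for this signature, building it only once.
    /// Returns `None` for words longer than the assumed length limit.
    fn insert(&mut self, word: &str, typo: u8, prefix: bool) -> Option<Rc<MatchingWord>> {
        if word.len() > ASSUMED_MAX_WORD_LENGTH {
            return None;
        }
        let key = (word.to_string(), typo, prefix);
        let entry = self.all.entry(key).or_insert_with(|| {
            // Only reached when no value exists yet for this signature.
            Rc::new(MatchingWord { word: word.to_string(), typo, prefix })
        });
        Some(Rc::clone(entry))
    }
}
```

Under these assumptions, calling `insert` twice with the same signature returns two `Rc` handles to the same allocation, so the expensive part is built once however often the word appears in the query tree.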

Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
2022-11-30 17:47:34 +00:00
Loïc Lecrenier 8d0ace2d64 Avoid creating a MatchingWord for words that exceed the length limit 2022-11-28 10:20:13 +01:00
Gregory Conrad 935a724c57 revert: Revert pass by reference API change 2022-11-24 10:08:23 -05:00
Gregory Conrad 7c0e544839 feat: Add all_obkv_to_json function 2022-11-23 21:18:58 -05:00