Commit Graph

55 Commits

Author SHA1 Message Date
Louis Dureuil
58690dfb19
Fix tests compilation after changes to ExternalDocumentsIds API 2023-10-30 13:34:07 +01:00
Clément Renault
dfab6293c9
Use an LMDB database to store the external documents ids 2023-10-30 11:41:23 +01:00
Louis Dureuil
113527f466
Remove soft-deleted related methods from Index 2023-10-30 11:41:22 +01:00
Louis Dureuil
59f88c14b3
Simplify facet update after removing Index::faceted_documents_ids 2023-10-30 11:39:29 +01:00
Louis Dureuil
14832cb324
Remove Index::faceted_documents_ids 2023-10-30 11:37:32 +01:00
ManyTheFish
b45c36cd71 Merge branch 'main' into tmp-release-v1.3.0 2023-08-01 15:05:17 +02:00
Clément Renault
df528b41d8
Normalize for the search the facets values 2023-07-20 17:57:07 +02:00
Kerollmops
eef95de30e
First iteration on exposing puffin profiling 2023-07-18 17:38:13 +02:00
Clément Renault
efbe7ce78b
Clean the facet string FSTs when we clear the documents 2023-06-28 15:36:32 +02:00
Clément Renault
15a4c05379
Store the facet string values in multiple FSTs 2023-06-28 14:58:41 +02:00
Kerollmops
c79e82c62a
Move back to the hnsw crate
This reverts commit 7a4b6c065482f988b01298642f4c18775503f92f.
2023-06-27 12:32:39 +02:00
Kerollmops
268a9ef416
Move to the hgg crate 2023-06-27 12:32:38 +02:00
Clément Renault
4571e512d2
Store the vectors in an HNSW in LMDB 2023-06-27 12:32:38 +02:00
Loïc Lecrenier
8628a0c856 Remove docid_word_positions_db + fix deletion bug
That would happen when a word was deleted from all exact attributes
but not all regular attributes.
2023-06-07 10:52:50 +02:00
Louis Dureuil
90bc230820
Merge remote-tracking branch 'origin/main' into search-refactor
Conflicts | resolution
----------|-----------
Cargo.lock | added mimalloc
Cargo.toml |  took origin/main version
milli/src/search/criteria/exactness.rs | deleted after checking it was only clippy changes
milli/src/search/query_tree.rs | deleted after checking it was only clippy changes
2023-05-03 12:19:06 +02:00
Loïc Lecrenier
9b2653427d Split position DB into fid and relative position DB 2023-03-23 09:22:01 +01:00
Clément Renault
ea016d97af
Implementing an IS EMPTY filter 2023-03-15 14:12:34 +01:00
Clément Renault
9287858997
Introduce a new facet_id_is_null_docids database in the index 2023-03-08 16:14:00 +01:00
f3r10
b216ddba63 Delete and clear data from the new database 2023-01-31 11:28:05 +01:00
Loïc Lecrenier
9026867d17 Give same interface to bulk and incremental facet indexing types
+ cargo fmt, oops, sorry for the bad history :(
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
3d145d7f48 Merge the two <facetttype>_faceted_documents_ids methods into one 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
264a04922d Add prefix_word_pair_proximity database
Similar to the word_prefix_pair_proximity one but instead the keys are:
(proximity, prefix, word2)
2022-10-18 10:37:34 +02:00
Loïc Lecrenier
acff17fb88 Simplify indexing tests 2022-08-04 12:03:13 +02:00
Loïc Lecrenier
07003704a8 Merge branch 'filter/field-exist' 2022-07-21 14:51:41 +02:00
Loïc Lecrenier
453d593ce8 Add a database containing the docids where each field exists 2022-07-19 10:07:33 +02:00
Kerollmops
399eec5c01
Fix the indexation tests 2022-07-12 14:55:51 +02:00
Tamo
3b309f654a
Fasten the document deletion
When a document deletion occurs, instead of deleting the document we mark it as deleted
in the new “soft deleted” bitmap. It is then removed from the search, and all the other
endpoints.
2022-07-05 15:30:33 +02:00
Irevoire
4f3ce6d9cd
nested fields 2022-04-07 16:58:46 +02:00
ad hoc
6dd2e4ffbd
introduce exact_word_prefix database in index 2022-04-04 20:54:03 +02:00
ad hoc
0a77be4ec0
introduce exact_word_docids db 2022-04-04 20:54:02 +02:00
Irevoire
48542ac8fd
get rid of chrono in favor of time 2022-02-15 11:41:55 +01:00
Marin Postma
0c84a40298 document batch support
reusable transform

rework update api

add indexer config

fix tests

review changes

Co-authored-by: Clément Renault <clement@meilisearch.com>

fmt
2022-01-19 12:40:20 +01:00
Marin Postma
6eb47ab792 remove update_id in UpdateBuilder 2021-11-16 13:07:04 +01:00
many
3296bb243c
Simplify word level position DB into a word position DB 2021-10-05 12:15:02 +02:00
mpostma
aa6c5df0bc Implement documents format
document reader transform

remove update format

support document sequences

fix document transform

clean transform

improve error handling

add documents! macro

fix transform bug

fix tests

remove csv dependency

Add comments on the transform process

replace search cli

fmt

review edits

fix http ui

fix clippy warnings

Revert "fix clippy warnings"

This reverts commit a1ce3cd96e603633dbf43e9e0b12b2453c9c5620.

fix review comments

remove smallvec in transform loop

review edits
2021-09-21 16:58:33 +02:00
Irevoire
ea2f2ecf96
create a new database containing all the documents that were geo-faceted 2021-09-08 17:51:08 +02:00
Irevoire
3b9f1db061
implement the clear of the rtree 2021-09-08 17:51:07 +02:00
Clémentine Urquizar
daef43f504
Rename FieldsDistribution into FieldDistribution 2021-06-21 15:57:41 +02:00
Tamo
969adaefdf
rename fields_distribution in field_distribution 2021-06-17 15:16:20 +02:00
Tamo
9716fb3b36
format the whole project 2021-06-16 18:33:33 +02:00
Kerollmops
312c2d1d8e
Use the Error enum everywhere in the project 2021-06-14 16:58:38 +02:00
many
4ddf008be2
add field id word count database 2021-05-31 16:27:28 +02:00
Clément Renault
3a4a150ef0
Fix the tests and remaining warnings 2021-05-25 11:31:06 +02:00
Clément Renault
bd7b285bae
Split the update side to use the number and the strings facet databases 2021-05-25 11:30:00 +02:00
Clément Renault
837c1041c7
Clear and delete the documents from the facet database 2021-05-25 11:28:36 +02:00
Kerollmops
e65bad16cc
Compute the words prefixes at the end of an update 2021-04-27 14:39:52 +02:00
Kerollmops
f713828406
Implement the clear and delete documents for the word-level-positions database 2021-04-27 14:25:34 +02:00
Kerollmops
b0a417f342
Introduce the word_level_position_docids Index database 2021-04-27 14:25:34 +02:00
Alexey Shekhirin
2658c5c545
feat(index): update fields distribution in clear & delete operations
fixes after review

bump the version of the tokenizer

implement a first version of the stop_words

The front must provide a BTreeSet containing the stop words
The stop_words are set at None if an empty Set is provided
add the stop-words in the http-ui interface

Use maplit in the test
and remove all the useless drop(rtxn) at the end of all tests

Integrate the stop_words in the querytree

remove the stop_words from the querytree except if it was a prefix or a typo

more fixes after review
2021-04-01 19:12:35 +03:00
mpostma
615fe095e1
update index updated at on index writes 2021-03-15 14:05:47 +01:00