Louis Dureuil
58690dfb19
Fix tests compilation after changes to ExternalDocumentsIds API
2023-10-30 13:34:07 +01:00
Clément Renault
dfab6293c9
Use an LMDB database to store the external documents ids
2023-10-30 11:41:23 +01:00
Louis Dureuil
113527f466
Remove soft-deleted related methods from Index
2023-10-30 11:41:22 +01:00
Louis Dureuil
59f88c14b3
Simplify facet update after removing Index::faceted_documents_ids
2023-10-30 11:39:29 +01:00
Louis Dureuil
14832cb324
Remove Index::faceted_documents_ids
2023-10-30 11:37:32 +01:00
ManyTheFish
b45c36cd71
Merge branch 'main' into tmp-release-v1.3.0
2023-08-01 15:05:17 +02:00
Clément Renault
df528b41d8
Normalize for the search the facets values
2023-07-20 17:57:07 +02:00
Kerollmops
eef95de30e
First iteration on exposing puffin profiling
2023-07-18 17:38:13 +02:00
Clément Renault
efbe7ce78b
Clean the facet string FSTs when we clear the documents
2023-06-28 15:36:32 +02:00
Clément Renault
15a4c05379
Store the facet string values in multiple FSTs
2023-06-28 14:58:41 +02:00
Kerollmops
c79e82c62a
Move back to the hnsw crate
...
This reverts commit 7a4b6c065482f988b01298642f4c18775503f92f.
2023-06-27 12:32:39 +02:00
Kerollmops
268a9ef416
Move to the hgg crate
2023-06-27 12:32:38 +02:00
Clément Renault
4571e512d2
Store the vectors in an HNSW in LMDB
2023-06-27 12:32:38 +02:00
Loïc Lecrenier
8628a0c856
Remove docid_word_positions_db + fix deletion bug
...
That would happen when a word was deleted from all exact attributes
but not all regular attributes.
2023-06-07 10:52:50 +02:00
Louis Dureuil
90bc230820
Merge remote-tracking branch 'origin/main' into search-refactor
...
Conflicts | resolution
----------|-----------
Cargo.lock | added mimalloc
Cargo.toml | took origin/main version
milli/src/search/criteria/exactness.rs | deleted after checking it was only clippy changes
milli/src/search/query_tree.rs | deleted after checking it was only clippy changes
2023-05-03 12:19:06 +02:00
Loïc Lecrenier
9b2653427d
Split position DB into fid and relative position DB
2023-03-23 09:22:01 +01:00
Clément Renault
ea016d97af
Implementing an IS EMPTY filter
2023-03-15 14:12:34 +01:00
Clément Renault
9287858997
Introduce a new facet_id_is_null_docids database in the index
2023-03-08 16:14:00 +01:00
f3r10
b216ddba63
Delete and clear data from the new database
2023-01-31 11:28:05 +01:00
Loïc Lecrenier
9026867d17
Give same interface to bulk and incremental facet indexing types
...
+ cargo fmt, oops, sorry for the bad history :(
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
3d145d7f48
Merge the two <facetttype>_faceted_documents_ids methods into one
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
264a04922d
Add prefix_word_pair_proximity database
...
Similar to the word_prefix_pair_proximity one but instead the keys are:
(proximity, prefix, word2)
2022-10-18 10:37:34 +02:00
Loïc Lecrenier
acff17fb88
Simplify indexing tests
2022-08-04 12:03:13 +02:00
Loïc Lecrenier
07003704a8
Merge branch 'filter/field-exist'
2022-07-21 14:51:41 +02:00
Loïc Lecrenier
453d593ce8
Add a database containing the docids where each field exists
2022-07-19 10:07:33 +02:00
Kerollmops
399eec5c01
Fix the indexation tests
2022-07-12 14:55:51 +02:00
Tamo
3b309f654a
Fasten the document deletion
...
When a document deletion occurs, instead of deleting the document we mark it as deleted
in the new “soft deleted” bitmap. It is then removed from the search, and all the other
endpoints.
2022-07-05 15:30:33 +02:00
Irevoire
4f3ce6d9cd
nested fields
2022-04-07 16:58:46 +02:00
ad hoc
6dd2e4ffbd
introduce exact_word_prefix database in index
2022-04-04 20:54:03 +02:00
ad hoc
0a77be4ec0
introduce exact_word_docids db
2022-04-04 20:54:02 +02:00
Irevoire
48542ac8fd
get rid of chrono in favor of time
2022-02-15 11:41:55 +01:00
Marin Postma
0c84a40298
document batch support
...
reusable transform
rework update api
add indexer config
fix tests
review changes
Co-authored-by: Clément Renault <clement@meilisearch.com>
fmt
2022-01-19 12:40:20 +01:00
Marin Postma
6eb47ab792
remove update_id in UpdateBuilder
2021-11-16 13:07:04 +01:00
many
3296bb243c
Simplify word level position DB into a word position DB
2021-10-05 12:15:02 +02:00
mpostma
aa6c5df0bc
Implement documents format
...
document reader transform
remove update format
support document sequences
fix document transform
clean transform
improve error handling
add documents! macro
fix transform bug
fix tests
remove csv dependency
Add comments on the transform process
replace search cli
fmt
review edits
fix http ui
fix clippy warnings
Revert "fix clippy warnings"
This reverts commit a1ce3cd96e603633dbf43e9e0b12b2453c9c5620.
fix review comments
remove smallvec in transform loop
review edits
2021-09-21 16:58:33 +02:00
Irevoire
ea2f2ecf96
create a new database containing all the documents that were geo-faceted
2021-09-08 17:51:08 +02:00
Irevoire
3b9f1db061
implement the clear of the rtree
2021-09-08 17:51:07 +02:00
Clémentine Urquizar
daef43f504
Rename FieldsDistribution into FieldDistribution
2021-06-21 15:57:41 +02:00
Tamo
969adaefdf
rename fields_distribution in field_distribution
2021-06-17 15:16:20 +02:00
Tamo
9716fb3b36
format the whole project
2021-06-16 18:33:33 +02:00
Kerollmops
312c2d1d8e
Use the Error enum everywhere in the project
2021-06-14 16:58:38 +02:00
many
4ddf008be2
add field id word count database
2021-05-31 16:27:28 +02:00
Clément Renault
3a4a150ef0
Fix the tests and remaining warnings
2021-05-25 11:31:06 +02:00
Clément Renault
bd7b285bae
Split the update side to use the number and the strings facet databases
2021-05-25 11:30:00 +02:00
Clément Renault
837c1041c7
Clear and delete the documents from the facet database
2021-05-25 11:28:36 +02:00
Kerollmops
e65bad16cc
Compute the words prefixes at the end of an update
2021-04-27 14:39:52 +02:00
Kerollmops
f713828406
Implement the clear and delete documents for the word-level-positions database
2021-04-27 14:25:34 +02:00
Kerollmops
b0a417f342
Introduce the word_level_position_docids Index database
2021-04-27 14:25:34 +02:00
Alexey Shekhirin
2658c5c545
feat(index): update fields distribution in clear & delete operations
...
fixes after review
bump the version of the tokenizer
implement a first version of the stop_words
The front must provide a BTreeSet containing the stop words
The stop_words are set at None if an empty Set is provided
add the stop-words in the http-ui interface
Use maplit in the test
and remove all the useless drop(rtxn) at the end of all tests
Integrate the stop_words in the querytree
remove the stop_words from the querytree except if it was a prefix or a typo
more fixes after review
2021-04-01 19:12:35 +03:00
mpostma
615fe095e1
update index updated at on index writes
2021-03-15 14:05:47 +01:00