Commit Graph

165 Commits

Author SHA1 Message Date
Irevoire
48542ac8fd
get rid of chrono in favor of time 2022-02-15 11:41:55 +01:00
Marin Postma
0c84a40298
document batch support
reusable transform

rework update api

add indexer config

fix tests

review changes

Co-authored-by: Clément Renault <clement@meilisearch.com>

fmt
2022-01-19 12:40:20 +01:00
Marin Postma
6eb47ab792
remove update_id in UpdateBuilder 2021-11-16 13:07:04 +01:00
marin postma
2e62925a6e
fix tests 2021-10-25 10:26:42 +02:00
many
3296bb243c
Simplify word level position DB into a word position DB 2021-10-05 12:15:02 +02:00
mpostma
aa6c5df0bc
Implement documents format
document reader transform

remove update format

support document sequences

fix document transform

clean transform

improve error handling

add documents! macro

fix transform bug

fix tests

remove csv dependency

Add comments on the transform process

replace search cli

fmt

review edits

fix http ui

fix clippy warnings

Revert "fix clippy warnings"

This reverts commit a1ce3cd96e603633dbf43e9e0b12b2453c9c5620.

fix review comments

remove smallvec in transform loop

review edits
2021-09-21 16:58:33 +02:00
Irevoire
3b7a2cdbce
fix typo
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-20 16:10:39 +02:00
Irevoire
a84f3a8b31
Apply suggestions from code review
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-09 15:09:35 +02:00
Irevoire
ea2f2ecf96
create a new database containing all the documents that were geo-faceted 2021-09-08 17:51:08 +02:00
Irevoire
44d6b6ae9e
Index the geo points 2021-09-08 17:51:07 +02:00
Irevoire
8d9c2c4425
create a new db with getters and setters 2021-09-08 17:51:07 +02:00
many
1d314328f0
Plug new indexer 2021-09-01 16:48:36 +02:00
Clément Renault
89d0758713
Revert "Revert "Sort at query time"" 2021-08-24 11:55:16 +02:00
Clémentine Urquizar
922f9fd4d5
Revert "Sort at query time" 2021-08-20 18:09:17 +02:00
Kerollmops
71602e0f1b
Add the sortable fields into the settings and in the index 2021-08-18 15:04:07 +02:00
Kerollmops
90514e03d1
Fix invalid faceted documents ids buffer size 2021-07-29 15:49:23 +02:00
Kerollmops
b12738cfe9
Use the right DB prefixes to store the faceted fields 2021-07-22 19:18:22 +02:00
Kerollmops
7aa6cc9b04
Do not insert fields in the map when changing the settings 2021-07-22 18:40:12 +02:00
Clément Renault
0227254a65
Return the original string values for the inverted facet index database 2021-07-21 16:59:39 +02:00
Kerollmops
03a01166ba
Display the original facet string value from the linear facet database 2021-07-21 16:59:39 +02:00
Kerollmops
757b2b502a
Remove the FacetValueStringCodec 2021-07-21 16:59:38 +02:00
Kerollmops
838ed1cd32
Use an u16 field id instead of one byte 2021-07-06 11:58:03 +02:00
Kerollmops
91c5d0c042
Use the AlwaysFreePages flag when opening an index 2021-07-05 16:36:13 +02:00
Kerollmops
4bce66d5ff
Make the Index::delete_* method private 2021-06-30 10:07:31 +02:00
Tamo
8d2a0b43ff
run the formatter on the whole project a second time 2021-06-22 15:36:22 +02:00
Clémentine Urquizar
daef43f504
Rename FieldsDistribution into FieldDistribution 2021-06-21 15:57:41 +02:00
Tamo
d08cfda796
convert the field_distribution to a BTreeMap and avoid counting twice the same documents 2021-06-17 18:31:54 +02:00
Tamo
969adaefdf
rename fields_distribution in field_distribution 2021-06-17 15:16:20 +02:00
Tamo
9716fb3b36
format the whole project 2021-06-16 18:33:33 +02:00
Kerollmops
713acc408b
Introduce the primary key to the Settings builder structure 2021-06-16 11:03:36 +02:00
Kerollmops
a7d6930905
Replace the panicking expect by tracked Errors 2021-06-15 11:51:32 +02:00
Kerollmops
28c004aa2c
Prefer using constant for the database names 2021-06-15 11:13:04 +02:00
Kerollmops
312c2d1d8e
Use the Error enum everywhere in the project 2021-06-14 16:58:38 +02:00
Kerollmops
3c304c89d4
Make sure that we generate the faceted database when required 2021-06-02 16:24:58 +02:00
Kerollmops
ff440c1d9d
Introduce the faceted fields method to retrieve those that needs faceting 2021-06-02 16:24:57 +02:00
Kerollmops
2a3f9b32ff
Rename the faceted fields into filterable fields 2021-06-02 16:24:57 +02:00
many
4ddf008be2
add field id word count database 2021-05-31 16:27:28 +02:00
Clément Renault
bd7b285bae
Split the update side to use the number and the strings facet databases 2021-05-25 11:30:00 +02:00
Clément Renault
a56c46b6f1
Explode the string and f64 facet databases into two 2021-05-25 11:28:36 +02:00
Clément Renault
df7a32e3d0
Move the creation date initialization into a function 2021-05-25 11:28:35 +02:00
Alexey Shekhirin
f8d0f5265f
fix(update): fields distribution after documents merge 2021-05-04 22:12:20 +03:00
tamo
d61566787e
provide an iterator over all the documents in a milli index 2021-05-04 11:23:51 +02:00
Alexey Shekhirin
d81c0e8bba
feat(update): disable autogenerate_docids by default 2021-04-30 21:41:34 +03:00
Kerollmops
e65bad16cc
Compute the words prefixes at the end of an update 2021-04-27 14:39:52 +02:00
Kerollmops
b0a417f342
Introduce the word_level_position_docids Index database 2021-04-27 14:25:34 +02:00
Alexey Shekhirin
33860bc3b7
test(update, settings): set & reset synonyms
fixes after review

more fixes after review
2021-04-18 11:24:17 +03:00
Alexey Shekhirin
e39aabbfe6
feat(search, update): synonyms 2021-04-18 11:24:17 +03:00
Marin Postma
9c4660d3d6
add tests 2021-04-15 16:25:56 +02:00
Marin Postma
75464a1baa
review fixes 2021-04-15 16:25:56 +02:00
Marin Postma
2f73fa55ae
add documentation 2021-04-15 16:25:55 +02:00
Marin Postma
45c45e11dd
implement distinct attribute
distinct can return error

facet distinct on numbers

return distinct error

review fixes

make get_facet_value more generic

fixes
2021-04-15 16:25:55 +02:00
Alexey Shekhirin
2658c5c545
feat(index): update fields distribution in clear & delete operations
fixes after review

bump the version of the tokenizer

implement a first version of the stop_words

The front must provide a BTreeSet containing the stop words
The stop_words are set at None if an empty Set is provided
add the stop-words in the http-ui interface

Use maplit in the test
and remove all the useless drop(rtxn) at the end of all tests

Integrate the stop_words in the querytree

remove the stop_words from the querytree except if it was a prefix or a typo

more fixes after review
2021-04-01 19:12:35 +03:00
Alexey Shekhirin
27c7ab6e00
feat(index): store fields distribution in index 2021-04-01 18:35:19 +03:00
tamo
a2f46029c7
implement a first version of the stop_words
The front must provide a BTreeSet containing the stop words
The stop_words are set at None if an empty Set is provided
add the stop-words in the http-ui interface

Use maplit in the test
and remove all the useless drop(rtxn) at the end of all tests
2021-04-01 13:57:55 +02:00
Alexey Shekhirin
9205b640a4
feat(index): introduce fields_ids_distribution 2021-03-31 18:44:47 +03:00
mpostma
f0210453a6
add updated at on put primary key 2021-03-15 14:05:48 +01:00
mpostma
615fe095e1
update index updated at on index writes 2021-03-15 14:05:47 +01:00
mpostma
80d0f9c49d
methods to update index time metadata 2021-03-15 14:05:47 +01:00
Kerollmops
f51eb46c69
Use the RoaringBitmapLenCodec to retrieve the count of documents 2021-03-09 10:25:39 +01:00
Kerollmops
c2ffcc4bd1
Return an heed error from the word_documents_count method 2021-02-18 14:59:37 +01:00
Kerollmops
2f561c77f5
Introduce the word documents count method on the index 2021-02-18 14:35:14 +01:00
Kerollmops
8d710c5130
Introduce heed codecs to retrieve the length of roaring bitmaps 2021-02-18 14:30:47 +01:00
Kerollmops
9b03b0a1b2
Introduce the word prefix pair proximity docids database 2021-02-17 11:12:38 +01:00
Clément Renault
b3a21d5a50
Introduce the getters and setters for the words prefixes FST 2021-02-17 10:45:17 +01:00
Clément Renault
e8639517da
Change the project to become a workspace with milli as a default-member 2021-02-12 16:15:09 +01:00