MeiliSearch

mirror of https://github.com/meilisearch/MeiliSearch synced 2025-07-15 13:58:36 +02:00

Author	SHA1	Message	Date
Kerollmops	dc2b63abdf	Introduce an empty FilterCondition variant to support unknown fields	2021-07-27 16:34:04 +02:00
Kerollmops	b12738cfe9	Use the right DB prefixes to store the faceted fields	2021-07-22 19:18:22 +02:00
Kerollmops	7aa6cc9b04	Do not insert fields in the map when changing the settings	2021-07-22 18:40:12 +02:00
Kerollmops	aa02a7fdd8	Add a test to check that we indeed impact the relevancy	2021-07-22 17:04:38 +02:00
bors[bot]	77de82aaa4	Merge #254 254: Improve the facet string distribution speed r=Kerollmops a=Kerollmops This pull request creates a data structure similar to the one we use for the faceted numbers, a tetratomic decision tree, but this time for the facet strings. This PR also changes the facet distribution behavior by returning one of the original facet values, fixes #260. This data structure defines bucket-like structures where document ids are stored under their facet value. It helps the search decide whether to move to a lower level under a given bucket, depending on whether the current bucket contains interesting documents. The whole format, algorithm, and previous attempts are explained in the [`facet_string.rs` file](`ec1cfdd42b/milli/src/search/facet/facet_string.rs`). Note that this data structure could be used to sort by string lexicographically; that is hypothetically possible. We need more testing, in terms of performance and quality, as we will sort on lowercased versions of the facet values. - [x] Implement a faster and more precise way to fetch the facet distribution. - [x] Store and return the original facet string value. We currently return the lowercased version. Co-authored-by: Kerollmops <clement@meilisearch.com> Co-authored-by: Clément Renault <clement@meilisearch.com>	2021-07-21 15:34:40 +00:00
Clément Renault	0227254a65	Return the original string values for the inverted facet index database	2021-07-21 16:59:39 +02:00
Kerollmops	03a01166ba	Display the original facet string value from the linear facet database	2021-07-21 16:59:39 +02:00
Clément Renault	d23c250ad5	Fix a bound error in the facet string range construction	2021-07-21 16:59:39 +02:00
Clément Renault	081278dfd6	Use the facet string levels when computing the facet distribution	2021-07-21 16:59:39 +02:00
Clément Renault	5676b204dd	Fix the facet string levels codecs	2021-07-21 16:59:38 +02:00
Kerollmops	8c86348119	Indexing the facet strings levels	2021-07-21 16:59:38 +02:00
Kerollmops	a7ae552ba7	Fix the FacetStringLevelZeroRange range when unbounded	2021-07-21 16:59:38 +02:00
Kerollmops	757b2b502a	Remove the FacetValueStringCodec	2021-07-21 16:59:38 +02:00
Kerollmops	adfd4da24c	Introduce the FacetStringIter iterator	2021-07-21 16:59:38 +02:00
Kerollmops	a79661c6dc	Introduce a lot of facet string helper iterators	2021-07-21 16:59:38 +02:00
Kerollmops	851f979039	Describe the way we want to group the facet strings	2021-07-21 16:59:38 +02:00
Kerollmops	f858f64b1f	Move the facet number iterators into their own module	2021-07-21 16:59:37 +02:00
bors[bot]	fa44e95c91	Merge #290 290: Add a $HOME to the CI r=Kerollmops a=irevoire This should fix this issue: https://github.com/meilisearch/milli/runs/3104228432?check_suite_focus=true I think a real fix would be to fix the configuration of our github runner but I don't know how to do it. @curquiza could probably help us on that once she's back from vacation 😄 Co-authored-by: Tamo <tamo@meilisearch.com>	2021-07-20 07:32:46 +00:00
Tamo	0ab541627b	add a $HOME var to the ci	2021-07-19 14:33:49 +02:00
bors[bot]	16698f714b	Merge #287 287: Add benchmarks for indexing r=Kerollmops a=irevoire closes #274 I don't really know how much time this will take on our bench machine. I'm afraid the wiki dataset will take a really long time to bench (it takes 1h30 on my computer). If you are ok with it, I would like to merge this first PR since it introduces a first set of benchmarks and see how much time it takes in reality on our setup. Co-authored-by: Tamo <tamo@meilisearch.com>	2021-07-07 15:41:15 +00:00
Tamo	931021fe57	add benchmarks for indexing	2021-07-07 13:09:05 +02:00
bors[bot]	4c9531bdf3	Merge #285 285: Support documents with at most 65536 fields r=Kerollmops a=Kerollmops Fixes #248. In this PR I updated the `obkv` crate; it now supports arbitrary key lengths, so I was able to use a `u16` to represent the fields instead of a single byte. It was impressively easy to update the whole codebase 🍡 🍔 Co-authored-by: Kerollmops <clement@meilisearch.com>	2021-07-06 16:44:51 +00:00
Kerollmops	0a78107525	Fix the infos crate to make it read u16 field ids	2021-07-06 11:58:03 +02:00
Kerollmops	a9553af635	Add a test to check that we can index more than 256 fields	2021-07-06 11:58:03 +02:00
Kerollmops	838ed1cd32	Use an u16 field id instead of one byte	2021-07-06 11:58:03 +02:00
bors[bot]	cc54c41e30	Merge #283 283: Use the AlwaysFreePages flag when opening an index r=irevoire a=Kerollmops We introduced a new flag in our fork of LMDB: this `AlwaysFreePages` flag forces LMDB to always free the single pages it uses before writing to the disk instead of keeping them in a linked list. Declaring this flag reduces the memory footprint (leak) we observe after indexing a lot of documents. Fixes #279. Co-authored-by: Kerollmops <clement@meilisearch.com>	2021-07-05 16:59:16 +00:00
bors[bot]	63db43cc7a	Merge #284 284: [http-ui] Introduce the route `die` r=Kerollmops a=irevoire This route just `exit`s the process. This can come in handy when you run `http-ui` inside of another process (a profiler, for example) and you don't want to kill everything. Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: Irevoire <tamo@meilisearch.com>	2021-07-05 15:47:53 +00:00
Irevoire	4562b278a8	remove a warning and add a log Co-authored-by: Clément Renault <clement@meilisearch.com>	2021-07-05 17:46:02 +02:00
Tamo	a57e522a67	introduce a die route let the program exit itself alone	2021-07-05 17:38:10 +02:00
Kerollmops	91c5d0c042	Use the AlwaysFreePages flag when opening an index	2021-07-05 16:36:13 +02:00
bors[bot]	007fec21fc	Merge #281 281: Bump to v0.7.2 r=ManyTheFish a=Kerollmops Co-authored-by: Kerollmops <clement@meilisearch.com>	2021-07-05 09:00:26 +00:00
Kerollmops	a6b4069172	Bump to v0.7.2	2021-07-05 10:54:53 +02:00
bors[bot]	d7bc6a6999	Merge #280 280: Fix matching length in matching_words r=Kerollmops a=ManyTheFish related to https://github.com/meilisearch/MeiliSearch/issues/1441 Co-authored-by: many <maxime@meilisearch.com>	2021-07-01 18:50:46 +00:00
many	9f62149b94	Fix matching length in matching_words	2021-07-01 19:03:28 +02:00
bors[bot]	f25f454bd4	Merge #275 275: Fix the benchmarks dependencies r=Kerollmops a=irevoire Import exactly the same dependency as milli instead of a wildcard that can do anything Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: Irevoire <irevoire@protonmail.ch>	2021-07-01 11:07:01 +00:00
bors[bot]	885f243afc	Merge #276 276: Fix the fmt of the auto-generated file r=Kerollmops a=irevoire The file generated by the `build.rs` file of the benchmark was badly formatted and that was causing an issue with the git pre-commit hook I wrote [earlier](https://github.com/meilisearch/milli/blob/main/script/pre-commit) Co-authored-by: Tamo <tamo@meilisearch.com>	2021-07-01 10:24:36 +00:00
Irevoire	ec87bf3dd5	Update benchmarks/Cargo.toml Co-authored-by: Clément Renault <renault.cle@gmail.com>	2021-07-01 11:45:05 +02:00
Tamo	ef965aa3f3	fix the fmt of the auto-generated file	2021-07-01 11:43:09 +02:00
Tamo	fc09d77e89	fix the benchmarks dependencies	2021-07-01 11:38:30 +02:00
bors[bot]	056180e6c8	Merge #273 273: Update tokenizer version to v0.2.3 r=Kerollmops a=curquiza Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>	2021-07-01 09:02:16 +00:00
Clémentine Urquizar	3c149d8a43	Update tokenizer version to v0.2.3	2021-06-30 18:41:35 +02:00
bors[bot]	b4dcdbf00d	Merge #269 #271 269: Fix bug when inserting previously deleted documents r=Kerollmops a=Kerollmops This PR fixes #268. The issue was in the `ExternalDocumentsIds` implementation in the specific case where an external document id was in the soft map and marked as deleted. The bug was due to a wrong assumption on my side about how the FST unions were returning the `IndexedValue`s: I thought the values returned in an array were in the same order as the FSTs given to the `OpBuilder`, but in fact [the `IndexedValue`'s `index` field is there to indicate which FST the values come from](https://docs.rs/fst/0.4.7/fst/map/struct.IndexedValue.html). 271: Remove the roaring operation functions warnings r=Kerollmops a=Kerollmops In this PR we are just replacing the usages of the roaring operation functions with the new operators. This removes a lot of warnings. Co-authored-by: Kerollmops <clement@meilisearch.com>	2021-06-30 12:34:55 +00:00
Kerollmops	32b7bd366f	Remove the roaring operation functions warnings	2021-06-30 14:12:56 +02:00
bors[bot]	00e2845f0f	Merge #270 270: Update milli version to v0.7.1 r=Kerollmops a=curquiza Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>	2021-06-30 12:12:24 +00:00
Kerollmops	c92ef54466	Add a test for when we insert a previously deleted document	2021-06-30 14:00:01 +02:00
Kerollmops	28782ff99d	Fix ExternalDocumentsIds struct when inserting previously deleted ids	2021-06-30 14:00:01 +02:00
Clémentine Urquizar	b489515f4d	Update milli version to v0.7.1	2021-06-30 13:52:46 +02:00
Kerollmops	54889813ce	Implement some debug functions on the ExternalDocumentsIds struct	2021-06-30 11:29:41 +02:00
Kerollmops	4bce66d5ff	Make the Index::delete_* method private	2021-06-30 10:07:31 +02:00
bors[bot]	66e6ea56b8	Merge #267 267: Highlighting r=Kerollmops a=irevoire closes #262 I basically rewrote a part of the damerau-levenshtein function we were using for the highlighting to accept at most two errors from the user and stop on the third mistake. Also, now it supports utf-8, so it should fix our issue. Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: Irevoire <irevoire@protonmail.ch>	2021-06-30 05:43:50 +00:00
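
The idea described in #267 (accept at most two errors, stop on the third, and compare Unicode characters rather than bytes) can be illustrated with a small sketch. This is not the code from milli: the function name `bounded_distance` and its signature are hypothetical, and the algorithm shown is the optimal string alignment variant of Damerau-Levenshtein, chosen here only because it is short.

```rust
/// Hypothetical helper, not milli's implementation.
/// Computes the optimal string alignment distance (a restricted
/// Damerau-Levenshtein) between `query` and `word`, comparing `char`s so
/// multi-byte UTF-8 characters count as one unit.
/// Returns `Some(distance)` when it is at most `max_errors`, else `None`.
pub fn bounded_distance(query: &str, word: &str, max_errors: usize) -> Option<usize> {
    let a: Vec<char> = query.chars().collect();
    let b: Vec<char> = word.chars().collect();

    // The length difference is a lower bound on the distance.
    if a.len().abs_diff(b.len()) > max_errors {
        return None;
    }

    // Three rolling rows: i-2 (for transpositions), i-1, and i.
    let mut prev2: Vec<usize> = vec![0; b.len() + 1];
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr: Vec<usize> = vec![0; b.len() + 1];

    for i in 1..=a.len() {
        curr[0] = i;
        let mut row_min = curr[0];
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            let mut d = (prev[j] + 1) // deletion
                .min(curr[j - 1] + 1) // insertion
                .min(prev[j - 1] + cost); // substitution or match
            // Transposition of two adjacent characters.
            if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1] {
                d = d.min(prev2[j - 2] + 1);
            }
            curr[j] = d;
            row_min = row_min.min(d);
        }
        // Row minima never decrease, so once every cell exceeds the
        // budget the final distance must exceed it too: stop early.
        if row_min > max_errors {
            return None;
        }
        // Rotate rows: prev2 <- prev, prev <- curr, curr becomes scratch.
        std::mem::swap(&mut prev2, &mut prev);
        std::mem::swap(&mut prev, &mut curr);
    }

    let d = prev[b.len()];
    if d <= max_errors { Some(d) } else { None }
}
```

With a budget of two errors, `bounded_distance("hello", "hlelo", 2)` counts the swapped letters as a single transposition, while a word that needs three or more edits is rejected without filling the rest of the table.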


1033 commits