5124: Optimize Prefixes and Merges r=ManyTheFish a=Kerollmops
In this PR, we optimize the LMDB reads by reading the entries in lexicographic order, to make better use of the memory-mapped OS cache:
- Optimize the prefix generation for the word position docids (`@ManyTheFish`)
- Optimize the parallel merging of the caches by sorting the entries before merging them (`@Kerollmops`)
## Benchmarks on 1 CPU, 2 GB, gp3 (5k IOPS)
Before, on the tag `meilisearch-v1.12.0-rc.3`:
```
word_position_docids:merge_and_send_docids: 988s
compute_word_fst: 23.3s
word_pair_proximity_docids:merge_and_send_docids: 428s
compute_word_prefix_fid_docids:recompute_modified_prefixes: 76.3s
compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 429s
```
After sorting the whole `HashMap`s into a `Vec` on this branch:
```
word_position_docids:merge_and_send_docids: 202s
compute_word_fst: 20.4s
word_pair_proximity_docids:merge_and_send_docids: 427s
compute_word_prefix_fid_docids:recompute_modified_prefixes: 65.5s
compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 62.5s
```
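For illustration, here is a minimal sketch of the sorting step described above, with hypothetical types (`Cache`, `sorted_entries`) that are not the actual milli names: flattening the per-thread caches into a `Vec` sorted by key lets the merge touch LMDB in lexicographic order, so the memory-mapped pages are walked sequentially.
```rust
use std::collections::HashMap;

/// Hypothetical per-thread cache: key bytes mapped to a docids payload.
type Cache = HashMap<Vec<u8>, Vec<u32>>;

/// Flatten the per-thread caches into a single `Vec` sorted by key so the
/// subsequent LMDB reads/writes happen in lexicographic order.
fn sorted_entries(caches: Vec<Cache>) -> Vec<(Vec<u8>, Vec<u32>)> {
    let mut entries: Vec<_> = caches.into_iter().flatten().collect();
    entries.sort_unstable_by(|(a, _), (b, _)| a.cmp(b));
    // A real merge would also combine duplicate keys coming from different
    // caches; this sketch only shows the sorting step.
    entries
}
```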
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
5120: Add cross tasks r=Kerollmops a=ManyTheFish
Add 4 xtask bench workloads:
- `hackernews-add-new-documents`: adds new documents to a DB that already contains documents
- `hackernews-modify-facet-numbers`: modifies number-valued filterable fields of documents in a DB that already contains documents
- `hackernews-modify-facet-strings`: modifies string-valued filterable fields of documents in a DB that already contains documents
- `hackernews-modify-searchables`: modifies searchable fields of documents in a DB that already contains documents
Co-authored-by: ManyTheFish <many@meilisearch.com>
5122: Yield the BBQueue writing loop r=ManyTheFish a=Kerollmops
We prefer yielding to let the writing thread do its job instead of spin-looping.
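As a rough sketch of the idea (not the actual writing loop), yielding hands the time slice back to the scheduler whenever there is nothing to do yet, instead of burning the core in a busy loop:
```rust
use std::thread;

/// Hypothetical polling helper: wait until `ready()` returns true, yielding
/// the CPU between checks instead of spin-looping.
fn wait_until(mut ready: impl FnMut() -> bool) {
    while !ready() {
        thread::yield_now(); // let the writing thread (or the extractors) run
    }
}
```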
Co-authored-by: Kerollmops <clement@meilisearch.com>
5110: Increase margin on deletion of task r=dureuill a=irevoire
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5077
## What does this PR do?
- Increase the margin we keep to enqueue task deletion
The issue was that we did not have enough space in the reserved memory to write both the batch and the deletion task we had just enqueued.
We could have fixed it only for this test, as it's not an issue in production where we have 10 GiB of margin, but I thought it wasn't a bad idea to increase our margin a bit since we're effectively writing more to LMDB.
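Purely as an illustration of the idea (the constant and names below are hypothetical, not the actual Meilisearch values), the change amounts to keeping a larger free-space margin so that the batch and the auto-enqueued task deletion both fit:
```rust
/// Hypothetical margin kept free in the tasks database (illustrative value).
const TASK_DELETION_MARGIN: u64 = 2 * 1024 * 1024 * 1024; // 2 GiB

/// Auto-enqueue a task deletion once the free space drops below the margin,
/// leaving enough room to write both the batch and the deletion task.
fn must_enqueue_task_deletion(free_space: u64) -> bool {
    free_space < TASK_DELETION_MARGIN
}
```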
Co-authored-by: Tamo <tamo@meilisearch.com>
5118: Change the reserve and grant function to accept a closure r=ManyTheFish a=Kerollmops
This simplifies the usage of the grant and commits it at the right time, just after having written to it.
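A hedged sketch of the closure-based shape (illustrative names and signatures, not the actual channel API): the caller writes through the closure, and the grant is committed immediately afterwards, so it can neither be forgotten nor committed before the write is complete.
```rust
/// Illustrative writer handle over the ring buffer (not the real type).
struct ChannelWriter;

impl ChannelWriter {
    /// Reserve `length` bytes, let the closure fill them, then commit.
    fn reserve_and_write<F>(&self, length: usize, write: F) -> Result<(), &'static str>
    where
        F: FnOnce(&mut [u8]) -> Result<(), &'static str>,
    {
        let mut grant = vec![0u8; length]; // stand-in for the reserved grant
        write(&mut grant)?;
        self.commit(&grant); // committed exactly once, right after writing
        Ok(())
    }

    fn commit(&self, _bytes: &[u8]) {
        // ...mark the bytes as readable by the consumer...
    }
}
```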
Co-authored-by: Kerollmops <clement@meilisearch.com>
5111: Update BBQueue repo to point to the Meilisearch org r=curquiza a=Kerollmops
This PR updates the milli dependencies to make BBQueue point to the Meilisearch org repo.
Co-authored-by: Clément Renault <clement@meilisearch.com>
5094: Implement a bbqueue channel between the extractors and the writer r=dureuill a=Kerollmops
This PR switches the communication between the extractors and the writer from a bounded crossbeam channel with allocated entries to a [BBQueue](https://github.com/jamesmunns/bbqueue)-based system: a Single-Producer Single-Consumer (SPSC) circular/ring buffer channel (a minimal usage sketch follows the checklist below).
- [x] Implement the BBQueue channel system...
- [x] with a crossbeam channel to wake up the receiver.
- [x] Manage the BBQueue allocated memory dynamically.
- [x] Support content that doesn't fit in the bbqueues.
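For readers unfamiliar with the crate, here is a minimal, self-contained example of the SPSC grant/commit/read/release flow, assuming the public `bbqueue` 0.5 API rather than Meilisearch's wrapper; in the PR, a crossbeam channel additionally wakes up the receiver instead of having it poll.
```rust
use bbqueue::BBBuffer;

// Statically allocated ring buffer shared by one producer and one consumer.
static QUEUE: BBBuffer<1024> = BBBuffer::new();

fn main() {
    let (mut producer, mut consumer) = QUEUE.try_split().unwrap();

    // Producer side (the extractors): reserve a grant, write into it, commit.
    let mut grant = producer.grant_exact(4).unwrap();
    grant.buf().copy_from_slice(&[0xDE, 0xAD, 0xBE, 0xEF]);
    grant.commit(4);

    // Consumer side (the writer): read the committed bytes, then release them.
    let read = consumer.read().unwrap();
    assert_eq!(read.buf(), &[0xDE, 0xAD, 0xBE, 0xEF][..]);
    let len = read.buf().len();
    read.release(len);
}
```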
Co-authored-by: Clément Renault <clement@meilisearch.com>
5109: Fix autobatch r=dureuill a=dureuill
Fixes most SDK tests and flaky failures
Changes:
- Make sure that the settings are not autobatched with document operations, as the new indexer no longer supports this operating mode
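A simplified stand-in for the rule (not the actual autobatcher code): settings tasks and document operations must never end up in the same batch.
```rust
/// Simplified task kinds, for illustration only.
enum Kind {
    DocumentOperation,
    Settings,
}

/// Whether a task of kind `next` may join a batch made of `current` tasks.
/// Mixing settings with document operations is rejected in both directions,
/// since the new indexer no longer supports that operating mode.
fn may_autobatch(current: Kind, next: Kind) -> bool {
    !matches!(
        (current, next),
        (Kind::DocumentOperation, Kind::Settings) | (Kind::Settings, Kind::DocumentOperation)
    )
}
```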
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
5107: While spamming the batches route we could see a processing batch becoming missing and then finished; this commit ensures batches go from processing to finished directly r=irevoire a=irevoire
# Pull Request
## Related issue
Fixes the failed tests from this PR: https://github.com/meilisearch/meilisearch-js/pull/1775
See [this message](https://meilisearch.slack.com/archives/CD7Q2UKGB/p1732784680450749) [private link] for more context
## What does this PR do?
- Ensure we never enter a state where a processing batch (which only exists in RAM) becomes « Not found » by removing the processing batches AFTER writing them to disk (see the sketch below)
- This should also theoretically avoid an issue where a task could go from processing to enqueued and then finished
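A hedged sketch of the ordering fix (illustrative names, not the actual index scheduler types): the in-memory processing set is only cleared after the batch has been persisted, so a concurrent read can never observe the batch as missing.
```rust
use std::collections::HashSet;

/// Illustrative scheduler state: processing batch uids live only in RAM.
struct Scheduler {
    processing: HashSet<u32>,
}

impl Scheduler {
    /// Persist the finished batch first, and only then drop it from the
    /// in-memory processing set, so readers always see the batch as either
    /// processing or finished, never as « Not found ».
    fn finish_batch(
        &mut self,
        batch_uid: u32,
        persist: impl FnOnce(u32) -> Result<(), &'static str>,
    ) -> Result<(), &'static str> {
        persist(batch_uid)?; // write and commit to disk (LMDB in practice)
        self.processing.remove(&batch_uid); // removed only AFTER the write
        Ok(())
    }
}
```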
Co-authored-by: Tamo <tamo@meilisearch.com>