MeiliSearch

mirror of https://github.com/meilisearch/MeiliSearch synced 2025-07-04 20:37:15 +02:00

Author	SHA1	Message	Date
Kerollmops	a751972c57	Prefer using a stable rather than a random hash builder	2024-12-10 14:25:53 +01:00
Kerollmops	6b269795d2	Update bumparaw-collections to 0.1.2	2024-12-10 14:25:13 +01:00
Kerollmops	89637bcaaf	Use bumparaw-collections in Meilisearch/milli	2024-12-10 11:52:20 +01:00
meili-bors[bot]	1995040846	Merge #5142 5142: Try merge optimisation r=dureuill a=ManyTheFish ![Capture_decran_2024-12-09_a_11 59 42](https://github.com/user-attachments/assets/0dfc7e30-a603-4546-98d2-791990bdfcce) Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-12-09 14:48:26 +00:00
ManyTheFish	07f42e8057	Do not index a field count when no word is counted	2024-12-09 15:45:12 +01:00
ManyTheFish	71f59749dc	Reduce union impact in merging	2024-12-09 15:44:06 +01:00
meili-bors[bot]	3b0b9967f6	Merge #5141 5141: Use the right amount of max memory and not impact the settings r=curquiza a=Kerollmops Fixes #5132. Related to #5125. Co-authored-by: Kerollmops <clement@meilisearch.com>	2024-12-09 10:40:46 +00:00
meili-bors[bot]	123b54a178	Merge #5056 5056: Attach index name in error message r=irevoire a=airycanon # Pull Request ## Related issue Fixes #4392 ## What does this PR do? - ... Co-authored-by: airycanon <airycanon@airycanon.me>	2024-12-09 09:59:12 +00:00
Kerollmops	f5dd8dfc3e	Rollback max memory usage changes	2024-12-09 10:26:30 +01:00
Kerollmops	bcfed70888	Revert "Merge #5125 " This reverts commit `9a9383643f`, reversing changes made to `cac355bfa7`.	2024-12-09 10:08:02 +01:00
meili-bors[bot]	503ef3bbc9	Merge #5138 5138: Allow xtask bench to proceed without a commit message r=Kerollmops a=dureuill Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-12-09 09:00:12 +00:00
Louis Dureuil	08f2c696b0	Allow xtask bench to proceed without a commit message	2024-12-09 09:36:59 +01:00
airycanon	b75f1f4c17	fix tests	2024-12-06 02:03:02 +08:00
airycanon	95ed079761	attach index name in errors	2024-12-06 01:12:13 +08:00
meili-bors[bot]	4a082683df	Merge #5131 5131: Ignore documents whose selected fields didn't change r=dureuill a=dureuill Attempts to improve the new indexer performance by ignoring documents whose selected fields didn't change: - Add `Update::has_changed_for_fields` function - Ignore documents whose searchable attributes didn't change for word docids and word pair proximity extraction - Ignore documents whose faceted attributes didn't change for facet extraction Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-12-05 16:04:16 +00:00
meili-bors[bot]	26be5e0733	Merge #5123 5123: Fix batch details r=dureuill a=irevoire # Pull Request ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/5079 Fixes https://github.com/meilisearch/meilisearch/issues/5112 ## What does this PR do? - Make the processing tasks actually processing in the stats of the batch instead of enqueued - Stop counting one extra task for all non-prioritized batches in the stats - Add a test Co-authored-by: Tamo <tamo@meilisearch.com>	2024-12-05 15:21:55 +00:00
Louis Dureuil	bd5110a2fe	Fix clippy warnings	2024-12-05 16:13:07 +01:00
Louis Dureuil	fa8b9acdf6	Ignore documents that didn't change in facets	2024-12-05 16:12:52 +01:00
Louis Dureuil	2b74d1824b	Ignore documents that didn't change any field in word pair proximity	2024-12-05 15:56:22 +01:00
Louis Dureuil	c77b00d3ac	Don't extract word docids when no searchable changed	2024-12-05 15:51:58 +01:00
Louis Dureuil	c77073efcc	Update::has_changed_for_fields	2024-12-05 15:50:12 +01:00
meili-bors[bot]	1537323eb9	Merge #5119 5119: Settings opt out error msg r=Kerollmops a=ManyTheFish # Pull Request ## Related issue PRD: https://meilisearch.notion.site/API-usage-Settings-to-opt-out-indexing-features-fff4b06b651f8108ade3f858aeb16b14?pvs=4 ## What does this PR do? Add a new error code and message when the user tries a facet search on an index where the facet search is disabled: ```json { "message": "The facet search is disabled for this index", "code": "facet_search_disabled", "type": "invalid_request", "link": "https://docs.meilisearch.com/errors#invalid_facet_search_disabled" } ``` Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-12-05 13:51:11 +00:00
ManyTheFish	a0a3b55700	Change error code	2024-12-05 14:48:29 +01:00
Tamo	214b51de87	try to fix the snapshot on demand flaky test	2024-12-05 14:45:54 +01:00
Tamo	95975944d7	fix the dumps missing the empty swap index tasks	2024-12-05 14:23:38 +01:00
meili-bors[bot]	9a9383643f	Merge #5125 5125: Change the default max memory usage to 5% of the total memory r=ManyTheFish a=Kerollmops After thorough testing, we found that giving 5% of the total available memory to allocate resident memory (caches and channels) is the best approach. The main reason is that the new indexer is highly memory-map oriented, with LMDB, and reads the database while performing the indexation. So, by allowing the maximum amount of memory available to LMDB and the OS, it will perform the key-value store reads and all other indexation operations faster by keeping more pages hot in the cache. In #5124, we also sorted the entries to merge to improve the read speed of LMDB. This is common in database management systems: Reading stuff on the disk is much faster when done in lexicographic order (the default sorted order of key values). The entries have a great chance of already being in the OS memory cache, as they were loaded in a previous read, and reading stuff on the disk is very slow compared to reading memory. Co-authored-by: Kerollmops <clement@meilisearch.com>	2024-12-05 10:11:25 +00:00
meili-bors[bot]	cac355bfa7	Merge #5124 5124: Optimize Prefixes and Merges r=ManyTheFish a=Kerollmops In this PR, we plan to optimize the reads of LMDB to read the entries in lexicographic order and better use the memory-mapping OS cache: - Optimize the prefix generation for word position docids (`@manythefish`) - Optimize the parallel merging of the caches to sort entries before merging the caches (`@kerollmops`) ## Benchmarks on 1cpu 2gb gpo3 (5k IOps) Before on the tag meilisearch-v1.12.0-rc.3. ``` word_position_docids:merge_and_send_docids: 988s compute_word_fst: 23.3s word_pair_proximity_docids:merge_and_send_docids: 428s compute_word_prefix_fid_docids:recompute_modified_prefixes: 76.3s compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 429s ``` After sorting the whole `HashMap`s in a `Vec` on this branch. ``` word_position_docids:merge_and_send_docids: 202s compute_word_fst: 20.4s word_pair_proximity_docids:merge_and_send_docids: 427s compute_word_prefix_fid_docids:recompute_modified_prefixes: 65.5s compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 62.5s ``` Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: Kerollmops <clement@meilisearch.com>	2024-12-05 09:35:52 +00:00
Kerollmops	9020a50df8	Change the default max memory usage to 5% of the total memory	2024-12-05 10:14:46 +01:00
Kerollmops	52843123d4	Clean up and remove the non-sorted merge_caches function	2024-12-05 10:03:05 +01:00
meili-bors[bot]	6298db5bea	Merge #5113 5113: Fix the Minimum BBQueue channel threshold r=Kerollmops a=Kerollmops Co-authored-by: Kerollmops <clement@meilisearch.com> Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-12-05 09:01:02 +00:00
meili-bors[bot]	a003a0934a	Merge #5121 5121: Make the tasks pulling timeout configurable r=dureuill a=Kerollmops Co-authored-by: Kerollmops <clement@meilisearch.com>	2024-12-04 17:04:14 +00:00
Louis Dureuil	3a11e39c01	Force max_memory to a min of 100MiB	2024-12-04 17:53:30 +01:00
Louis Dureuil	5f896b1050	Fix geo when spilling	2024-12-04 17:51:12 +01:00
Kerollmops	d0c4e6da6b	Make clippy happy	2024-12-04 17:39:10 +01:00
Kerollmops	2da5584bb5	Make the tasks pulling timeout configurable	2024-12-04 17:39:07 +01:00
meili-bors[bot]	b7eb802ae6	Merge #5120 5120: Add cross tasks r=Kerollmops a=ManyTheFish Add 4 xtask bench workloads: - `hackernews-add-new-documents`: adds new documents on a db already containing documents - `hackernews-modify-facet-numbers`: modify filterable fields containing numbers of documents on a db already containing documents - `hackernews-modify-facet-strings`: modify filterable fields containing strings of documents on a db already containing documents - `hackernews-modify-searchables`: modify searchable fields of documents on a db already containing documents Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-12-04 16:16:57 +00:00
Kerollmops	2e32d0474c	Lexicographically sort all the map to merge	2024-12-04 17:05:11 +01:00
Kerollmops	cb99ac6f7e	Consume vec instead of draining	2024-12-04 17:00:22 +01:00
Kerollmops	be411435f5	Use the merge_caches_alt function in the docids merging	2024-12-04 16:37:29 +01:00
Kerollmops	29ef164530	Introduce a new semi ordered merge function	2024-12-04 16:33:35 +01:00
ManyTheFish	739c52a3cd	Replace HashSets by BTreeSets for the prefixes	2024-12-04 16:16:48 +01:00
Tamo	7a2af06b1e	update the impacted snapshots	2024-12-04 15:52:24 +01:00
Tamo	cb0c3a5aad	stop adding one enqueued tasks to all unprioritized batches	2024-12-04 15:48:28 +01:00
ManyTheFish	8388698993	Fix dat hash	2024-12-04 15:09:10 +01:00
Tamo	cbcf6c9ba3	make the processing tasks as processing in a batch	2024-12-04 14:48:48 +01:00
Tamo	bf742d81cf	add a test	2024-12-04 14:47:02 +01:00
ManyTheFish	7458f0386c	fix asset name	2024-12-04 14:44:57 +01:00
ManyTheFish	fc1df5793c	fix tests	2024-12-04 14:35:20 +01:00
meili-bors[bot]	3ded069042	Merge #5122 5122: Yield the BBQueue writing loop r=ManyTheFish a=Kerollmops We prefer yielding to let the writing thread do its job instead of spin looping. Co-authored-by: Kerollmops <clement@meilisearch.com>	2024-12-04 13:33:51 +00:00
Kerollmops	261d2ceb06	Yield the BBQueue writer instead of spin looping	2024-12-04 14:16:40 +01:00

10591 commits