10951 Commits

Author SHA1 Message Date
Tamo
b63c64395d
add a test ensuring the index-scheduler version is set when we cannot write the version file 2025-02-05 18:08:50 +01:00
Tamo
628119e31e
fix the potential corruption in the dumpless upgrade when upgrading from v1.12 2025-02-05 18:08:50 +01:00
meili-bors[bot]
78867b6852
Merge #5299
5299: Remote federated search r=dureuill a=dureuill

Fixes #4980 

- Usage: https://www.notion.so/meilisearch/API-usage-Remote-search-request-f64fae093abf409e9434c9b9c8fab6f3?pvs=25#1894b06b651f809a9f3dcc6b7189646e

- Changes database format:
  - Adds a new database key: the code is resilient to the case where the key is missing
  - Adds a new experimental feature: the code for experimental features is resilient to this case

Changes:

- Add experimental feature `proxySearch`
- Add network routes
- Dump support for network
- Add proxy search
- Add various tests

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
v1.13.0-rc.1
2025-02-05 16:08:48 +00:00
Louis Dureuil
b21b8e8f30
Remote search tests 2025-02-05 15:03:33 +01:00
Louis Dureuil
4a9e5ae215
mv multi.rs -> multi/mod.rs 2025-02-05 15:03:33 +01:00
Louis Dureuil
6e1865b75b
network integration tests 2025-02-05 15:03:32 +01:00
Louis Dureuil
64409a1de7
Test server: clear_api_key 2025-02-05 15:03:32 +01:00
Louis Dureuil
1b81cab782
Add more analytics 2025-02-05 15:03:32 +01:00
Louis Dureuil
88190b5602
Fix tests 2025-02-05 15:03:32 +01:00
Louis Dureuil
0b27aa5138
Multi search reads header to know if it is being proxied 2025-02-05 15:03:32 +01:00
Louis Dureuil
35160788d7
Proxy search requests 2025-02-05 15:03:32 +01:00
Louis Dureuil
c3e5c3ba36
Allow rebuilding a SearchQueryWithIndex from its components 2025-02-05 15:03:16 +01:00
Louis Dureuil
04ac0af54b
Add WeightedScoreValues to be able to compare remote scores 2025-02-05 15:03:16 +01:00
Louis Dureuil
9996533364
Make search types serialize and deserialize so that reading from a proxy is possible 2025-02-05 15:03:16 +01:00
Louis Dureuil
3f6b334fc5
Route network 2025-02-05 15:03:16 +01:00
Louis Dureuil
b30e5a7a35
Add new permissions 2025-02-05 15:03:16 +01:00
Louis Dureuil
6d79cb23ba
New error codes 2025-02-05 15:03:16 +01:00
Louis Dureuil
e34afca6d7
Support network in dumps 2025-02-05 15:03:16 +01:00
Louis Dureuil
4918b9ffb6
Network stored in DB 2025-02-05 15:03:15 +01:00
Louis Dureuil
73474e7af0
Network types 2025-02-05 15:03:15 +01:00
Louis Dureuil
7ae6dda03f
Add new experimental feature 2025-02-05 15:01:04 +01:00
meili-bors[bot]
00e764b0d3
Merge #5314
5314: Activate used database size r=irevoire a=ManyTheFish

# Pull Request

Make the `/stats` route return the `usedDatabaseSize`, which is the size used to store the "real" data in the database rather than the disk size used by LMDB.


Co-authored-by: ManyTheFish <many@meilisearch.com>
2025-02-05 12:51:57 +00:00
ManyTheFish
4abf0db0b4
Activate used database size 2025-02-05 13:45:47 +01:00
meili-bors[bot]
acc885fd0a
Merge #5312
5312: Send the OSS analytics once per day instead of once per hour r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5311

## What does this PR do?
- If the instance is OSS => we send the analytics once every day
- If the instance is on the meilisearch cloud => we send the analytics every hour
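
The rule above can be sketched as a small Rust helper. This is only an illustration: the function name `analytics_interval` and the `is_cloud` flag are hypothetical and are not taken from the Meilisearch code base.

```rust
use std::time::Duration;

/// Hypothetical helper (not the actual Meilisearch code): returns how long
/// to wait between two analytics reports.
fn analytics_interval(is_cloud: bool) -> Duration {
    if is_cloud {
        // Cloud instances keep reporting every hour.
        Duration::from_secs(60 * 60)
    } else {
        // OSS instances now report once every day.
        Duration::from_secs(24 * 60 * 60)
    }
}
```
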


Co-authored-by: Tamo <tamo@meilisearch.com>
2025-02-05 11:15:34 +00:00
Tamo
61e8cfd4bc
Send the OSS analytics once per day instead of once per hour 2025-02-04 15:39:00 +01:00
meili-bors[bot]
796acd1aee
Merge #5288
5288: Improve AI logging r=dureuill a=Kerollmops

This PR fixes #5285 and brings the changes from #5233 to simplify debugging indexation and search performance issues related to AI. The following texts can be found in the logs to debug and understand performance issues:

 - `embed_one: search` represents the time we spent waiting for the embedding generation, i.e., OpenAI, local HuggingFace, Ollama.
 - `filtered_universe: search::universe` is the time spent filtering the documents.
 - ~`next_bucket: search::vector_sort` is the time spent finding the nearest neighbors (ANNs) in the vector store (arroy), locally~ was being triggered too many times.
 - `indexing::vectors` is the time arroy spends indexing the new vectors for a batch.
 - `documents::extract vectors` and `documents::merge vectors` to see the time spent generating and writing the embeddings.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-04 10:20:45 +00:00
Kerollmops
cc8df5e11f
Move back the search-side logging to tracing 2025-02-04 11:16:17 +01:00
meili-bors[bot]
ede74ccc42
Merge #5306
5306: Fix internal error when passing `documentTemplateMaxBytes` to a source that doesn't support it r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #5305 

## What does this PR do?
- add `DOCUMENT_TEMPLATE_MAX_BYTES` to `allowed_sources_for_field` and `allowed_fields_for_source` to prevent a panic


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-02-04 08:46:13 +00:00
meili-bors[bot]
6425451bbc
Merge #5303
5303: Bring back changes from v1.12.8 into v1.13.0 r=Kerollmops a=Kerollmops

Fixes #5087 and other problems that you can find in the original PR #5294.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-03 10:49:26 +00:00
meili-bors[bot]
fe46855462
Merge #5235
5235: Introduce a compaction subcommand in meilitool r=dureuill a=Kerollmops

This PR proposes a change to the meilitool helper, introducing the `compact-index` subcommand to reduce the size of the indexes.

While working on this tool, I discovered that the current heed `Env::copy_to_file` API is not very temp file friendly and [could be improved](https://github.com/meilisearch/heed/issues/306).

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2025-02-03 10:11:01 +00:00
Kerollmops
8e7d2d25f2
Only open indexes, do not create them 2025-02-03 10:50:38 +01:00
Louis Dureuil
a436534515
Fix test 2025-02-03 10:36:34 +01:00
Kerollmops
2385842537
Fix the imports 2025-02-03 10:29:09 +01:00
Kerollmops
6a70c0ec92
Add a link to the experimental feature GitHub discussion 2025-02-03 10:24:53 +01:00
Kerollmops
7a9382b115
Better document the rayon limitation condition 2025-02-03 10:24:53 +01:00
Kerollmops
62dabeba5f
Do not create too many rayon tasks when processing the settings 2025-02-03 10:24:52 +01:00
Kerollmops
48812229a9
Remove a log that would log too much 2025-02-03 10:24:52 +01:00
Kerollmops
915cc377fb
Refine the env variable and the max readers 2025-02-03 10:24:52 +01:00
Louis Dureuil
96544bfa43
add DOCUMENT_TEMPLATE_MAX_BYTES to allowed_sources_for_field and allowed_fields_for_source 2025-02-03 09:59:17 +01:00
meili-bors[bot]
09d474da63
Merge #5140
5140: Fix workload inversion r=dureuill a=ManyTheFish

The used assets were inverted between `workloads/hackernews-modify-facet-numbers.json`
and `workloads/hackernews-modify-facet-strings.json`, now fixed.


Co-authored-by: ManyTheFish <many@meilisearch.com>
2025-02-03 08:22:22 +00:00
Kerollmops
aaefbfae1f
Do not create too many rayon tasks 2025-01-30 16:36:12 +01:00
Kerollmops
97e17f52a1
Add more logs to see calls to the embedders 2025-01-30 16:36:12 +01:00
Kerollmops
62ced0e3f1
Make cargo fmt happy 2025-01-30 11:09:54 +01:00
Clément Renault
71bb24f17e
Throw an error when the index is not found
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-01-30 11:07:43 +01:00
Clément Renault
c72f114b33
Fix English in the comments
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-01-30 11:07:09 +01:00
meili-bors[bot]
8ed39f5de0
Merge #5300
5300: Improve unexpected panic message r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5273

## What does this PR do?
- When an unexpected panic happens in the index-scheduler, we catch it and rebuild an error message from the join_error
- Same when the index-scheduler upgrade fails
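
A minimal sketch of the general technique, assuming a plain `std::thread` rather than whatever task runtime the index-scheduler actually uses. The names `panic_message` and `run_catching_panic` are hypothetical; only the standard library is involved.

```rust
use std::any::Any;
use std::thread;

/// Extracts a readable message from a panic payload.
/// `panic!` with a literal yields a `&str`, with format arguments a `String`.
fn panic_message(payload: Box<dyn Any + Send>) -> String {
    if let Some(s) = payload.downcast_ref::<&str>() {
        s.to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "unknown panic payload".to_string()
    }
}

/// Hypothetical wrapper: runs `task` on its own thread and turns a panic
/// into an error message instead of letting it go unreported.
fn run_catching_panic<F>(task: F) -> Result<(), String>
where
    F: FnOnce() + Send + 'static,
{
    thread::spawn(task)
        .join()
        .map_err(|payload| format!("unexpected panic: {}", panic_message(payload)))
}
```

The join error carries the panic payload, so the caller can report what went wrong rather than a generic failure.
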


Co-authored-by: Tamo <tamo@meilisearch.com>
2025-01-30 09:23:17 +00:00
Kerollmops
424c5bde40
Move the embedding computation and extraction log to debug 2025-01-29 16:40:36 +01:00
Tamo
bdd3005d10
Log the progress when a batch fails 2025-01-29 16:36:23 +01:00
meili-bors[bot]
4224edea28
Merge #5177
5177: Debug log the channel congestion r=Kerollmops a=Kerollmops

This PR displays the congestion of the BBQueue channel and the allocated memory for the channel and the extraction. This information can be beneficial for debugging and noticing slow disks. We show three pieces of information in debug:
- The direct attempts: the number of tries to send something in the BBQueue channel,
- The blocked attempts: the number of unsuccessful attempts that must be retried,
- The congestion: the percentage of blocked attempts. The higher it is, the slower the receiver and, therefore, the disk.
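
As an illustration of the third metric, here is a small sketch of the computation. It assumes the percentage is taken over the number of send attempts; the function name `congestion_percent` and that choice of denominator are assumptions, not the PR's actual code.

```rust
/// Hypothetical helper: congestion as a percentage of send attempts that
/// were blocked. Returns `None` when nothing was attempted, to avoid
/// dividing by zero.
fn congestion_percent(attempts: u64, blocked_attempts: u64) -> Option<f64> {
    if attempts == 0 {
        return None;
    }
    Some(blocked_attempts as f64 / attempts as f64 * 100.0)
}
```

For example, 50 blocked attempts out of 200 gives a congestion of 25%.
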

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2025-01-29 15:35:31 +00:00
Kerollmops
cb1b7513af
Log the memory metrics only once 2025-01-29 15:21:52 +01:00