4536: Limit concurrent search requests r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4489

## What does this PR do?
- Adds a « search queue » that limits the number of search requests we can process at the same time and stores search requests to be processed
- Process only one search request per core/thread (we use available_parallelism)
- When the search queue is full, new search requests replace old ones **randomly**. The reason is that:
  - If we serve the oldest one first, like Typesense, we give the worst performances to everyone
  - If we serve the latest one, it gets too easy to DoS us (you just need to fill the queue with as many search requests as we can process simultaneously to ensure no other request will ever be processed)
  - By picking the search request randomly, we give a chance to recent search requests to be processed while ensuring that we can't be owned unless they fill our queue entirely and we start returning errors 5xx
- Adds an experimental parameter to control the size of the queue
- Adds a bunch of tests to ensure the search queue works correctly
- Ensure the loop consuming the search queue is running in the health route and crashes if it’s not the case

Co-authored-by: Tamo <tamo@meilisearch.com>
This commit is contained in:
meili-bors[bot] 2024-03-28 15:01:52 +00:00 committed by GitHub
commit fa9748cc99
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
12 changed files with 375 additions and 2 deletions

View file

@ -23,6 +23,7 @@ use crate::search::{
DEFAULT_CROP_LENGTH, DEFAULT_CROP_MARKER, DEFAULT_HIGHLIGHT_POST_TAG,
DEFAULT_HIGHLIGHT_PRE_TAG, DEFAULT_SEARCH_LIMIT, DEFAULT_SEARCH_OFFSET, DEFAULT_SEMANTIC_RATIO,
};
use crate::search_queue::SearchQueue;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(
@ -182,6 +183,7 @@ fn fix_sort_query_parameters(sort_query: &str) -> Vec<String> {
pub async fn search_with_url_query(
index_scheduler: GuardedData<ActionPolicy<{ actions::SEARCH }>, Data<IndexScheduler>>,
search_queue: web::Data<SearchQueue>,
index_uid: web::Path<String>,
params: AwebQueryParameter<SearchQueryGet, DeserrQueryParamError>,
req: HttpRequest,
@ -204,6 +206,7 @@ pub async fn search_with_url_query(
let distribution = embed(&mut query, index_scheduler.get_ref(), &index)?;
let _permit = search_queue.try_get_search_permit().await?;
let search_result =
tokio::task::spawn_blocking(move || perform_search(&index, query, features, distribution))
.await?;
@ -220,6 +223,7 @@ pub async fn search_with_url_query(
pub async fn search_with_post(
index_scheduler: GuardedData<ActionPolicy<{ actions::SEARCH }>, Data<IndexScheduler>>,
search_queue: web::Data<SearchQueue>,
index_uid: web::Path<String>,
params: AwebJson<SearchQuery, DeserrJsonError>,
req: HttpRequest,
@ -243,6 +247,7 @@ pub async fn search_with_post(
let distribution = embed(&mut query, index_scheduler.get_ref(), &index)?;
let _permit = search_queue.try_get_search_permit().await?;
let search_result =
tokio::task::spawn_blocking(move || perform_search(&index, query, features, distribution))
.await?;