849: Update nbHits count with filtered documents r=MarinPostma a=balajisivaraman

Closes #764 
close #1039

After discussing with @MarinPostma on Slack, this is my first attempt at implementing this for the basic flow that will go through `bucket_sort_with_distinct`.

A few thoughts here: 

- For getting the count of filtered documents alone, I originally thought of using `filter_map.values().filter(|&&v| !v).count()`. In a few cases, this was the same as what I have now implemented. But I realised I couldn't do something similar for `distinct`. So for being consistent, I have implemented both in a similar fashion.
- I also needed the `contains_key` check to ensure we're not counting the same document ID twice.

@MarinPostma also mentioned that this will be an approximation since the sort is lazy. In the test example that I've updated, the actual filtered count will be just 19 (for `male` records), but due to the `limit` in play, it returns 32 (filtering out 11 records overall).

Please let me know if this is the kind of fix we are looking for, and I can implement it in the placeholder search also.

Co-authored-by: Balaji Sivaraman <balaji@balajisivaraman.com>
This commit is contained in:
bors[bot] 2020-11-26 09:53:13 +00:00 committed by GitHub
commit f564a9ce51
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
4 changed files with 129 additions and 7 deletions

View file

@ -588,3 +588,48 @@ async fn placeholder_search_with_empty_query() {
assert_eq!(response["hits"].as_array().unwrap().len(), 3);
});
}
#[actix_rt::test]
async fn test_filter_nb_hits_search_placeholder() {
let mut server = common::Server::with_uid("test");
let body = json!({
"uid": "test",
"primaryKey": "id",
});
server.create_index(body).await;
let documents = json!([
{
"id": 1,
"content": "a",
"color": "green",
"size": 1,
},
{
"id": 2,
"content": "a",
"color": "green",
"size": 2,
},
{
"id": 3,
"content": "a",
"color": "blue",
"size": 3,
},
]);
server.add_or_update_multiple_documents(documents).await;
let (response, _) = server.search_post(json!({})).await;
assert_eq!(response["nbHits"], 3);
server.update_distinct_attribute(json!("color")).await;
let (response, _) = server.search_post(json!({})).await;
assert_eq!(response["nbHits"], 2);
let (response, _) = server.search_post(json!({"filters": "size < 3"})).await;
println!("result: {}", response);
assert_eq!(response["nbHits"], 1);
}

View file

@ -1829,3 +1829,51 @@ async fn update_documents_with_facet_distribution() {
let (response2, _) = server.search_post(search).await;
assert_json_eq!(expected_facet_distribution, response2["facetsDistribution"].clone());
}
#[actix_rt::test]
async fn test_filter_nb_hits_search_normal() {
let mut server = common::Server::with_uid("test");
let body = json!({
"uid": "test",
"primaryKey": "id",
});
server.create_index(body).await;
let documents = json!([
{
"id": 1,
"content": "a",
"color": "green",
"size": 1,
},
{
"id": 2,
"content": "a",
"color": "green",
"size": 2,
},
{
"id": 3,
"content": "a",
"color": "blue",
"size": 3,
},
]);
server.add_or_update_multiple_documents(documents).await;
let (response, _) = server.search_post(json!({"q": "a"})).await;
assert_eq!(response["nbHits"], 3);
let (response, _) = server.search_post(json!({"q": "a", "filters": "size = 1"})).await;
assert_eq!(response["nbHits"], 1);
server.update_distinct_attribute(json!("color")).await;
let (response, _) = server.search_post(json!({"q": "a"})).await;
assert_eq!(response["nbHits"], 2);
let (response, _) = server.search_post(json!({"q": "a", "filters": "size < 3"})).await;
println!("result: {}", response);
assert_eq!(response["nbHits"], 1);
}