322: Geosearch r=ManyTheFish a=irevoire

This PR introduces [basic geo-search functionalities](https://github.com/meilisearch/specifications/pull/59), it makes the engine able to index, filter and, sort by geo-point. We decided to use [the rstar library](https://docs.rs/rstar) and to save the points in [an RTree](https://docs.rs/rstar/0.9.1/rstar/struct.RTree.html) that we de/serialize in the index database [by using serde](https://serde.rs/) with [bincode](https://docs.rs/bincode). This is not an efficient way to query this tree as it will consume a lot of CPU and memory when a search is made, but at least it is an easy first way to do so.

### What we will have to do on the indexing part:
 - [x] Index the `_geo` fields from the documents.
   - [x] Create a new module with an extractor in the `extract` module that takes the `obkv_documents` and retrieves the latitude and longitude coordinates, outputting them in a `grenad::Reader` for further process.
   - [x] Call the extractor in the `extract::extract_documents_data` function and send the result to the `TypedChunk` module.
   - [x] Get the `grenad::Reader` in the `typed_chunk::write_typed_chunk_into_index` function and store all the points in the `rtree`
- [x] Delete the documents from the `RTree` when deleting documents from the database. All this can be done in the `delete_documents.rs` file by getting the data structure and removing the points from it, inserting it back after the modification.
- [x] Clearing the `RTree` entirely when we clear the documents from the database, everything happens in the `clear_documents.rs` file.
- [x] save a Roaring bitmap of all documents containing the `_geo` field

### What we will have to do on the query part:
- [x] Filter the documents at a certain distance around a point, this is done by [collecting the documents from the searched point](https://docs.rs/rstar/0.9.1/rstar/struct.RTree.html#method.nearest_neighbor_iter) while they are in range.
  - [x] We must introduce new `geoLowerThan` and `geoGreaterThan` variants to the `Operator` filter enum.
  - [x] Implement the `negative` method on both variants where the `geoGreaterThan` variant is implemented by executing the `geoLowerThan` and removing the results found from the whole list of geo faceted documents.
  - [x] Add the `_geoRadius` function in the pest parser.
- [x] Introduce a `_geo` ascending ranking function that takes a point in parameter, ~~this function must keep the iterator on the `RTree` and make it peekable~~ This was not possible for now, we had to collect the whole iterator. Only the documents that are part of the candidates must be sent too!
  - [x] This ascending ranking rule will only be active if the search is set up with the `_geoPoint` parameter that indicates the center point of the ascending ranking rule.

-----------

- On Meilisearch part: We must introduce a new concept, returning the documents with a new `_geoDistance` field when it passed by the `_geo` ranking rule, this has never been done before. We could maybe just do it afterward when the documents have been retrieved from the database, computing the distance from the `_geoPoint` and all of the documents to be returned.

Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
This commit is contained in:
bors[bot] 2021-09-20 19:04:57 +00:00 committed by GitHub
commit 31c8de1cca
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
23 changed files with 896 additions and 69 deletions

View file

@ -60,7 +60,13 @@ $('#query, #filters').on('input', function () {
const content = document.createElement('div');
content.classList.add("content");
content.innerHTML = element[prop];
// Stringify Objects and Arrays to avoid [Object object]
if (typeof element[prop] === 'object' && element[prop] !== null) {
content.innerHTML = JSON.stringify(element[prop]);
} else {
content.innerHTML = element[prop];
}
field.appendChild(attribute);
field.appendChild(content);

View file

@ -695,6 +695,7 @@ async fn main() -> anyhow::Result<()> {
struct QueryBody {
query: Option<String>,
filters: Option<String>,
sort: Option<String>,
facet_filters: Option<Vec<UntaggedEither<Vec<String>, String>>>,
facet_distribution: Option<bool>,
limit: Option<usize>,
@ -754,6 +755,10 @@ async fn main() -> anyhow::Result<()> {
search.limit(limit);
}
if let Some(sort) = query.sort {
search.sort_criteria(vec![sort.parse().unwrap()]);
}
let SearchResult { matching_words, candidates, documents_ids } =
search.execute().unwrap();