658: Add proximity calculation for the same word r=ManyTheFish a=msvaljek

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/milli/issues/647

## What does this PR do?
- During [the increase of the current word position](d94339a858/milli/src/update/index_documents/extract/extract_word_pair_proximity_docids.rs (L129-L135)) we extract the proximity between the current position and the next one.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: msvaljek <marko.svaljek@commercetools.com>
This commit is contained in:
bors[bot] 2022-10-10 13:33:58 +00:00 committed by GitHub
commit 55d889522b
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -127,6 +127,17 @@ fn document_word_positions_into_sorter<'b>(
// Advance the head and push it in the heap.
if let Some(mut head) = ordered_peeked_word_positions.pop() {
if let Some(next_position) = head.iter.next() {
let prox = positions_proximity(head.position, next_position);
if prox > 0 && prox < MAX_DISTANCE {
word_pair_proximity
.entry((head.word.clone(), head.word.clone()))
.and_modify(|p| {
*p = cmp::min(*p, prox);
})
.or_insert(prox);
}
word_positions_heap.push(PeekedWordPosition {
word: head.word,
position: next_position,