mirror of
https://github.com/meilisearch/MeiliSearch
synced 2025-07-03 20:07:09 +02:00
Make clippy happy
parent 0ee4671a91
commit 71e5605daa
19 changed files with 55 additions and 59 deletions
@@ -132,12 +132,12 @@ impl<'t, 'b, 'bitmap> FacetRangeSearch<'t, 'b, 'bitmap> {
 ///
 /// 1. So long as the element's range is less than the left bound, we do nothing and keep iterating
 /// 2. If the element's range is fully contained by the bounds, then all of its docids are added to
-/// the roaring bitmap.
+/// the roaring bitmap.
 /// 3. If the element's range merely intersects the bounds, then we call the algorithm recursively
-/// on the children of the element from the level below.
+/// on the children of the element from the level below.
 /// 4. If the element's range is greater than the right bound, we do nothing and stop iterating.
-/// Note that the right bound is found through either the `left_bound` of the *next* element,
-/// or from the `rightmost_bound` argument
+/// Note that the right bound is found through either the `left_bound` of the *next* element,
+/// or from the `rightmost_bound` argument
 ///
 /// ## Arguments
 /// - `level`: the level being visited
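
The four steps in the `FacetRangeSearch` doc comment above can be illustrated with a small in-memory sketch. This is not the milli implementation: `Element`, `leaf`, `parent` and `range_search` are hypothetical names, a `BTreeSet<u32>` stands in for the roaring bitmap, both bounds are treated as inclusive, and `rightmost_bound` is assumed to be an exclusive upper limit for the last element of a level.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for one element of a facet level.
/// Leaves (level 0) have no children and cover exactly one value.
struct Element {
    left_bound: u32,
    docids: BTreeSet<u32>,
    children: Vec<Element>,
}

/// Build a leaf holding a single facet value and a single docid.
fn leaf(value: u32, docid: u32) -> Element {
    Element { left_bound: value, docids: BTreeSet::from([docid]), children: Vec::new() }
}

/// Build a parent whose docids are the union of its (non-empty) children.
fn parent(children: Vec<Element>) -> Element {
    let left_bound = children[0].left_bound;
    let docids: BTreeSet<u32> = children.iter().flat_map(|c| c.docids.iter().copied()).collect();
    Element { left_bound, docids, children }
}

/// Collect into `out` the docids of every value in `left..=right`.
fn range_search(
    elements: &[Element],
    left: u32,
    right: u32,
    rightmost_bound: Option<u32>,
    out: &mut BTreeSet<u32>,
) {
    for (i, el) in elements.iter().enumerate() {
        // Exclusive end of this element's range: the `left_bound` of the next
        // element, or the `rightmost_bound` argument for the last element.
        let end_excl = elements.get(i + 1).map(|next| next.left_bound).or(rightmost_bound);
        // Last value covered by the element; `None` means "unbounded".
        let last = if el.children.is_empty() {
            Some(el.left_bound)
        } else {
            end_excl.map(|e| e.saturating_sub(1))
        };
        // Step 1: entirely left of the bounds, keep iterating.
        if matches!(last, Some(l) if l < left) {
            continue;
        }
        // Step 4: entirely right of the bounds, stop iterating.
        if el.left_bound > right {
            break;
        }
        if el.left_bound >= left && matches!(last, Some(l) if l <= right) {
            // Step 2: fully contained, take all of its docids.
            out.extend(&el.docids);
        } else {
            // Step 3: partial overlap, recurse on the level below.
            range_search(&el.children, left, right, end_excl, out);
        }
    }
}
```

With four parents over the leaves 1 to 8, a search for `2..=6` only descends into the first parent; the second and third are taken whole and the fourth stops the iteration.
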
@@ -18,10 +18,10 @@ pub struct DistinctOutput {
 
 /// Return a [`DistinctOutput`] containing:
 /// - `remaining`: a set of docids built such that exactly one element from `candidates`
-/// is kept for each distinct value inside the given field. If the field does not exist, it
-/// is considered unique.
+/// is kept for each distinct value inside the given field. If the field does not exist, it
+/// is considered unique.
 /// - `excluded`: the set of document ids that contain a value for the given field that occurs
-/// in the given candidates.
+/// in the given candidates.
 pub fn apply_distinct_rule(
     ctx: &mut SearchContext<'_>,
     field_id: u16,
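
As a rough illustration of the `remaining` / `excluded` contract described in the doc comment above, here is a self-contained sketch. It reads the comment literally and makes several assumptions: `DistinctSketch` and `apply_distinct_rule_sketch` are hypothetical names, each document has at most one string value for the field, and field values come from a plain map instead of the index. The real function works on a `SearchContext` and may differ in details such as whether kept documents also appear in `excluded`.

```rust
use std::collections::{BTreeMap, BTreeSet, HashSet};

/// Hypothetical counterpart of `DistinctOutput` for this sketch.
struct DistinctSketch {
    remaining: BTreeSet<u32>,
    excluded: BTreeSet<u32>,
}

/// `field_values` maps docid -> value of the distinct field; documents
/// missing from the map do not have the field at all.
fn apply_distinct_rule_sketch(
    candidates: &BTreeSet<u32>,
    field_values: &BTreeMap<u32, String>,
) -> DistinctSketch {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut remaining = BTreeSet::new();
    for &docid in candidates {
        match field_values.get(&docid) {
            // First candidate carrying this value: keep it.
            Some(value) => {
                if seen.insert(value.as_str()) {
                    remaining.insert(docid);
                }
            }
            // No field: the document is considered unique and is kept.
            None => {
                remaining.insert(docid);
            }
        }
    }
    // Literal reading of the doc comment: every document whose value
    // occurs among the candidates, kept documents included.
    let excluded = field_values
        .iter()
        .filter(|(_, value)| seen.contains(value.as_str()))
        .map(|(&docid, _)| docid)
        .collect();
    DistinctSketch { remaining, excluded }
}
```
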
@@ -149,7 +149,7 @@ pub type WordId = u16;
 /// A given token can partially match a query word for several reasons:
 /// - split words
 /// - multi-word synonyms
-/// In these cases we need to match consecutively several tokens to consider that the match is full.
+/// In these cases we need to match consecutively several tokens to consider that the match is full.
 #[derive(Debug, PartialEq)]
 pub enum MatchType<'a> {
     Full { char_count: usize, byte_len: usize, ids: &'a RangeInclusive<WordId> },
@@ -21,9 +21,9 @@ use crate::Result;
 /// 1. `Start` : unique, represents the start of the query
 /// 2. `End` : unique, represents the end of a query
 /// 3. `Deleted` : represents a node that was deleted.
-/// All deleted nodes are unreachable from the start node.
+/// All deleted nodes are unreachable from the start node.
 /// 4. `Term` is a regular node representing a word or combination of words
-/// from the user query.
+/// from the user query.
 #[derive(Clone)]
 pub struct QueryNode {
     pub data: QueryNodeData,
@@ -8,7 +8,7 @@ with them, they are "unconditional". These kinds of edges are used to "skip" a n
 The algorithm uses a depth-first search. It benefits from two main optimisations:
 - The list of all possible costs to go from any node to the END node is precomputed
 - The `DeadEndsCache` reduces the number of valid paths drastically, by making some edges
-untraversable depending on what other edges were selected.
+untraversable depending on what other edges were selected.
 
 These two optimisations are meant to avoid traversing edges that wouldn't lead
 to a valid path. In practically all cases, we avoid the exponential complexity
@@ -24,6 +24,7 @@ For example, the DeadEndsCache could say the following:
 - if we take `g`, then `[f]` is also forbidden
 - etc.
 - etc.
+
 As we traverse the graph, we also traverse the `DeadEndsCache` and keep a list of forbidden
 conditions in memory. Then, we know to avoid all edges which have a condition that is forbidden.
 
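
The idea of walking the `DeadEndsCache` alongside the graph and accumulating forbidden conditions can be sketched as a small trie. Everything below is an assumption made for illustration: `DeadEndsSketch`, `forbid` and `path_is_allowed` are hypothetical names, conditions are plain `char`s, and the real cache works on interned conditions with a different layout.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified dead-ends cache: a trie keyed by the
/// conditions taken so far, each node listing what becomes forbidden.
#[derive(Default)]
struct DeadEndsSketch {
    forbidden: HashSet<char>,
    next: HashMap<char, DeadEndsSketch>,
}

impl DeadEndsSketch {
    /// Record that after taking `taken` (in order), `condition` is a dead end.
    fn forbid(&mut self, taken: &[char], condition: char) {
        let mut node = self;
        for &c in taken {
            node = node.next.entry(c).or_default();
        }
        node.forbidden.insert(condition);
    }
}

/// Walk `path` and the cache together, keeping the forbidden conditions
/// in memory; reject the path as soon as it uses a forbidden condition.
fn path_is_allowed(cache: &DeadEndsSketch, path: &[char]) -> bool {
    let mut forbidden = cache.forbidden.clone();
    let mut node = Some(cache);
    for &condition in path {
        if forbidden.contains(&condition) {
            return false;
        }
        // Descend in the cache if it knows this prefix, and pick up
        // whatever that prefix forbids.
        node = node.and_then(|n| n.next.get(&condition));
        if let Some(n) = node {
            forbidden.extend(n.forbidden.iter().copied());
        }
    }
    true
}
```

For instance, after `cache.forbid(&['g'], 'f')`, the path `g, f` is rejected, while `g, h` and `f, g` are still accepted in this sketch.
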
@@ -58,7 +58,7 @@ pub struct ComputedCondition {
 /// 2. The cost of traversing this edge
 /// 3. The condition associated with it
 /// 4. The list of nodes that have to be skipped
-/// if this edge is traversed.
+/// if this edge is traversed.
 #[derive(Clone)]
 pub struct Edge<E> {
     pub source_node: Interned<QueryNode>,
@@ -14,7 +14,7 @@ This module tests the following properties about the exactness ranking rule:
 3. those that contain the most exact words from the remaining query
 
 - if it is followed by other graph-based ranking rules (`typo`, `proximity`, `attribute`).
-Then these rules will only work with
+Then these rules will only work with
 1. the exact terms selected by `exactness`
 2. the full query term otherwise
 */
@@ -4,15 +4,14 @@ This module tests the Proximity ranking rule:
 1. A proximity of >7 always has the same cost.
 
 2. Phrase terms can be in proximity to other terms via their start and end words,
-but we need to make sure that the phrase exists in the document that meets this
-proximity condition. This is especially relevant with split words and synonyms.
+but we need to make sure that the phrase exists in the document that meets this
+proximity condition. This is especially relevant with split words and synonyms.
 
 3. An ngram has the same proximity cost as its component words being consecutive.
-e.g. `sunflower` equivalent to `sun flower`.
+e.g. `sunflower` equivalent to `sun flower`.
 
 4. The prefix databases can be used to find the proximity between two words, but
-they store fewer proximities than the regular word proximity DB.
-
+they store fewer proximities than the regular word proximity DB.
 */
 
 use std::collections::BTreeMap;
@@ -11,7 +11,7 @@ This module tests the following properties:
 8. 2grams can have 1 typo if they are larger than `min_word_len_two_typos`
 9. 3grams are not typo tolerant (but they can be split into two words)
 10. The `typo` ranking rule assumes the role of the `words` ranking rule implicitly
-if `words` doesn't exist before it.
+if `words` doesn't exist before it.
 11. The `typo` ranking rule places documents with the same number of typos in the same bucket
 12. Prefix tolerance costs nothing according to the typo ranking rule
 13. Split words cost 1 typo according to the typo ranking rule
@@ -2,11 +2,11 @@
 This module tests the following properties:
 
 1. The `last` term matching strategy starts removing terms from the query
-starting from the end if no more results match it.
+starting from the end if no more results match it.
 2. Phrases are never deleted by the `last` term matching strategy
 3. Duplicate words don't affect the ranking of a document according to the `words` ranking rule
 4. The proximity of the first and last word of a phrase to its adjacent terms is taken into
-account by the proximity ranking rule.
+account by the proximity ranking rule.
 5. Unclosed double quotes still make a phrase
 6. The `all` term matching strategy does not remove any term from the query
 7. The search is capable of returning no results if no documents match the query