The `matching_bytes` function takes a `&Token` now and:
- gets the number of bytes to highlight (unchanged).
- uses `Token.num_graphemes_from_bytes` to get the number of grapheme
clusters to highlight.
In essence, the `matching_bytes` function returns the number of matching
grapheme clusters instead of bytes. Should this function be renamed
then?
Added proper highlighting in the HTTP UI:
- requires dependency on `unicode-segmentation` to extract grapheme
clusters from tokens
- `<mark>` tag is put around only the matched part
- before this change, the entire word was highlighted even if only a
part of it matched
384: Replace memmap with memmap2 r=Kerollmops a=palfrey
[memmap is unmaintained](https://rustsec.org/advisories/RUSTSEC-2020-0077.html) and needs replacing. memmap2 is a drop-in replacement fork that's well maintained. Note that the version numbers got reset on fork, hence the lower values.
Co-authored-by: Tom Parker-Shemilt <palfrey@tevp.net>
250: Add the limit field to http-ui r=Kerollmops a=irevoire
251: Fix the limit r=Kerollmops a=irevoire
There was no check on the limit and thus if a user specified a very large number this line could cause a panic.
Co-authored-by: Tamo <tamo@meilisearch.com>
The front must provide a BTreeSet containing the stop words
The stop_words are set at None if an empty Set is provided
add the stop-words in the http-ui interface
Use maplit in the test
and remove all the useless drop(rtxn) at the end of all tests
Add `the update_id` to the to the updates. The rationale is the
following:
- It allows for better tracability of the update events, thus improved
debugging and logging.
- The enigne is now aware of what he's already processed, and can return
it if asked. It may not make sense now, but in the future, the update
store may not work the same way, and this information about the state
of the engine will be desirable (distributed environement).