278 Commits

Author SHA1 Message Date
ad hoc
b594d49def
add IndexResolver BatchHandler tests 2022-06-09 14:59:19 +02:00
ad hoc
fbba67fbe9
add mocker to IndexResolver 2022-06-09 14:59:19 +02:00
bors[bot]
b9b32d65a8
Merge #2494
2494: Introduce the new faceting and pagination settings r=ManyTheFish a=Kerollmops

This PR introduces two new settings following the newly created spec https://github.com/meilisearch/specifications/pull/157:
 - The `faceting.max_values_per_facet` one describes the maximum number of values (each with a count) associated with a value in a facet distribution query.
 - The `pagination.limited_to` one describes the maximum number of documents that a search query can ever return.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-06-09 12:09:21 +00:00
bors[bot]
2b2e571c76
Merge #2460
2460: Create custom error types for `TaskType`, `TaskStatus`, and `IndexUid` r=Kerollmops a=walterbm

# Pull Request

## What does this PR do?
Fixes #2443 by making the following changes:

- Add custom `TaskTypeError` for `TaskType::from_str` 
- Add custom `TaskStatusError` for `TaskStatus::from_str`
- Add custom `IndexUidFormatError` for `IndexUid::from_str`
- Implement `From<IndexUidFormatError> for IndexResolverError` to convert between errors
- Replace all usages of `IndexUid::new` with `IndexUid::from_str`
    - **NOTE** I am relatively new to Rust and I struggled a lot with this final part. This PR ended up with a messy error conversion which does not seem ideal. Please let me know if you have any suggestions for how to make this better and I'll be happy to make any updates!

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: walter <walter.beller.morales@gmail.com>
2022-06-09 09:10:28 +00:00
Kerollmops
1e3dcbea3f
Plug the pagination.limited_to setting 2022-06-09 10:56:42 +02:00
Kerollmops
b96399d24b
Plug the faceting.max_values_per_facet setting 2022-06-09 10:56:42 +02:00
Kerollmops
5450b5ced3
Add the faceting.max_values_per_facet setting 2022-06-09 10:54:32 +02:00
walter
96d4fd54bb Change the index uid format check for better legibility 2022-06-08 19:58:47 -04:00
walter
2b944ecd89 Remove IndexUid::new and replace with IndexUid::from_str 2022-06-08 19:56:01 -04:00
bors[bot]
db42268888
Merge #2473
2473: fix blocking in dumps r=irevoire a=MarinPostma

This PR fixes two blocking calls in the dump process.


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-06-08 17:14:46 +00:00
ad hoc
108b3520de
fix blocking auth controller dump 2022-06-08 18:19:29 +02:00
ManyTheFish
f5306eb5b0 Set disabled_words to default when Index::exact_words returns None 2022-06-08 14:38:09 +02:00
ManyTheFish
173eea06e1 Replace old tokenizer by charabia 2022-06-08 14:38:09 +02:00
ad hoc
cbd27d313c
fix blocking writing of meta file in dump 2022-06-07 10:07:40 +02:00
ad hoc
6ac8675c6d
add IndexResolver BatchHandler tests 2022-06-07 09:33:57 +02:00
ad hoc
df61ca9cae
add mocker to IndexResolver 2022-06-07 09:33:57 +02:00
ad hoc
bbd685af5e
move IndexResolver to real module 2022-06-07 09:33:56 +02:00
Kerollmops
10d3b367dc
Simplify the const default values 2022-06-06 10:06:00 +02:00
walter
ba55905377 Add custom IndexUidFormatError for IndexUid 2022-06-05 02:26:48 -04:00
bors[bot]
953a209f02
Merge #2447
2447: move index uid in task content r=Kerollmops a=MarinPostma

this pr moves the index_uid from the `Task` to the `TaskContent`. This is because the task can now have content that do not target a particular index.


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-06-02 13:54:09 +00:00
ad hoc
0c5352fc22
move index_uid from task to task_content 2022-06-02 15:30:35 +02:00
Irevoire
4667c9fe1a
fix(http): Fix the query parameter in the Documents route 2022-06-02 14:10:44 +02:00
bors[bot]
c9cd1738a5
Merge #2445
2445: Seek-based tasks list r=Kerollmops a=Kerollmops

This PR implements the seek-based pagination for the tasks list following [the spec](https://github.com/meilisearch/specifications/pull/115).

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-06-02 10:25:54 +00:00
Kerollmops
df721b2e9e
Scheduler must not reverse the order of the fetched tasks 2022-06-01 17:16:15 +02:00
ManyTheFish
1816db8c1f Move dump v4 patcher into v4.rs 2022-06-01 16:17:43 +02:00
ManyTheFish
70916d6596 Patch dump v4 2022-06-01 16:08:42 +02:00
Kerollmops
c11d21879a
Introduce tasks limit and after to the tasks route 2022-06-01 13:26:36 +02:00
Kerollmops
461b91fd13
Introduce the fetch_unfinished_tasks function to fetch tasks 2022-06-01 12:09:52 +02:00
bors[bot]
e81c7aa2e6
Merge #2423
2423: Paginate the index resource r=MarinPostma a=irevoire

Fix #2373


Co-authored-by: Irevoire <tamo@meilisearch.com>
2022-05-31 19:25:25 +00:00
bors[bot]
47007fa71b
Merge #2446
2446: rename Succeded to Succeeded r=irevoire a=MarinPostma

this pr renames `TaskEvent::Succeded` to `TaskEvent::Succeeded` and apply the migration to the dumps


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-05-31 18:27:02 +00:00
Irevoire
627f13df85
feat(http): paginate the index resource
Fix #2373
2022-05-31 18:11:45 +02:00
ad hoc
446f1f31e0
rename Succeded to Succeeded 2022-05-31 17:22:37 +02:00
Irevoire
ddad6cc069
feat(http): update the documents resource
- Return Documents API resources on `/documents` in an array in the the results field.
- Add limit, offset and total in the response body.
- Rename `attributesToRetrieve` into `fields` (only for the `/documents` endpoints, not for the `/search` ones).
- The `displayedAttributes` settings does not impact anymore the displayed fields returned in the `/documents` endpoints. These settings only impacts the `/search` endpoint.

Fix #2372
2022-05-31 16:40:40 +02:00
Kerollmops
3684c822f1
Add indexUid filtering on the /tasks route 2022-05-31 11:33:20 +02:00
ManyTheFish
deba0cc096 Make v4::load_dump copy each part a the dump 2022-05-31 10:24:44 +02:00
ad hoc
26e7bdf702
add boilerplate for dump v5 2022-05-30 17:25:29 +02:00
ad hoc
1e310ecc7d
fix typo in docstring
Co-authored-by: Tamo <tamo@meilisearch.com>
2022-05-30 14:34:49 +02:00
ad hoc
4cb2c6ef1e
use map_or instead of map + unwrap_or 2022-05-30 12:30:15 +02:00
ad hoc
a9ef399a6b
processing::Nothing return BatchContent::Empty instead of panic 2022-05-26 12:04:27 +02:00
ad hoc
5a2972fc19
use TaskEvent method instead of variants in BatchHandler impl 2022-05-26 11:51:58 +02:00
ad hoc
1647ca3c1f
fix clipy warnings 2022-05-25 15:07:52 +02:00
ad hoc
74a1f88d88
add test for dump processing order 2022-05-25 14:57:36 +02:00
ad hoc
f58507379a
fix dump priority in scheduler 2022-05-25 14:50:14 +02:00
ad hoc
6b2016b350
remove typo in BatchContent variant 2022-05-25 14:39:07 +02:00
ad hoc
3015265bde
remove useless dump errors 2022-05-25 14:37:10 +02:00
ad hoc
49d8fadb52
test dump handler 2022-05-25 14:32:12 +02:00
ad hoc
127171c812
impl Default on Processing 2022-05-25 14:10:39 +02:00
ad hoc
92d86ce6aa
add tests to IndexResolver BatchHandler 2022-05-25 11:13:36 +02:00
ad hoc
3c85b29865
add doc to BatchHandler 2022-05-25 11:13:35 +02:00
ad hoc
8349f38197
remove unused file 2022-05-25 11:13:35 +02:00
ad hoc
64654ef7c3
rename batch_handler to handler 2022-05-25 11:13:35 +02:00
ad hoc
0f9c134114
fix tests 2022-05-25 11:13:35 +02:00
ad hoc
7b47e4e87a
snapshot batch handler 2022-05-25 11:13:35 +02:00
ad hoc
8743d73973
move DumpHandler to own module 2022-05-25 11:13:35 +02:00
ad hoc
f0aceb4fba
remove unused files 2022-05-25 11:13:35 +02:00
ad hoc
61035a3ea4
create dump v5 2022-05-25 11:13:34 +02:00
ad hoc
57fde30b91
handle dump 2022-05-25 11:13:34 +02:00
ad hoc
56eb2907c9
dump indexes 2022-05-25 11:13:34 +02:00
ad hoc
414d0907ce
register dump handler 2022-05-25 11:13:34 +02:00
ad hoc
60a8249de6
add dump batch handler 2022-05-25 11:13:34 +02:00
ad hoc
46cdc17701
make scheduler accept multiple batch handlers 2022-05-25 11:13:34 +02:00
ad hoc
6a0231cb28
perform dump method 2022-05-25 11:13:33 +02:00
ad hoc
7fa3eb1003
register dump tasks 2022-05-25 11:13:33 +02:00
ad hoc
2f0625a984
register and insert dump task in scheduler 2022-05-25 11:13:33 +02:00
ad hoc
737b891a41
introduce Dump TaskListIdentifier variant 2022-05-25 11:13:33 +02:00
ad hoc
5a5066023b
introduce TaskListIdentifier 2022-05-25 11:13:33 +02:00
ad hoc
aa50acb031
make Task index_uid an option
Not all task relate to an index. Tasks that don't have an index_uid set
to None
2022-05-25 11:13:32 +02:00
bors[bot]
341756a0eb
Merge #2357
2357: chore(dump): add dump tests r=Kerollmops a=irevoire

Add tests on the import of dump v1, v2, v3 and v4.

Since the dumps are slow to decompress, I made the `flate2` crate always compile in optimized.
And since they're also slow to index, I also made the `milli` crate always compile in optimized. What do you think of this `@MarinPostma?`
Should we keep milli unoptimized in case it could help us debug some things? 👀 

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-05-24 12:24:29 +00:00
Tamo
5f0e9b63d2
chore(dump): add tests 2022-05-24 14:21:56 +02:00
Irevoire
4e9accdeb7
chore(search): rename in the search endpoint
Fix ##2376
2022-05-19 16:31:37 +02:00
ManyTheFish
50763aac82 Fix clippy 2022-05-19 11:23:22 +02:00
ManyTheFish
0250ea9157 Intergrate smart crop in Meilisearch 2022-05-18 18:35:51 +02:00
Tamo
85d19bfb3e
chore: bump milli 2022-05-16 18:43:35 +02:00
ad hoc
5670b4d012
fix dump import error 2022-05-16 14:33:33 +02:00
ad hoc
6025372565
fix(lib): Check db presence after dumps 2022-04-27 10:41:09 +02:00
bors[bot]
4a9000bb96
Merge #2332
2332: fix(search): formatted field r=curquiza a=irevoire

fix #2318

Co-authored-by: Irevoire <tamo@meilisearch.com>
2022-04-20 14:59:41 +00:00
Irevoire
58a1124e9a
fix(search): formatted field 2022-04-20 11:30:01 +02:00
ad hoc
9b064e53e7
fix(http, lib): rename_min_word_length_for_typo into rename_min_word_size_for_typo 2022-04-17 10:02:56 +02:00
bors[bot]
b1333ab5b0
Merge #2320
2320: chore(http, lib): rename typo to typo_tolerance r=irevoire a=MarinPostma

fix #2319


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-14 09:50:39 +00:00
ad hoc
276dc6043a
chore(http, lib): rename typo to typo_tolerance 2022-04-14 10:42:06 +02:00
bors[bot]
6c06fb226d
Merge #2307
2307: Feat(Analytics): Add analytics for search format options r=irevoire a=ManyTheFish

Specification: [#120](https://github.com/meilisearch/specifications/pull/120) ([f5c6a8e](f5c6a8e183))

fix #2308

Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-04-13 12:01:52 +00:00
Tamo
2ee210483f
fix(search): remove the back and forth between the IndexMap and the serde_json::Map
This is ok because we're using the preserve_order feature in serde_json which is already internally using an IndexMap.
2022-04-12 16:12:52 +02:00
ManyTheFish
0990e95830 Feat(Analytics): Add analytics for search format options 2022-04-11 14:53:15 +02:00
Tamo
69d312209e
feat(search): Implements the nested fields
See https://github.com/meilisearch/specifications/pull/121
2022-04-07 19:47:20 +02:00
bors[bot]
013fe4cbc9
Merge #2297
2297: Feat(Search): Enhance formating search results r=ManyTheFish a=ManyTheFish

Add new settings and change crop_len behavior to count words instead of characters.

- [x] `highlightPreTag`
- [x] `highlightPostTag`
- [x] `cropMarker`
- [x] `cropLength` count word instead of chars
- [x] `cropLength` 0 is now considered as no `cropLength`
- [ ] ~smart crop finding the best matches interval~ (postponed)

Partially fixes  #2214. (no smart crop)


Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-04-07 13:29:56 +00:00
ManyTheFish
dc2cc1ee89 Feat(Search): Enhance formating search results 2022-04-07 15:04:08 +02:00
ad hoc
67dea08a0a
feat(http, lib): enable disable typos on attributes 2022-04-06 19:25:12 +02:00
ad hoc
e9f66b8766
feat(all): introduce disable typo on words 2022-04-06 19:16:36 +02:00
ad hoc
dd43ba6234
feat(all): introduce disable typos 2022-04-06 19:10:12 +02:00
ad hoc
27a88bcd47
feat(all): introduce minWordLengthForTypo
fix typo in settting

skip serializing not set typo settings
2022-04-06 19:03:24 +02:00
ad hoc
981fba5b44
feat(all): introduce disable typos 2022-04-06 15:47:48 +02:00
ad hoc
a523828f61
chore(lib): bump milli to 0.25.0 2022-04-06 15:03:10 +02:00
bors[bot]
9e344f6576
Merge #2207
2207: Fix: avoid embedding the user input into the error response. r=Kerollmops a=CNLHC

# Pull Request

## What does this PR do?
Fix #2107. 

The problem is meilisearch embeds the user input to the error message. 

The reason for this problem is `milli` throws a `serde_json: Error` whose `Display` implementation will do this embedding.  

I tried to solve this problem in this PR by manually implementing the `Display` trait for `DocumentFormatError` instead of deriving automatically.

<!-- Please link the issue you're trying to fix with this PR, if none then please create an issue first. -->

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Liu Hancheng <cn_lhc@qq.com>
Co-authored-by: LiuHanCheng <2463765697@qq.com>
2022-04-04 17:35:17 +00:00
bors[bot]
09a72cee03
Merge #2281
2281: Hard limit the number of results returned by a search r=Kerollmops a=Kerollmops

This PR fixes #2133 by hard-limiting the number of results that a search request can return at any time. I would like the guidance of `@MarinPostma` to test that, should I use a mocking test here? Or should I do anything else?

I talked about touching the _nb_hits_ value with `@qdequele` and we concluded that it was not correct to do so.

Could you please confirm that it is the right place to change that?

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-04-04 17:19:05 +00:00
Liu Hancheng
7ece7a9d9e change truncate strategy and coresponding test 2022-03-31 10:39:21 +08:00
LiuHanCheng
b28aa8e666
Update meilisearch-lib/src/document_formats.rs
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2022-03-31 10:14:13 +08:00
2shiori17
98107565c0 Add more detailed comments for max_indexing_threads 2022-03-31 09:32:45 +09:00
2shiori17
a2d7c16f91 Remove indexing_jobs option 2022-03-31 09:27:29 +09:00
shiori
9edd407a88
Merge branch 'main' into add-instance-options 2022-03-31 02:38:07 +09:00
Kerollmops
8bc6e8dcf9
Make sure that offsets are clamped too 2022-03-30 10:06:15 -07:00