Commit Graph

1334 Commits

Author SHA1 Message Date
bors[bot]
9875f2646a
Merge #406
406: return document count from builder r=MarinPostma a=MarinPostma

`DocumentBatchBuilder::finish` now returns the number of documents in the batch. This is more compact than calling `len()` just before calling `finish`.
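
As a rough illustration of the pattern described above, the sketch below uses a toy builder whose `finish` consumes the builder and returns the document count. `ToyBatchBuilder`, `add`, and the `usize` return type are assumptions made for this example; they are not milli's actual `DocumentBatchBuilder` API.

```rust
/// Toy stand-in for a document batch builder (hypothetical, not milli's real type).
struct ToyBatchBuilder {
    docs: Vec<String>,
}

impl ToyBatchBuilder {
    fn new() -> Self {
        ToyBatchBuilder { docs: Vec::new() }
    }

    /// Queues one raw document.
    fn add(&mut self, doc: &str) {
        self.docs.push(doc.to_string());
    }

    /// Consumes the builder and returns how many documents the batch holds,
    /// so the caller does not need a separate `len()` call beforehand.
    fn finish(self) -> usize {
        self.docs.len()
    }
}

fn main() {
    let mut builder = ToyBatchBuilder::new();
    builder.add(r#"{"id": 1}"#);
    builder.add(r#"{"id": 2}"#);
    let count = builder.finish();
    println!("batch contains {} documents", count);
}
```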


Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-10-28 08:42:38 +00:00
marin postma
183d3dada7
return document count from builder 2021-10-28 10:33:04 +02:00
bors[bot]
d7943fe225
Merge #402
402: Optimize document transform r=MarinPostma a=MarinPostma

This PR optimizes the transformation of document additions into the obkv format. Instead of accepting any serializable object, we treat JSON and CSV specifically:
- For JSON, we build a serde `Visitor` that transforms the JSON straight into obkv without an intermediate representation.
- For CSV, we write the lines directly into obkv, applying other optimizations as well.

Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-10-26 09:55:28 +00:00
bors[bot]
6758146213
Merge #404
404: remove search crate r=Kerollmops a=MarinPostma

The functionality of the search crate has been moved to the cli crate. This PR removes the remaining files.


Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-10-26 09:40:34 +00:00
marin postma
9b8ab40d80
remove search folder 2021-10-26 11:35:49 +02:00
marin postma
baddd80069
implement review suggestions 2021-10-25 18:29:12 +02:00
marin postma
f9445c1d90
return float parsing error context in csv 2021-10-25 17:27:10 +02:00
bors[bot]
15c29cdd9b
Merge #401
401: Update version for the next release (v0.19.0) r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-25 12:49:53 +00:00
bors[bot]
13d8272173
Merge #403
403: Revert "Replacing pest with nom" r=curquiza a=curquiza

Reverts meilisearch/milli#358

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-25 12:16:49 +00:00
Clémentine Urquizar
208903ddde
Revert "Replacing pest with nom " 2021-10-25 11:58:00 +02:00
Clémentine Urquizar
679fe18b17
Update version for the next release (v0.19.0) 2021-10-25 11:52:17 +02:00
marin postma
3fcccc31b5
add document builder example 2021-10-25 10:26:43 +02:00
marin postma
430e9b13d3
add csv builder tests 2021-10-25 10:26:43 +02:00
marin postma
53c79e85f2
document errors 2021-10-25 10:26:43 +02:00
marin postma
2e62925a6e
fix tests 2021-10-25 10:26:42 +02:00
marin postma
0f86d6b28f
implement csv serialization 2021-10-25 10:26:42 +02:00
marin postma
8d70b01714
optimize document deserialization 2021-10-25 10:26:42 +02:00
Clémentine Urquizar
f8fe9316c0
Update version for the next release (v0.18.1) 2021-10-21 11:56:14 +02:00
bors[bot]
b6af84eb77
Merge #394
394:  Added search_geo benchmark in cron job r=irevoire a=fumblehool

fixes: #392 
The `search_geo` cron job will run every Friday at 18:30.

Co-authored-by: Damanpreet Singh <daman.4880@gmail.com>
2021-10-18 14:33:32 +00:00
bors[bot]
7906461c14
Merge #396
396: Fix indexing benchmark GH actions upload filename r=irevoire a=fumblehool

fixes: #393 

Co-authored-by: Damanpreet Singh <daman.4880@gmail.com>
2021-10-18 13:34:10 +00:00
Damanpreet Singh
2e4604b0b9 fixed filename for search_* crons 2021-10-18 18:48:38 +05:30
Damanpreet Singh
4c34164d2e fixed filename for search_geo cron 2021-10-18 18:43:36 +05:30
bors[bot]
9df4f3aaad
Merge #397
397: Fix typo in repo r=curquiza a=saintmalik

Fix the single typo found in this repo

Co-authored-by: SaintMalik <37118134+saintmalik@users.noreply.github.com>
2021-10-18 11:59:48 +00:00
bors[bot]
513d3178c6
Merge #398
398: Update version for the next release (v0.18.2) r=irevoire a=curquiza

Breaking because of https://github.com/meilisearch/milli/pull/358

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-18 11:47:26 +00:00
Clémentine Urquizar
2209acbfe2
Update version for the next release (v0.18.2) 2021-10-18 13:45:48 +02:00
SaintMalik
70121e3c6b fix typo in repo 2021-10-18 04:00:19 +01:00
bors[bot]
59cc59e93e
Merge #358
358: Replacing pest with nom  r=Kerollmops a=CNLHC



Co-authored-by: 刘瀚骋 <cn_lhc@qq.com>
2021-10-16 20:44:38 +00:00
Damanpreet Singh
493d9b98f5 fix indexing benchmark GH actions upload filename 2021-10-16 21:52:36 +05:30
Damanpreet Singh
efaef4f748 Added search_geo benchmark in cron job 2021-10-16 21:41:45 +05:30
刘瀚骋
7666e4f34a follow the suggestions 2021-10-14 21:37:59 +08:00
刘瀚骋
2ea2f7570c use nightly cargo to format the code 2021-10-14 16:46:13 +08:00
刘瀚骋
e750465e15 check logic for geolocation. 2021-10-14 16:12:00 +08:00
bors[bot]
aa5e099718
Merge #390
390: Add helper methods on the settings r=Kerollmops a=irevoire

These helpers make it possible to look at the content of a setting without consuming it.
This is useful for analytics.

Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-10-13 20:36:30 +00:00
bors[bot]
c7db4176f3
Merge #384
384: Replace memmap with memmap2 r=Kerollmops a=palfrey

[memmap is unmaintained](https://rustsec.org/advisories/RUSTSEC-2020-0077.html) and needs replacing. memmap2 is a drop-in replacement fork that's well maintained. Note that the version numbers got reset on fork, hence the lower values.

Co-authored-by: Tom Parker-Shemilt <palfrey@tevp.net>
2021-10-13 13:47:23 +00:00
Irevoire
a3e7c468cd
add helper methods on the settings 2021-10-13 13:05:07 +02:00
刘瀚骋
cd359cd96e WIP: extract the error trait bound to new trait. 2021-10-13 18:04:15 +08:00
刘瀚骋
5de5dd80a3 WIP: remove '_nom' suffix/redundant error enum/... 2021-10-13 11:06:15 +08:00
刘瀚骋
2c65781d91 format 2021-10-12 22:20:22 +08:00
bors[bot]
6e3b869e6a
Merge #388
388: fix primary key inference r=MarinPostma a=MarinPostma

The primary key was inferred from a hashtable index of the fields. For this reason, the order in which the fields were iterated over was not deterministic, and the primary key was chosen from the first field containing "id".

This fix sorts the index by field_id when inferring the primary key.
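
A minimal sketch of the idea behind this fix, assuming the fields live in a `HashMap` from field name to field id. The function name `infer_primary_key`, the map layout, and the plain substring match on "id" are assumptions for illustration only, not milli's actual code. Iterating a `HashMap` directly yields an arbitrary order, so the candidates are sorted by field id first.

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns the name of the field with the lowest id
/// whose name contains "id". Sorting first makes the result deterministic,
/// because `HashMap` iteration order is arbitrary.
fn infer_primary_key(fields: &HashMap<String, u16>) -> Option<&str> {
    let mut sorted: Vec<(&String, &u16)> = fields.iter().collect();
    sorted.sort_by_key(|(_, id)| **id);
    sorted
        .into_iter()
        .map(|(name, _)| name.as_str())
        .find(|name| name.contains("id"))
}

fn main() {
    let mut fields = HashMap::new();
    fields.insert("title".to_string(), 0);
    fields.insert("id".to_string(), 1);
    fields.insert("video_id".to_string(), 2);

    // Both "id" and "video_id" match, but "id" has the lower field id.
    println!("{:?}", infer_primary_key(&fields));
}
```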


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-10-12 09:25:16 +00:00
mpostma
86ead92ed5 infer primary key on sorted fields 2021-10-12 11:15:11 +02:00
mpostma
9a266a531b test correct primary key inference 2021-10-12 11:08:53 +02:00
bors[bot]
3f7f24b90e
Merge #368
368: Remove limit of 1000 position per attribute r=irevoire a=ManyTheFish

Instead of using an arbitrary limit, we encode the absolute position in a u32,
using the strong (high-order) u16 for the field id and the weak (low-order) u16 for the relative position in the attribute.

- [x] check database size difference

Below is the database size difference for each dataset:
![Capture d’écran 2021-09-27 à 18 01 44](https://user-images.githubusercontent.com/6482087/134944199-bd25fed0-6c34-475c-9afc-197871e06553.png)

- [ ] check search time on big dataset


Related to [product#202](https://github.com/meilisearch/product/issues/202)
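
The encoding described in this PR can be sketched as follows, reading "strong" as the high-order 16 bits and "weak" as the low-order 16 bits. The function names `pack_position` and `unpack_position` are hypothetical and chosen for this example; the actual helpers in milli may be named and structured differently.

```rust
/// Packs a field id (high 16 bits) and a relative position (low 16 bits)
/// into a single u32 absolute position. Hypothetical helper name.
fn pack_position(field_id: u16, relative: u16) -> u32 {
    ((field_id as u32) << 16) | (relative as u32)
}

/// Splits an absolute position back into (field id, relative position).
fn unpack_position(absolute: u32) -> (u16, u16) {
    ((absolute >> 16) as u16, (absolute & 0xFFFF) as u16)
}

fn main() {
    // A relative position above 1000 is representable without any special case.
    let absolute = pack_position(3, 1200);
    assert_eq!(unpack_position(absolute), (3, 1200));
    println!("absolute position: {}", absolute);
}
```

With this layout, positions within one attribute stay contiguous and ordered, and each attribute can hold up to 65,536 relative positions.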

Co-authored-by: many <maxime@meilisearch.com>
2021-10-12 08:30:33 +00:00
many
c5a6075484
Make max_position_per_attributes changable 2021-10-12 10:10:50 +02:00
many
360c5ff3df
Remove limit of 1000 position per attribute
Instead of using an arbitrary limit we encode the absolute position in a u32
using one strong u16 for the field id and a weak u16 for the relative position in the attribute.
2021-10-12 10:10:50 +02:00
刘瀚骋
d323e35001 add a test case 2021-10-12 13:30:40 +08:00
刘瀚骋
70f576d5d3 error handling 2021-10-12 13:30:40 +08:00
刘瀚骋
28f9be8d7c support syntax 2021-10-12 13:30:40 +08:00
刘瀚骋
469d92c569 tweak error handling 2021-10-12 13:30:40 +08:00
刘瀚骋
7a90a101ee reorganize parser logic 2021-10-12 13:30:40 +08:00
刘瀚骋
f7796edc7e remove everything about pest 2021-10-12 13:30:40 +08:00