Move crates under a sub folder to clean up the code

This commit is contained in:
Clément Renault 2024-10-21 08:18:43 +02:00
parent 30f3c30389
commit 9c1e54a2c8
No known key found for this signature in database
GPG key ID: F250A4C4E3AE5F5F
1062 changed files with 19 additions and 20 deletions

1
crates/benchmarks/.gitignore vendored Normal file
View file

@ -0,0 +1 @@
benches/datasets_paths.rs

View file

@ -0,0 +1,50 @@
[package]
name = "benchmarks"
publish = false
version.workspace = true
authors.workspace = true
description.workspace = true
homepage.workspace = true
readme.workspace = true
edition.workspace = true
license.workspace = true
[dependencies]
anyhow = "1.0.86"
csv = "1.3.0"
milli = { path = "../milli" }
mimalloc = { version = "0.1.43", default-features = false }
serde_json = { version = "1.0.120", features = ["preserve_order"] }
[dev-dependencies]
criterion = { version = "0.5.1", features = ["html_reports"] }
rand = "0.8.5"
rand_chacha = "0.3.1"
roaring = "0.10.6"
[build-dependencies]
anyhow = "1.0.86"
bytes = "1.6.0"
convert_case = "0.6.0"
flate2 = "1.0.30"
reqwest = { version = "0.12.5", features = ["blocking", "rustls-tls"], default-features = false }
[features]
default = ["milli/all-tokenizations"]
[[bench]]
name = "search_songs"
harness = false
[[bench]]
name = "search_wiki"
harness = false
[[bench]]
name = "search_geo"
harness = false
[[bench]]
name = "indexing"
harness = false

138
crates/benchmarks/README.md Normal file
View file

@ -0,0 +1,138 @@
Benchmarks
==========
## TOC
- [Run the benchmarks](#run-the-benchmarks)
- [Comparison between benchmarks](#comparison-between-benchmarks)
- [Datasets](#datasets)
## Run the benchmarks
### On our private server
The Meili team has self-hosted his own GitHub runner to run benchmarks on our dedicated bare metal server.
To trigger the benchmark workflow:
- Go to the `Actions` tab of this repository.
- Select the `Benchmarks` workflow on the left.
- Click on `Run workflow` in the blue banner.
- Select the branch on which you want to run the benchmarks and select the dataset you want (default: `songs`).
- Finally, click on `Run workflow`.
This GitHub workflow will run the benchmarks and push the `critcmp` report to a DigitalOcean Space (= S3).
The name of the uploaded file is displayed in the workflow.
_[More about critcmp](https://github.com/BurntSushi/critcmp)._
💡 To compare the just-uploaded benchmark with another one, check out the [next section](#comparison-between-benchmarks).
### On your machine
To run all the benchmarks (~5h):
```bash
cargo bench
```
To run only the `search_songs` (~1h), `search_wiki` (~3h), `search_geo` (~20m) or `indexing` (~2h) benchmark:
```bash
cargo bench --bench <dataset name>
```
By default, the benchmarks will be downloaded and uncompressed automatically in the target directory.<br>
If you don't want to download the datasets every time you update something on the code, you can specify a custom directory with the environment variable `MILLI_BENCH_DATASETS_PATH`:
```bash
mkdir ~/datasets
MILLI_BENCH_DATASETS_PATH=~/datasets cargo bench --bench search_songs # the four datasets are downloaded
touch build.rs
MILLI_BENCH_DATASETS_PATH=~/datasets cargo bench --bench songs # the code is compiled again but the datasets are not downloaded
```
## Comparison between benchmarks
The benchmark reports we push are generated with `critcmp`. Thus, we use `critcmp` to show the result of a benchmark, or compare results between multiple benchmarks.
We provide a script to download and display the comparison report.
Requirements:
- `grep`
- `curl`
- [`critcmp`](https://github.com/BurntSushi/critcmp)
List the available file in the DO Space:
```bash
./benchmarks/script/list.sh
```
```bash
songs_main_09a4321.json
songs_geosearch_24ec456.json
search_songs_main_cb45a10b.json
```
Run the comparison script:
```bash
# we get the result of ONE benchmark, this give you an idea of how much time an operation took
./benchmarks/scripts/compare.sh son songs_geosearch_24ec456.json
# we compare two benchmarks
./benchmarks/scripts/compare.sh songs_main_09a4321.json songs_geosearch_24ec456.json
# we compare three benchmarks
./benchmarks/scripts/compare.sh songs_main_09a4321.json songs_geosearch_24ec456.json search_songs_main_cb45a10b.json
```
## Datasets
The benchmarks uses the following datasets:
- `smol-songs`
- `smol-wiki`
- `movies`
- `smol-all-countries`
### Songs
`smol-songs` is a subset of the [`songs.csv` dataset](https://milli-benchmarks.fra1.digitaloceanspaces.com/datasets/songs.csv.gz).
It was generated with this command:
```bash
xsv sample --seed 42 1000000 songs.csv -o smol-songs.csv
```
_[Download the generated `smol-songs` dataset](https://milli-benchmarks.fra1.digitaloceanspaces.com/datasets/smol-songs.csv.gz)._
### Wiki
`smol-wiki` is a subset of the [`wikipedia-articles.csv` dataset](https://milli-benchmarks.fra1.digitaloceanspaces.com/datasets/wiki-articles.csv.gz).
It was generated with the following command:
```bash
xsv sample --seed 42 500000 wiki-articles.csv -o smol-wiki-articles.csv
```
_[Download the `smol-wiki` dataset](https://milli-benchmarks.fra1.digitaloceanspaces.com/datasets/smol-wiki-articles.csv.gz)._
### Movies
`movies` is a really small dataset we uses as our example in the [getting started](https://www.meilisearch.com/docs/learn/getting_started/quick_start)
_[Download the `movies` dataset](https://www.meilisearch.com/movies.json)._
### All Countries
`smol-all-countries` is a subset of the [`all-countries.csv` dataset](https://milli-benchmarks.fra1.digitaloceanspaces.com/datasets/all-countries.csv.gz)
It has been converted to jsonlines and then edited so it matches our format for the `_geo` field.
It was generated with the following command:
```bash
bat all-countries.csv.gz | gunzip | xsv sample --seed 42 1000000 | csv2json-lite | sd '"latitude":"(.*?)","longitude":"(.*?)"' '"_geo": { "lat": $1, "lng": $2 }' | sd '\[|\]|,$' '' | gzip > smol-all-countries.jsonl.gz
```
_[Download the `smol-all-countries` dataset](https://milli-benchmarks.fra1.digitaloceanspaces.com/datasets/smol-all-countries.jsonl.gz)._

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,122 @@
mod datasets_paths;
mod utils;
use criterion::{criterion_group, criterion_main};
use milli::update::Settings;
use utils::Conf;
#[global_allocator]
static ALLOC: mimalloc::MiMalloc = mimalloc::MiMalloc;
fn base_conf(builder: &mut Settings) {
let displayed_fields =
["geonameid", "name", "asciiname", "alternatenames", "_geo", "population"]
.iter()
.map(|s| s.to_string())
.collect();
builder.set_displayed_fields(displayed_fields);
let searchable_fields =
["name", "alternatenames", "elevation"].iter().map(|s| s.to_string()).collect();
builder.set_searchable_fields(searchable_fields);
let filterable_fields =
["_geo", "population", "elevation"].iter().map(|s| s.to_string()).collect();
builder.set_filterable_fields(filterable_fields);
let sortable_fields =
["_geo", "population", "elevation"].iter().map(|s| s.to_string()).collect();
builder.set_sortable_fields(sortable_fields);
}
#[rustfmt::skip]
const BASE_CONF: Conf = Conf {
dataset: datasets_paths::SMOL_ALL_COUNTRIES,
dataset_format: "jsonl",
queries: &[
"",
],
configure: base_conf,
primary_key: Some("geonameid"),
..Conf::BASE
};
fn bench_geo(c: &mut criterion::Criterion) {
#[rustfmt::skip]
let confs = &[
// A basic placeholder with no geo
utils::Conf {
group_name: "placeholder with no geo",
..BASE_CONF
},
// Medium aglomeration: probably the most common usecase
utils::Conf {
group_name: "asc sort from Lille",
sort: Some(vec!["_geoPoint(50.62999333378238, 3.086269263384099):asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "desc sort from Lille",
sort: Some(vec!["_geoPoint(50.62999333378238, 3.086269263384099):desc"]),
..BASE_CONF
},
// Big agglomeration: a lot of documents close to our point
utils::Conf {
group_name: "asc sort from Tokyo",
sort: Some(vec!["_geoPoint(35.749512532692144, 139.61664952543356):asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "desc sort from Tokyo",
sort: Some(vec!["_geoPoint(35.749512532692144, 139.61664952543356):desc"]),
..BASE_CONF
},
// The furthest point from any civilization
utils::Conf {
group_name: "asc sort from Point Nemo",
sort: Some(vec!["_geoPoint(-48.87561645055408, -123.39275749319793):asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "desc sort from Point Nemo",
sort: Some(vec!["_geoPoint(-48.87561645055408, -123.39275749319793):desc"]),
..BASE_CONF
},
// Filters
utils::Conf {
group_name: "filter of 100km from Lille",
filter: Some("_geoRadius(50.62999333378238, 3.086269263384099, 100000)"),
..BASE_CONF
},
utils::Conf {
group_name: "filter of 1km from Lille",
filter: Some("_geoRadius(50.62999333378238, 3.086269263384099, 1000)"),
..BASE_CONF
},
utils::Conf {
group_name: "filter of 100km from Tokyo",
filter: Some("_geoRadius(35.749512532692144, 139.61664952543356, 100000)"),
..BASE_CONF
},
utils::Conf {
group_name: "filter of 1km from Tokyo",
filter: Some("_geoRadius(35.749512532692144, 139.61664952543356, 1000)"),
..BASE_CONF
},
utils::Conf {
group_name: "filter of 100km from Point Nemo",
filter: Some("_geoRadius(-48.87561645055408, -123.39275749319793, 100000)"),
..BASE_CONF
},
utils::Conf {
group_name: "filter of 1km from Point Nemo",
filter: Some("_geoRadius(-48.87561645055408, -123.39275749319793, 1000)"),
..BASE_CONF
},
];
utils::run_benches(c, confs);
}
criterion_group!(benches, bench_geo);
criterion_main!(benches);

View file

@ -0,0 +1,196 @@
mod datasets_paths;
mod utils;
use criterion::{criterion_group, criterion_main};
use milli::update::Settings;
use utils::Conf;
#[global_allocator]
static ALLOC: mimalloc::MiMalloc = mimalloc::MiMalloc;
fn base_conf(builder: &mut Settings) {
let displayed_fields =
["id", "title", "album", "artist", "genre", "country", "released", "duration"]
.iter()
.map(|s| s.to_string())
.collect();
builder.set_displayed_fields(displayed_fields);
let searchable_fields = ["title", "album", "artist"].iter().map(|s| s.to_string()).collect();
builder.set_searchable_fields(searchable_fields);
let faceted_fields = ["released-timestamp", "duration-float", "genre", "country", "artist"]
.iter()
.map(|s| s.to_string())
.collect();
builder.set_filterable_fields(faceted_fields);
}
#[rustfmt::skip]
const BASE_CONF: Conf = Conf {
dataset: datasets_paths::SMOL_SONGS,
queries: &[
"john ", // 9097
"david ", // 4794
"charles ", // 1957
"david bowie ", // 1200
"michael jackson ", // 600
"thelonious monk ", // 303
"charles mingus ", // 142
"marcus miller ", // 60
"tamo ", // 13
"Notstandskomitee ", // 4
],
configure: base_conf,
primary_key: Some("id"),
..Conf::BASE
};
fn bench_songs(c: &mut criterion::Criterion) {
let default_criterion: Vec<String> =
milli::default_criteria().iter().map(|criteria| criteria.to_string()).collect();
let default_criterion = default_criterion.iter().map(|s| s.as_str());
let asc_default: Vec<&str> =
std::iter::once("released-timestamp:asc").chain(default_criterion.clone()).collect();
let desc_default: Vec<&str> =
std::iter::once("released-timestamp:desc").chain(default_criterion.clone()).collect();
let basic_with_quote: Vec<String> = BASE_CONF
.queries
.iter()
.map(|s| {
s.trim().split(' ').map(|s| format!(r#""{}""#, s)).collect::<Vec<String>>().join(" ")
})
.collect();
let basic_with_quote: &[&str] =
&basic_with_quote.iter().map(|s| s.as_str()).collect::<Vec<&str>>();
#[rustfmt::skip]
let confs = &[
/* first we bench each criterion alone */
utils::Conf {
group_name: "proximity",
queries: &[
"black saint sinner lady ",
"les dangeureuses 1960 ",
"The Disneyland Sing-Along Chorus ",
"Under Great Northern Lights ",
"7000 Danses Un Jour Dans Notre Vie ",
],
criterion: Some(&["proximity"]),
optional_words: false,
..BASE_CONF
},
utils::Conf {
group_name: "typo",
queries: &[
"mongus ",
"thelonius monk ",
"Disnaylande ",
"the white striper ",
"indochie ",
"indochien ",
"klub des loopers ",
"fear of the duck ",
"michel depech ",
"stromal ",
"dire straights ",
"Arethla Franklin ",
],
criterion: Some(&["typo"]),
optional_words: false,
..BASE_CONF
},
utils::Conf {
group_name: "words",
queries: &[
"the black saint and the sinner lady and the good doggo ", // four words to pop
"les liaisons dangeureuses 1793 ", // one word to pop
"The Disneyland Children's Sing-Alone song ", // two words to pop
"seven nation mummy ", // one word to pop
"7000 Danses / Le Baiser / je me trompe de mots ", // four words to pop
"Bring Your Daughter To The Slaughter but now this is not part of the title ", // nine words to pop
"whathavenotnsuchforth and a good amount of words to pop to match the first one ", // 13
],
criterion: Some(&["words"]),
..BASE_CONF
},
utils::Conf {
group_name: "asc",
criterion: Some(&["released-timestamp:desc"]),
..BASE_CONF
},
utils::Conf {
group_name: "desc",
criterion: Some(&["released-timestamp:desc"]),
..BASE_CONF
},
/* then we bench the asc and desc criterion on top of the default criterion */
utils::Conf {
group_name: "asc + default",
criterion: Some(&asc_default[..]),
..BASE_CONF
},
utils::Conf {
group_name: "desc + default",
criterion: Some(&desc_default[..]),
..BASE_CONF
},
/* we bench the filters with the default request */
utils::Conf {
group_name: "basic filter: <=",
filter: Some("released-timestamp <= 946728000"), // year 2000
..BASE_CONF
},
utils::Conf {
group_name: "basic filter: TO",
filter: Some("released-timestamp 946728000 TO 1262347200"), // year 2000 to 2010
..BASE_CONF
},
utils::Conf {
group_name: "big filter",
filter: Some("released-timestamp != 1262347200 AND (NOT (released-timestamp = 946728000)) AND (duration-float = 1 OR (duration-float 1.1 TO 1.5 AND released-timestamp > 315576000))"),
..BASE_CONF
},
/* the we bench some global / normal search with all the default criterion in the default
* order */
utils::Conf {
group_name: "basic placeholder",
queries: &[""],
..BASE_CONF
},
utils::Conf {
group_name: "basic without quote",
queries: &BASE_CONF
.queries
.iter()
.map(|s| s.trim()) // we remove the space at the end of each request
.collect::<Vec<&str>>(),
..BASE_CONF
},
utils::Conf {
group_name: "basic with quote",
queries: basic_with_quote,
..BASE_CONF
},
utils::Conf {
group_name: "prefix search",
queries: &[
"s", // 500k+ results
"a", //
"b", //
"i", //
"x", // only 7k results
],
..BASE_CONF
},
];
utils::run_benches(c, confs);
}
criterion_group!(benches, bench_songs);
criterion_main!(benches);

View file

@ -0,0 +1,129 @@
mod datasets_paths;
mod utils;
use criterion::{criterion_group, criterion_main};
use milli::update::Settings;
use utils::Conf;
#[global_allocator]
static ALLOC: mimalloc::MiMalloc = mimalloc::MiMalloc;
fn base_conf(builder: &mut Settings) {
let displayed_fields = ["title", "body", "url"].iter().map(|s| s.to_string()).collect();
builder.set_displayed_fields(displayed_fields);
let searchable_fields = ["title", "body"].iter().map(|s| s.to_string()).collect();
builder.set_searchable_fields(searchable_fields);
}
#[rustfmt::skip]
const BASE_CONF: Conf = Conf {
dataset: datasets_paths::SMOL_WIKI_ARTICLES,
queries: &[
"mingus ", // 46 candidates
"miles davis ", // 159
"rock and roll ", // 1007
"machine ", // 3448
"spain ", // 7002
"japan ", // 10.593
"france ", // 17.616
"film ", // 24.959
],
configure: base_conf,
..Conf::BASE
};
fn bench_songs(c: &mut criterion::Criterion) {
let basic_with_quote: Vec<String> = BASE_CONF
.queries
.iter()
.map(|s| {
s.trim().split(' ').map(|s| format!(r#""{}""#, s)).collect::<Vec<String>>().join(" ")
})
.collect();
let basic_with_quote: &[&str] =
&basic_with_quote.iter().map(|s| s.as_str()).collect::<Vec<&str>>();
#[rustfmt::skip]
let confs = &[
/* first we bench each criterion alone */
utils::Conf {
group_name: "proximity",
queries: &[
"herald sings ",
"april paris ",
"tea two ",
"diesel engine ",
],
criterion: Some(&["proximity"]),
optional_words: false,
..BASE_CONF
},
utils::Conf {
group_name: "typo",
queries: &[
"migrosoft ",
"linax ",
"Disnaylande ",
"phytogropher ",
"nympalidea ",
"aritmetric ",
"the fronce ",
"sisan ",
],
criterion: Some(&["typo"]),
optional_words: false,
..BASE_CONF
},
utils::Conf {
group_name: "words",
queries: &[
"the black saint and the sinner lady and the good doggo ", // four words to pop, 27 results
"Kameya Tokujirō mingus monk ", // two words to pop, 55
"Ulrich Hensel meilisearch milli ", // two words to pop, 306
"Idaho Bellevue pizza ", // one word to pop, 800
"Abraham machin ", // one word to pop, 1141
],
criterion: Some(&["words"]),
..BASE_CONF
},
/* the we bench some global / normal search with all the default criterion in the default
* order */
utils::Conf {
group_name: "basic placeholder",
queries: &[""],
..BASE_CONF
},
utils::Conf {
group_name: "basic without quote",
queries: &BASE_CONF
.queries
.iter()
.map(|s| s.trim()) // we remove the space at the end of each request
.collect::<Vec<&str>>(),
..BASE_CONF
},
utils::Conf {
group_name: "basic with quote",
queries: basic_with_quote,
..BASE_CONF
},
utils::Conf {
group_name: "prefix search",
queries: &[
"t", // 453k results
"c", // 405k
"g", // 318k
"j", // 227k
"q", // 71k
"x", // 17k
],
..BASE_CONF
},
];
utils::run_benches(c, confs);
}
criterion_group!(benches, bench_songs);
criterion_main!(benches);

View file

@ -0,0 +1,256 @@
#![allow(dead_code)]
use std::fs::{create_dir_all, remove_dir_all, File};
use std::io::{self, BufRead, BufReader, Cursor, Read, Seek};
use std::num::ParseFloatError;
use std::path::Path;
use std::str::FromStr;
use criterion::BenchmarkId;
use milli::documents::{DocumentsBatchBuilder, DocumentsBatchReader};
use milli::heed::EnvOpenOptions;
use milli::update::{
IndexDocuments, IndexDocumentsConfig, IndexDocumentsMethod, IndexerConfig, Settings,
};
use milli::{Criterion, Filter, Index, Object, TermsMatchingStrategy};
use serde_json::Value;
pub struct Conf<'a> {
/// where we are going to create our database.mmdb directory
/// each benchmark will first try to delete it and then recreate it
pub database_name: &'a str,
/// the dataset to be used, it must be an uncompressed csv
pub dataset: &'a str,
/// The format of the dataset
pub dataset_format: &'a str,
pub group_name: &'a str,
pub queries: &'a [&'a str],
/// here you can change which criterion are used and in which order.
/// - if you specify something all the base configuration will be thrown out
/// - if you don't specify anything (None) the default configuration will be kept
pub criterion: Option<&'a [&'a str]>,
/// the last chance to configure your database as you want
pub configure: fn(&mut Settings),
pub filter: Option<&'a str>,
pub sort: Option<Vec<&'a str>>,
/// enable or disable the optional words on the query
pub optional_words: bool,
/// primary key, if there is None we'll auto-generate docids for every documents
pub primary_key: Option<&'a str>,
}
impl Conf<'_> {
pub const BASE: Self = Conf {
database_name: "benches.mmdb",
dataset_format: "csv",
dataset: "",
group_name: "",
queries: &[],
criterion: None,
configure: |_| (),
filter: None,
sort: None,
optional_words: true,
primary_key: None,
};
}
pub fn base_setup(conf: &Conf) -> Index {
match remove_dir_all(conf.database_name) {
Ok(_) => (),
Err(e) if e.kind() == std::io::ErrorKind::NotFound => (),
Err(e) => panic!("{}", e),
}
create_dir_all(conf.database_name).unwrap();
let mut options = EnvOpenOptions::new();
options.map_size(100 * 1024 * 1024 * 1024); // 100 GB
options.max_readers(10);
let index = Index::new(options, conf.database_name).unwrap();
let config = IndexerConfig::default();
let mut wtxn = index.write_txn().unwrap();
let mut builder = Settings::new(&mut wtxn, &index, &config);
if let Some(primary_key) = conf.primary_key {
builder.set_primary_key(primary_key.to_string());
}
if let Some(criterion) = conf.criterion {
builder.reset_filterable_fields();
builder.reset_criteria();
builder.reset_stop_words();
let criterion = criterion.iter().map(|s| Criterion::from_str(s).unwrap()).collect();
builder.set_criteria(criterion);
}
(conf.configure)(&mut builder);
builder.execute(|_| (), || false).unwrap();
wtxn.commit().unwrap();
let config = IndexerConfig::default();
let mut wtxn = index.write_txn().unwrap();
let indexing_config = IndexDocumentsConfig {
autogenerate_docids: conf.primary_key.is_none(),
update_method: IndexDocumentsMethod::ReplaceDocuments,
..Default::default()
};
let builder =
IndexDocuments::new(&mut wtxn, &index, &config, indexing_config, |_| (), || false).unwrap();
let documents = documents_from(conf.dataset, conf.dataset_format);
let (builder, user_error) = builder.add_documents(documents).unwrap();
user_error.unwrap();
builder.execute().unwrap();
wtxn.commit().unwrap();
index
}
pub fn run_benches(c: &mut criterion::Criterion, confs: &[Conf]) {
for conf in confs {
let index = base_setup(conf);
let file_name = Path::new(conf.dataset).file_name().and_then(|f| f.to_str()).unwrap();
let name = format!("{}: {}", file_name, conf.group_name);
let mut group = c.benchmark_group(&name);
for &query in conf.queries {
group.bench_with_input(BenchmarkId::from_parameter(query), &query, |b, &query| {
b.iter(|| {
let rtxn = index.read_txn().unwrap();
let mut search = index.search(&rtxn);
search.query(query).terms_matching_strategy(TermsMatchingStrategy::default());
if let Some(filter) = conf.filter {
let filter = Filter::from_str(filter).unwrap().unwrap();
search.filter(filter);
}
if let Some(sort) = &conf.sort {
let sort = sort.iter().map(|sort| sort.parse().unwrap()).collect();
search.sort_criteria(sort);
}
let _ids = search.execute().unwrap();
});
});
}
group.finish();
index.prepare_for_closing().wait();
}
}
pub fn documents_from(filename: &str, filetype: &str) -> DocumentsBatchReader<impl BufRead + Seek> {
let reader = File::open(filename)
.unwrap_or_else(|_| panic!("could not find the dataset in: {}", filename));
let reader = BufReader::new(reader);
let documents = match filetype {
"csv" => documents_from_csv(reader).unwrap(),
"json" => documents_from_json(reader).unwrap(),
"jsonl" => documents_from_jsonl(reader).unwrap(),
otherwise => panic!("invalid update format {:?}", otherwise),
};
DocumentsBatchReader::from_reader(Cursor::new(documents)).unwrap()
}
fn documents_from_jsonl(reader: impl BufRead) -> anyhow::Result<Vec<u8>> {
let mut documents = DocumentsBatchBuilder::new(Vec::new());
for result in serde_json::Deserializer::from_reader(reader).into_iter::<Object>() {
let object = result?;
documents.append_json_object(&object)?;
}
documents.into_inner().map_err(Into::into)
}
fn documents_from_json(reader: impl BufRead) -> anyhow::Result<Vec<u8>> {
let mut documents = DocumentsBatchBuilder::new(Vec::new());
documents.append_json_array(reader)?;
documents.into_inner().map_err(Into::into)
}
fn documents_from_csv(reader: impl BufRead) -> anyhow::Result<Vec<u8>> {
let csv = csv::Reader::from_reader(reader);
let mut documents = DocumentsBatchBuilder::new(Vec::new());
documents.append_csv(csv)?;
documents.into_inner().map_err(Into::into)
}
enum AllowedType {
String,
Number,
}
fn parse_csv_header(header: &str) -> (String, AllowedType) {
// if there are several separators we only split on the last one.
match header.rsplit_once(':') {
Some((field_name, field_type)) => match field_type {
"string" => (field_name.to_string(), AllowedType::String),
"number" => (field_name.to_string(), AllowedType::Number),
// we may return an error in this case.
_otherwise => (header.to_string(), AllowedType::String),
},
None => (header.to_string(), AllowedType::String),
}
}
struct CSVDocumentDeserializer<R>
where
R: Read,
{
documents: csv::StringRecordsIntoIter<R>,
headers: Vec<(String, AllowedType)>,
}
impl<R: Read> CSVDocumentDeserializer<R> {
fn from_reader(reader: R) -> io::Result<Self> {
let mut records = csv::Reader::from_reader(reader);
let headers = records.headers()?.into_iter().map(parse_csv_header).collect();
Ok(Self { documents: records.into_records(), headers })
}
}
impl<R: Read> Iterator for CSVDocumentDeserializer<R> {
type Item = anyhow::Result<Object>;
fn next(&mut self) -> Option<Self::Item> {
let csv_document = self.documents.next()?;
match csv_document {
Ok(csv_document) => {
let mut document = Object::new();
for ((field_name, field_type), value) in
self.headers.iter().zip(csv_document.into_iter())
{
let parsed_value: Result<Value, ParseFloatError> = match field_type {
AllowedType::Number => {
value.parse::<f64>().map(Value::from).map_err(Into::into)
}
AllowedType::String => Ok(Value::String(value.to_string())),
};
match parsed_value {
Ok(value) => drop(document.insert(field_name.to_string(), value)),
Err(_e) => {
return Some(Err(anyhow::anyhow!(
"Value '{}' is not a valid number",
value
)))
}
}
}
Some(Ok(document))
}
Err(e) => Some(Err(anyhow::anyhow!("Error parsing csv document: {}", e))),
}
}
}

115
crates/benchmarks/build.rs Normal file
View file

@ -0,0 +1,115 @@
use std::fs::File;
use std::io::{Cursor, Read, Seek, Write};
use std::path::{Path, PathBuf};
use std::{env, fs};
use bytes::Bytes;
use convert_case::{Case, Casing};
use flate2::read::GzDecoder;
use reqwest::IntoUrl;
const BASE_URL: &str = "https://milli-benchmarks.fra1.digitaloceanspaces.com/datasets";
const DATASET_SONGS: (&str, &str) = ("smol-songs", "csv");
const DATASET_SONGS_1_2: (&str, &str) = ("smol-songs-1_2", "csv");
const DATASET_SONGS_3_4: (&str, &str) = ("smol-songs-3_4", "csv");
const DATASET_SONGS_4_4: (&str, &str) = ("smol-songs-4_4", "csv");
const DATASET_WIKI: (&str, &str) = ("smol-wiki-articles", "csv");
const DATASET_WIKI_1_2: (&str, &str) = ("smol-wiki-articles-1_2", "csv");
const DATASET_WIKI_3_4: (&str, &str) = ("smol-wiki-articles-3_4", "csv");
const DATASET_WIKI_4_4: (&str, &str) = ("smol-wiki-articles-4_4", "csv");
const DATASET_MOVIES: (&str, &str) = ("movies", "json");
const DATASET_MOVIES_1_2: (&str, &str) = ("movies-1_2", "json");
const DATASET_MOVIES_3_4: (&str, &str) = ("movies-3_4", "json");
const DATASET_MOVIES_4_4: (&str, &str) = ("movies-4_4", "json");
const DATASET_NESTED_MOVIES: (&str, &str) = ("nested_movies", "json");
const DATASET_GEO: (&str, &str) = ("smol-all-countries", "jsonl");
const ALL_DATASETS: &[(&str, &str)] = &[
DATASET_SONGS,
DATASET_SONGS_1_2,
DATASET_SONGS_3_4,
DATASET_SONGS_4_4,
DATASET_WIKI,
DATASET_WIKI_1_2,
DATASET_WIKI_3_4,
DATASET_WIKI_4_4,
DATASET_MOVIES,
DATASET_MOVIES_1_2,
DATASET_MOVIES_3_4,
DATASET_MOVIES_4_4,
DATASET_NESTED_MOVIES,
DATASET_GEO,
];
/// The name of the environment variable used to select the path
/// of the directory containing the datasets
const BASE_DATASETS_PATH_KEY: &str = "MILLI_BENCH_DATASETS_PATH";
fn main() -> anyhow::Result<()> {
let out_dir = PathBuf::from(env::var(BASE_DATASETS_PATH_KEY).unwrap_or(env::var("OUT_DIR")?));
let benches_dir = PathBuf::from(env::var("CARGO_MANIFEST_DIR")?).join("benches");
let mut manifest_paths_file = File::create(benches_dir.join("datasets_paths.rs"))?;
write!(
manifest_paths_file,
r#"//! This file is generated by the build script.
//! Do not modify by hand, use the build.rs file.
#![allow(dead_code)]
"#
)?;
writeln!(manifest_paths_file)?;
for (dataset, extension) in ALL_DATASETS {
let out_path = out_dir.join(dataset);
let out_file = out_path.with_extension(extension);
writeln!(
&mut manifest_paths_file,
r#"pub const {}: &str = {:?};"#,
dataset.to_case(Case::ScreamingSnake),
out_file.display(),
)?;
if out_file.exists() {
eprintln!(
"The dataset {} already exists on the file system and will not be downloaded again",
out_path.display(),
);
continue;
}
let url = format!("{}/{}.{}.gz", BASE_URL, dataset, extension);
eprintln!("downloading: {}", url);
let bytes = retry(|| download_dataset(url.clone()), 10)?;
eprintln!("{} downloaded successfully", url);
eprintln!("uncompressing in {}", out_file.display());
uncompress_in_file(bytes, &out_file)?;
}
Ok(())
}
fn retry<Ok, Err>(fun: impl Fn() -> Result<Ok, Err>, times: usize) -> Result<Ok, Err> {
for _ in 0..times {
if let ok @ Ok(_) = fun() {
return ok;
}
}
fun()
}
fn download_dataset<U: IntoUrl>(url: U) -> anyhow::Result<Cursor<Bytes>> {
let bytes =
reqwest::blocking::Client::builder().timeout(None).build()?.get(url).send()?.bytes()?;
Ok(Cursor::new(bytes))
}
fn uncompress_in_file<R: Read + Seek, P: AsRef<Path>>(bytes: R, path: P) -> anyhow::Result<()> {
let path = path.as_ref();
let mut gz = GzDecoder::new(bytes);
let mut dataset = Vec::new();
gz.read_to_end(&mut dataset)?;
fs::write(path, dataset)?;
Ok(())
}

View file

@ -0,0 +1,38 @@
#!/usr/bin/env bash
# Requirements:
# - critcmp. See: https://github.com/BurntSushi/critcmp
# - curl
# Usage
# $ bash compare.sh json_file1 json_file1
# ex: bash compare.sh songs_main_09a4321.json songs_geosearch_24ec456.json
# Checking that critcmp is installed
command -v critcmp > /dev/null 2>&1
if [[ "$?" -ne 0 ]]; then
echo 'You must install critcmp to make this script work.'
echo 'See: https://github.com/BurntSushi/critcmp'
echo ' $ cargo install critcmp'
exit 1
fi
s3_url='https://milli-benchmarks.fra1.digitaloceanspaces.com/critcmp_results'
for file in $@
do
file_s3_url="$s3_url/$file"
file_local_path="/tmp/$file"
if [[ ! -f $file_local_path ]]; then
curl $file_s3_url --output $file_local_path --silent
if [[ "$?" -ne 0 ]]; then
echo 'curl command failed.'
exit 1
fi
fi
done
path_list=$(echo " $@" | sed 's/ / \/tmp\//g')
critcmp $path_list

View file

@ -0,0 +1,14 @@
#!/usr/bin/env bash
# Requirements:
# - curl
# - grep
res=$(curl -s https://milli-benchmarks.fra1.digitaloceanspaces.com | grep -o '<Key>[^<]\+' | cut -c 5- | grep critcmp_results/ | cut -c 18-)
for pattern in "$@"
do
res=$(echo "$res" | grep $pattern)
done
echo "$res"

View file

@ -0,0 +1,5 @@
//! This library is only used to isolate the benchmarks
//! from the original milli library.
//!
//! It does not include interesting functions for milli library
//! users only for milli contributors.

View file

@ -0,0 +1,18 @@
[package]
name = "build-info"
version.workspace = true
authors.workspace = true
description.workspace = true
homepage.workspace = true
readme.workspace = true
edition.workspace = true
license.workspace = true
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
time = { version = "0.3.36", features = ["parsing"] }
[build-dependencies]
anyhow = "1.0.86"
vergen-git2 = "1.0.0"

View file

@ -0,0 +1,29 @@
fn main() {
if let Err(err) = emit_git_variables() {
println!("cargo:warning=vergen: {}", err);
}
}
fn emit_git_variables() -> anyhow::Result<()> {
println!("cargo::rerun-if-env-changed=MEILI_NO_VERGEN");
let has_vergen =
!matches!(std::env::var_os("MEILI_NO_VERGEN"), Some(x) if x != "false" && x != "0");
anyhow::ensure!(has_vergen, "disabled via `MEILI_NO_VERGEN`");
// Note: any code that needs VERGEN_ environment variables should take care to define them manually in the Dockerfile and pass them
// in the corresponding GitHub workflow (publish_docker.yml).
// This is due to the Dockerfile building the binary outside of the git directory.
let mut builder = vergen_git2::Git2Builder::default();
builder.branch(true);
builder.commit_timestamp(true);
builder.commit_message(true);
builder.describe(true, true, None);
builder.sha(false);
let git2 = builder.build()?;
vergen_git2::Emitter::default().fail_on_error().add_instructions(&git2)?.emit()
}

View file

@ -0,0 +1,203 @@
use time::format_description::well_known::Iso8601;
#[derive(Debug, Clone)]
pub struct BuildInfo {
pub branch: Option<&'static str>,
pub describe: Option<DescribeResult>,
pub commit_sha1: Option<&'static str>,
pub commit_msg: Option<&'static str>,
pub commit_timestamp: Option<time::OffsetDateTime>,
}
impl BuildInfo {
pub fn from_build() -> Self {
let branch: Option<&'static str> = option_env!("VERGEN_GIT_BRANCH");
let describe = DescribeResult::from_build();
let commit_sha1 = option_env!("VERGEN_GIT_SHA");
let commit_msg = option_env!("VERGEN_GIT_COMMIT_MESSAGE");
let commit_timestamp = option_env!("VERGEN_GIT_COMMIT_TIMESTAMP");
let commit_timestamp = commit_timestamp.and_then(|commit_timestamp| {
time::OffsetDateTime::parse(commit_timestamp, &Iso8601::DEFAULT).ok()
});
Self { branch, describe, commit_sha1, commit_msg, commit_timestamp }
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum DescribeResult {
Prototype { name: &'static str },
Release { version: &'static str, major: u64, minor: u64, patch: u64 },
Prerelease { version: &'static str, major: u64, minor: u64, patch: u64, rc: u64 },
NotATag { describe: &'static str },
}
impl DescribeResult {
pub fn new(describe: &'static str) -> Self {
if let Some(name) = prototype_name(describe) {
Self::Prototype { name }
} else if let Some(release) = release_version(describe) {
release
} else if let Some(prerelease) = prerelease_version(describe) {
prerelease
} else {
Self::NotATag { describe }
}
}
pub fn from_build() -> Option<Self> {
let describe: &'static str = option_env!("VERGEN_GIT_DESCRIBE")?;
Some(Self::new(describe))
}
pub fn as_tag(&self) -> Option<&'static str> {
match self {
DescribeResult::Prototype { name } => Some(name),
DescribeResult::Release { version, .. } => Some(version),
DescribeResult::Prerelease { version, .. } => Some(version),
DescribeResult::NotATag { describe: _ } => None,
}
}
pub fn as_prototype(&self) -> Option<&'static str> {
match self {
DescribeResult::Prototype { name } => Some(name),
DescribeResult::Release { .. }
| DescribeResult::Prerelease { .. }
| DescribeResult::NotATag { .. } => None,
}
}
}
/// Parses the input as a prototype name.
///
/// Returns `Some(prototype_name)` if the following conditions are met on this value:
///
/// 1. starts with `prototype-`,
/// 2. ends with `-<some_number>`,
/// 3. does not end with `<some_number>-<some_number>`.
///
/// Otherwise, returns `None`.
fn prototype_name(describe: &'static str) -> Option<&'static str> {
if !describe.starts_with("prototype-") {
return None;
}
let mut rsplit_prototype = describe.rsplit('-');
// last component MUST be a number
rsplit_prototype.next()?.parse::<u64>().ok()?;
// before than last component SHALL NOT be a number
rsplit_prototype.next()?.parse::<u64>().err()?;
Some(describe)
}
fn release_version(describe: &'static str) -> Option<DescribeResult> {
if !describe.starts_with('v') {
return None;
}
// full release version don't contain a `-`
if describe.contains('-') {
return None;
}
// full release version parse as vX.Y.Z, with X, Y, Z numbers.
let mut dots = describe[1..].split('.');
let major: u64 = dots.next()?.parse().ok()?;
let minor: u64 = dots.next()?.parse().ok()?;
let patch: u64 = dots.next()?.parse().ok()?;
if dots.next().is_some() {
return None;
}
Some(DescribeResult::Release { version: describe, major, minor, patch })
}
fn prerelease_version(describe: &'static str) -> Option<DescribeResult> {
// prerelease version is in the shape vM.N.P-rc.C
let mut hyphen = describe.rsplit('-');
let prerelease = hyphen.next()?;
if !prerelease.starts_with("rc.") {
return None;
}
let rc: u64 = prerelease[3..].parse().ok()?;
let release = hyphen.next()?;
let DescribeResult::Release { version: _, major, minor, patch } = release_version(release)?
else {
return None;
};
Some(DescribeResult::Prerelease { version: describe, major, minor, patch, rc })
}
#[cfg(test)]
mod test {
use super::DescribeResult;
fn assert_not_a_tag(describe: &'static str) {
assert_eq!(DescribeResult::NotATag { describe }, DescribeResult::new(describe))
}
fn assert_proto(describe: &'static str) {
assert_eq!(DescribeResult::Prototype { name: describe }, DescribeResult::new(describe))
}
fn assert_release(describe: &'static str, major: u64, minor: u64, patch: u64) {
assert_eq!(
DescribeResult::Release { version: describe, major, minor, patch },
DescribeResult::new(describe)
)
}
fn assert_prerelease(describe: &'static str, major: u64, minor: u64, patch: u64, rc: u64) {
assert_eq!(
DescribeResult::Prerelease { version: describe, major, minor, patch, rc },
DescribeResult::new(describe)
)
}
#[test]
fn not_a_tag() {
assert_not_a_tag("whatever-fuzzy");
assert_not_a_tag("whatever-fuzzy-5-ggg-dirty");
assert_not_a_tag("whatever-fuzzy-120-ggg-dirty");
// technically a tag, but not a proto nor a version, so not parsed as a tag
assert_not_a_tag("whatever");
// dirty version
assert_not_a_tag("v1.7.0-1-ggga-dirty");
assert_not_a_tag("v1.7.0-rc.1-1-ggga-dirty");
// after version
assert_not_a_tag("v1.7.0-1-ggga");
assert_not_a_tag("v1.7.0-rc.1-1-ggga");
// after proto
assert_not_a_tag("protoype-tag-0-1-ggga");
assert_not_a_tag("protoype-tag-0-1-ggga-dirty");
}
#[test]
fn prototype() {
assert_proto("prototype-tag-0");
assert_proto("prototype-tag-10");
assert_proto("prototype-long-name-tag-10");
}
#[test]
fn release() {
assert_release("v1.7.2", 1, 7, 2);
}
#[test]
fn prerelease() {
assert_prerelease("v1.7.2-rc.3", 1, 7, 2, 3);
}
}

34
crates/dump/Cargo.toml Normal file
View file

@ -0,0 +1,34 @@
[package]
name = "dump"
publish = false
version.workspace = true
authors.workspace = true
description.workspace = true
edition.workspace = true
homepage.workspace = true
readme.workspace = true
license.workspace = true
[dependencies]
anyhow = "1.0.86"
flate2 = "1.0.30"
http = "1.1.0"
meilisearch-types = { path = "../meilisearch-types" }
once_cell = "1.19.0"
regex = "1.10.5"
roaring = { version = "0.10.6", features = ["serde"] }
serde = { version = "1.0.204", features = ["derive"] }
serde_json = { version = "1.0.120", features = ["preserve_order"] }
tar = "0.4.41"
tempfile = "3.10.1"
thiserror = "1.0.61"
time = { version = "0.3.36", features = ["serde-well-known", "formatting", "parsing", "macros"] }
tracing = "0.1.40"
uuid = { version = "1.10.0", features = ["serde", "v4"] }
[dev-dependencies]
big_s = "1.0.2"
maplit = "1.0.2"
meili-snap = { path = "../meili-snap" }
meilisearch-types = { path = "../meilisearch-types" }

17
crates/dump/README.md Normal file
View file

@ -0,0 +1,17 @@
```
dump
├── indexes
│ ├── cattos
│ │ ├── documents.jsonl
│ │ └── settings.json
│ └── doggos
│ ├── documents.jsonl
│ └── settings.json
├── instance-uid.uuid
├── keys.jsonl
├── metadata.json
└── tasks
├── update_files
│ └── [task_id].jsonl
└── queue.jsonl
```

34
crates/dump/src/error.rs Normal file
View file

@ -0,0 +1,34 @@
use meilisearch_types::error::{Code, ErrorCode};
use thiserror::Error;
#[derive(Debug, Error)]
pub enum Error {
#[error("Bad index name.")]
BadIndexName,
#[error("Malformed task.")]
MalformedTask,
#[error(transparent)]
Io(#[from] std::io::Error),
#[error(transparent)]
Serde(#[from] serde_json::Error),
#[error(transparent)]
Uuid(#[from] uuid::Error),
}
impl ErrorCode for Error {
fn error_code(&self) -> Code {
match self {
Error::Io(e) => e.error_code(),
// These errors either happen when creating a dump and don't need any error code,
// or come from an internal bad deserialization.
Error::Serde(_) => Code::Internal,
Error::Uuid(_) => Code::Internal,
// all these errors should never be raised when creating a dump, thus no error code should be associated.
Error::BadIndexName => Code::Internal,
Error::MalformedTask => Code::Internal,
}
}
}

503
crates/dump/src/lib.rs Normal file
View file

@ -0,0 +1,503 @@
#![allow(clippy::type_complexity)]
#![allow(clippy::wrong_self_convention)]
use meilisearch_types::error::ResponseError;
use meilisearch_types::keys::Key;
use meilisearch_types::milli::update::IndexDocumentsMethod;
use meilisearch_types::settings::Unchecked;
use meilisearch_types::tasks::{Details, IndexSwap, KindWithContent, Status, Task, TaskId};
use meilisearch_types::InstanceUid;
use roaring::RoaringBitmap;
use serde::{Deserialize, Serialize};
use time::OffsetDateTime;
mod error;
mod reader;
mod writer;
pub use error::Error;
pub use reader::{DumpReader, UpdateFile};
pub use writer::DumpWriter;
const CURRENT_DUMP_VERSION: Version = Version::V6;
type Result<T> = std::result::Result<T, Error>;
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct Metadata {
pub dump_version: Version,
pub db_version: String,
#[serde(with = "time::serde::rfc3339")]
pub dump_date: OffsetDateTime,
}
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct IndexMetadata {
pub uid: String,
pub primary_key: Option<String>,
#[serde(with = "time::serde::rfc3339")]
pub created_at: OffsetDateTime,
#[serde(with = "time::serde::rfc3339")]
pub updated_at: OffsetDateTime,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Deserialize, Serialize)]
pub enum Version {
V1,
V2,
V3,
V4,
V5,
V6,
}
#[derive(Debug, PartialEq, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct TaskDump {
pub uid: TaskId,
#[serde(default)]
pub index_uid: Option<String>,
pub status: Status,
#[serde(rename = "type")]
pub kind: KindDump,
#[serde(skip_serializing_if = "Option::is_none")]
pub canceled_by: Option<TaskId>,
#[serde(skip_serializing_if = "Option::is_none")]
pub details: Option<Details>,
#[serde(skip_serializing_if = "Option::is_none")]
pub error: Option<ResponseError>,
#[serde(with = "time::serde::rfc3339")]
pub enqueued_at: OffsetDateTime,
#[serde(
with = "time::serde::rfc3339::option",
skip_serializing_if = "Option::is_none",
default
)]
pub started_at: Option<OffsetDateTime>,
#[serde(
with = "time::serde::rfc3339::option",
skip_serializing_if = "Option::is_none",
default
)]
pub finished_at: Option<OffsetDateTime>,
}
// A `Kind` specific version made for the dump. If modified you may break the dump.
#[derive(Debug, PartialEq, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub enum KindDump {
DocumentImport {
primary_key: Option<String>,
method: IndexDocumentsMethod,
documents_count: u64,
allow_index_creation: bool,
},
DocumentDeletion {
documents_ids: Vec<String>,
},
DocumentClear,
DocumentDeletionByFilter {
filter: serde_json::Value,
},
DocumentEdition {
filter: Option<serde_json::Value>,
context: Option<serde_json::Map<String, serde_json::Value>>,
function: String,
},
Settings {
settings: Box<meilisearch_types::settings::Settings<Unchecked>>,
is_deletion: bool,
allow_index_creation: bool,
},
IndexDeletion,
IndexCreation {
primary_key: Option<String>,
},
IndexUpdate {
primary_key: Option<String>,
},
IndexSwap {
swaps: Vec<IndexSwap>,
},
TaskCancelation {
query: String,
tasks: RoaringBitmap,
},
TasksDeletion {
query: String,
tasks: RoaringBitmap,
},
DumpCreation {
keys: Vec<Key>,
instance_uid: Option<InstanceUid>,
},
SnapshotCreation,
}
impl From<Task> for TaskDump {
fn from(task: Task) -> Self {
TaskDump {
uid: task.uid,
index_uid: task.index_uid().map(|uid| uid.to_string()),
status: task.status,
kind: task.kind.into(),
canceled_by: task.canceled_by,
details: task.details,
error: task.error,
enqueued_at: task.enqueued_at,
started_at: task.started_at,
finished_at: task.finished_at,
}
}
}
impl From<KindWithContent> for KindDump {
fn from(kind: KindWithContent) -> Self {
match kind {
KindWithContent::DocumentAdditionOrUpdate {
primary_key,
method,
documents_count,
allow_index_creation,
..
} => KindDump::DocumentImport {
primary_key,
method,
documents_count,
allow_index_creation,
},
KindWithContent::DocumentDeletion { documents_ids, .. } => {
KindDump::DocumentDeletion { documents_ids }
}
KindWithContent::DocumentDeletionByFilter { filter_expr, .. } => {
KindDump::DocumentDeletionByFilter { filter: filter_expr }
}
KindWithContent::DocumentEdition { filter_expr, context, function, .. } => {
KindDump::DocumentEdition { filter: filter_expr, context, function }
}
KindWithContent::DocumentClear { .. } => KindDump::DocumentClear,
KindWithContent::SettingsUpdate {
new_settings,
is_deletion,
allow_index_creation,
..
} => KindDump::Settings { settings: new_settings, is_deletion, allow_index_creation },
KindWithContent::IndexDeletion { .. } => KindDump::IndexDeletion,
KindWithContent::IndexCreation { primary_key, .. } => {
KindDump::IndexCreation { primary_key }
}
KindWithContent::IndexUpdate { primary_key, .. } => {
KindDump::IndexUpdate { primary_key }
}
KindWithContent::IndexSwap { swaps } => KindDump::IndexSwap { swaps },
KindWithContent::TaskCancelation { query, tasks } => {
KindDump::TaskCancelation { query, tasks }
}
KindWithContent::TaskDeletion { query, tasks } => {
KindDump::TasksDeletion { query, tasks }
}
KindWithContent::DumpCreation { keys, instance_uid } => {
KindDump::DumpCreation { keys, instance_uid }
}
KindWithContent::SnapshotCreation => KindDump::SnapshotCreation,
}
}
}
#[cfg(test)]
pub(crate) mod test {
use std::fs::File;
use std::io::Seek;
use std::str::FromStr;
use big_s::S;
use maplit::{btreemap, btreeset};
use meilisearch_types::facet_values_sort::FacetValuesSort;
use meilisearch_types::features::RuntimeTogglableFeatures;
use meilisearch_types::index_uid_pattern::IndexUidPattern;
use meilisearch_types::keys::{Action, Key};
use meilisearch_types::milli;
use meilisearch_types::milli::update::Setting;
use meilisearch_types::settings::{Checked, FacetingSettings, Settings};
use meilisearch_types::tasks::{Details, Status};
use serde_json::{json, Map, Value};
use time::macros::datetime;
use uuid::Uuid;
use crate::reader::Document;
use crate::{DumpReader, DumpWriter, IndexMetadata, KindDump, TaskDump, Version};
pub fn create_test_instance_uid() -> Uuid {
Uuid::parse_str("9e15e977-f2ae-4761-943f-1eaf75fd736d").unwrap()
}
pub fn create_test_index_metadata() -> IndexMetadata {
IndexMetadata {
uid: S("doggo"),
primary_key: None,
created_at: datetime!(2022-11-20 12:00 UTC),
updated_at: datetime!(2022-11-21 00:00 UTC),
}
}
pub fn create_test_documents() -> Vec<Map<String, Value>> {
vec![
json!({ "id": 1, "race": "golden retriever", "name": "paul", "age": 4 })
.as_object()
.unwrap()
.clone(),
json!({ "id": 2, "race": "bernese mountain", "name": "tamo", "age": 6 })
.as_object()
.unwrap()
.clone(),
json!({ "id": 3, "race": "great pyrenees", "name": "patou", "age": 5 })
.as_object()
.unwrap()
.clone(),
]
}
pub fn create_test_settings() -> Settings<Checked> {
let settings = Settings {
displayed_attributes: Setting::Set(vec![S("race"), S("name")]).into(),
searchable_attributes: Setting::Set(vec![S("name"), S("race")]).into(),
filterable_attributes: Setting::Set(btreeset! { S("race"), S("age") }),
sortable_attributes: Setting::Set(btreeset! { S("age") }),
ranking_rules: Setting::NotSet,
stop_words: Setting::NotSet,
non_separator_tokens: Setting::NotSet,
separator_tokens: Setting::NotSet,
dictionary: Setting::NotSet,
synonyms: Setting::NotSet,
distinct_attribute: Setting::NotSet,
proximity_precision: Setting::NotSet,
typo_tolerance: Setting::NotSet,
faceting: Setting::Set(FacetingSettings {
max_values_per_facet: Setting::Set(111),
sort_facet_values_by: Setting::Set(
btreemap! { S("age") => FacetValuesSort::Count },
),
}),
pagination: Setting::NotSet,
embedders: Setting::NotSet,
search_cutoff_ms: Setting::NotSet,
localized_attributes: Setting::NotSet,
_kind: std::marker::PhantomData,
};
settings.check()
}
pub fn create_test_tasks() -> Vec<(TaskDump, Option<Vec<Document>>)> {
vec![
(
TaskDump {
uid: 0,
index_uid: Some(S("doggo")),
status: Status::Succeeded,
kind: KindDump::DocumentImport {
method: milli::update::IndexDocumentsMethod::UpdateDocuments,
allow_index_creation: true,
primary_key: Some(S("bone")),
documents_count: 12,
},
canceled_by: None,
details: Some(Details::DocumentAdditionOrUpdate {
received_documents: 12,
indexed_documents: Some(10),
}),
error: None,
enqueued_at: datetime!(2022-11-11 0:00 UTC),
started_at: Some(datetime!(2022-11-20 0:00 UTC)),
finished_at: Some(datetime!(2022-11-21 0:00 UTC)),
},
None,
),
(
TaskDump {
uid: 1,
index_uid: Some(S("doggo")),
status: Status::Enqueued,
kind: KindDump::DocumentImport {
method: milli::update::IndexDocumentsMethod::UpdateDocuments,
allow_index_creation: true,
primary_key: None,
documents_count: 2,
},
canceled_by: None,
details: Some(Details::DocumentAdditionOrUpdate {
received_documents: 2,
indexed_documents: None,
}),
error: None,
enqueued_at: datetime!(2022-11-11 0:00 UTC),
started_at: None,
finished_at: None,
},
Some(vec![
json!({ "id": 4, "race": "leonberg" }).as_object().unwrap().clone(),
json!({ "id": 5, "race": "patou" }).as_object().unwrap().clone(),
]),
),
(
TaskDump {
uid: 5,
index_uid: Some(S("catto")),
status: Status::Enqueued,
kind: KindDump::IndexDeletion,
canceled_by: None,
details: None,
error: None,
enqueued_at: datetime!(2022-11-15 0:00 UTC),
started_at: None,
finished_at: None,
},
None,
),
]
}
pub fn create_test_api_keys() -> Vec<Key> {
vec![
Key {
description: Some(S("The main key to manage all the doggos")),
name: Some(S("doggos_key")),
uid: Uuid::from_str("9f8a34da-b6b2-42f0-939b-dbd4c3448655").unwrap(),
actions: vec![Action::DocumentsAll],
indexes: vec![IndexUidPattern::from_str("doggos").unwrap()],
expires_at: Some(datetime!(4130-03-14 12:21 UTC)),
created_at: datetime!(1960-11-15 0:00 UTC),
updated_at: datetime!(2022-11-10 0:00 UTC),
},
Key {
description: Some(S("The master key for everything and even the doggos")),
name: Some(S("master_key")),
uid: Uuid::from_str("4622f717-1c00-47bb-a494-39d76a49b591").unwrap(),
actions: vec![Action::All],
indexes: vec![IndexUidPattern::all()],
expires_at: None,
created_at: datetime!(0000-01-01 00:01 UTC),
updated_at: datetime!(1964-05-04 17:25 UTC),
},
Key {
description: Some(S("The useless key to for nothing nor the doggos")),
name: Some(S("useless_key")),
uid: Uuid::from_str("fb80b58b-0a34-412f-8ba7-1ce868f8ac5c").unwrap(),
actions: vec![],
indexes: vec![],
expires_at: None,
created_at: datetime!(400-02-29 0:00 UTC),
updated_at: datetime!(1024-02-29 0:00 UTC),
},
]
}
pub fn create_test_dump() -> File {
let instance_uid = create_test_instance_uid();
let dump = DumpWriter::new(Some(instance_uid)).unwrap();
// ========== Adding an index
let documents = create_test_documents();
let settings = create_test_settings();
let mut index = dump.create_index("doggos", &create_test_index_metadata()).unwrap();
for document in &documents {
index.push_document(document).unwrap();
}
index.flush().unwrap();
index.settings(&settings).unwrap();
// ========== pushing the task queue
let tasks = create_test_tasks();
let mut task_queue = dump.create_tasks_queue().unwrap();
for (task, update_file) in &tasks {
let mut update = task_queue.push_task(task).unwrap();
if let Some(update_file) = update_file {
for u in update_file {
update.push_document(u).unwrap();
}
}
}
task_queue.flush().unwrap();
// ========== pushing the api keys
let api_keys = create_test_api_keys();
let mut keys = dump.create_keys().unwrap();
for key in &api_keys {
keys.push_key(key).unwrap();
}
keys.flush().unwrap();
// ========== experimental features
let features = create_test_features();
dump.create_experimental_features(features).unwrap();
// create the dump
let mut file = tempfile::tempfile().unwrap();
dump.persist_to(&mut file).unwrap();
file.rewind().unwrap();
file
}
fn create_test_features() -> RuntimeTogglableFeatures {
RuntimeTogglableFeatures { vector_store: true, ..Default::default() }
}
#[test]
fn test_creating_and_read_dump() {
let mut file = create_test_dump();
let mut dump = DumpReader::open(&mut file).unwrap();
// ==== checking the top level infos
assert_eq!(dump.version(), Version::V6);
assert!(dump.date().is_some());
assert_eq!(dump.instance_uid().unwrap().unwrap(), create_test_instance_uid());
// ==== checking the index
let mut indexes = dump.indexes().unwrap();
let mut index = indexes.next().unwrap().unwrap();
assert!(indexes.next().is_none()); // there was only one index in the dump
for (document, expected) in index.documents().unwrap().zip(create_test_documents()) {
assert_eq!(document.unwrap(), expected);
}
assert_eq!(index.settings().unwrap(), create_test_settings());
assert_eq!(index.metadata(), &create_test_index_metadata());
drop(index);
drop(indexes);
// ==== checking the task queue
for (task, expected) in dump.tasks().unwrap().zip(create_test_tasks()) {
let (task, content_file) = task.unwrap();
assert_eq!(task, expected.0);
if let Some(expected_update) = expected.1 {
assert!(
content_file.is_some(),
"A content file was expected for the task {}.",
expected.0.uid
);
let updates = content_file.unwrap().collect::<Result<Vec<_>, _>>().unwrap();
assert_eq!(updates, expected_update);
}
}
// ==== checking the keys
for (key, expected) in dump.keys().unwrap().zip(create_test_api_keys()) {
assert_eq!(key.unwrap(), expected);
}
// ==== checking the features
let expected = create_test_features();
assert_eq!(dump.features().unwrap().unwrap(), expected);
}
}

View file

@ -0,0 +1,5 @@
pub mod v1_to_v2;
pub mod v2_to_v3;
pub mod v3_to_v4;
pub mod v4_to_v5;
pub mod v5_to_v6;

View file

@ -0,0 +1,38 @@
---
source: dump/src/reader/compat/v1_to_v2.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View file

@ -0,0 +1,31 @@
---
source: dump/src/reader/compat/v1_to_v2.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"genres",
"id"
],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,24 @@
---
source: dump/src/reader/compat/v1_to_v2.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,23 @@
---
source: dump/src/reader/compat/v2_to_v3.rs
expression: movies2.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,23 @@
---
source: dump/src/reader/compat/v2_to_v3.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,37 @@
---
source: dump/src/reader/compat/v2_to_v3.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View file

@ -0,0 +1,24 @@
---
source: dump/src/reader/compat/v2_to_v3.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,25 @@
---
source: dump/src/reader/compat/v3_to_v4.rs
expression: movies2.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,25 @@
---
source: dump/src/reader/compat/v3_to_v4.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,39 @@
---
source: dump/src/reader/compat/v3_to_v4.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View file

@ -0,0 +1,31 @@
---
source: dump/src/reader/compat/v3_to_v4.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"release_date"
],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,56 @@
---
source: dump/src/reader/compat/v4_to_v5.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": "Reset",
"searchableAttributes": "Reset",
"filterableAttributes": {
"Set": []
},
"sortableAttributes": {
"Set": []
},
"rankingRules": {
"Set": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
]
},
"stopWords": {
"Set": []
},
"synonyms": {
"Set": {}
},
"distinctAttribute": "Reset",
"typoTolerance": {
"Set": {
"enabled": {
"Set": true
},
"minWordSizeForTypos": {
"Set": {
"oneTypo": {
"Set": 5
},
"twoTypos": {
"Set": 9
}
}
},
"disableOnWords": {
"Set": []
},
"disableOnAttributes": {
"Set": []
}
}
},
"faceting": "NotSet",
"pagination": "NotSet"
}

View file

@ -0,0 +1,70 @@
---
source: dump/src/reader/compat/v4_to_v5.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": "Reset",
"searchableAttributes": "Reset",
"filterableAttributes": {
"Set": []
},
"sortableAttributes": {
"Set": []
},
"rankingRules": {
"Set": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
]
},
"stopWords": {
"Set": []
},
"synonyms": {
"Set": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
}
},
"distinctAttribute": "Reset",
"typoTolerance": {
"Set": {
"enabled": {
"Set": true
},
"minWordSizeForTypos": {
"Set": {
"oneTypo": {
"Set": 5
},
"twoTypos": {
"Set": 9
}
}
},
"disableOnWords": {
"Set": []
},
"disableOnAttributes": {
"Set": []
}
}
},
"faceting": "NotSet",
"pagination": "NotSet"
}

View file

@ -0,0 +1,62 @@
---
source: dump/src/reader/compat/v4_to_v5.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": "Reset",
"searchableAttributes": "Reset",
"filterableAttributes": {
"Set": [
"genres",
"id"
]
},
"sortableAttributes": {
"Set": [
"release_date"
]
},
"rankingRules": {
"Set": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness",
"release_date:asc"
]
},
"stopWords": {
"Set": []
},
"synonyms": {
"Set": {}
},
"distinctAttribute": "Reset",
"typoTolerance": {
"Set": {
"enabled": {
"Set": true
},
"minWordSizeForTypos": {
"Set": {
"oneTypo": {
"Set": 5
},
"twoTypos": {
"Set": 9
}
}
},
"disableOnWords": {
"Set": []
},
"disableOnAttributes": {
"Set": []
}
}
},
"faceting": "NotSet",
"pagination": "NotSet"
}

View file

@ -0,0 +1,40 @@
---
source: dump/src/reader/compat/v5_to_v6.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null,
"typoTolerance": {
"enabled": true,
"minWordSizeForTypos": {
"oneTypo": 5,
"twoTypos": 9
},
"disableOnWords": [],
"disableOnAttributes": []
},
"faceting": {
"maxValuesPerFacet": 100
},
"pagination": {
"maxTotalHits": 1000
}
}

View file

@ -0,0 +1,54 @@
---
source: dump/src/reader/compat/v5_to_v6.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null,
"typoTolerance": {
"enabled": true,
"minWordSizeForTypos": {
"oneTypo": 5,
"twoTypos": 9
},
"disableOnWords": [],
"disableOnAttributes": []
},
"faceting": {
"maxValuesPerFacet": 100
},
"pagination": {
"maxTotalHits": 1000
}
}

View file

@ -0,0 +1,46 @@
---
source: dump/src/reader/compat/v5_to_v6.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"release_date"
],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null,
"typoTolerance": {
"enabled": true,
"minWordSizeForTypos": {
"oneTypo": 5,
"twoTypos": 9
},
"disableOnWords": [],
"disableOnAttributes": []
},
"faceting": {
"maxValuesPerFacet": 100
},
"pagination": {
"maxTotalHits": 1000
}
}

View file

@ -0,0 +1,410 @@
use std::str::FromStr;
use super::v2_to_v3::CompatV2ToV3;
use crate::reader::{v1, v2, Document};
use crate::Result;
pub struct CompatV1ToV2 {
pub from: v1::V1Reader,
}
impl CompatV1ToV2 {
pub fn new(v1: v1::V1Reader) -> Self {
Self { from: v1 }
}
pub fn to_v3(self) -> CompatV2ToV3 {
CompatV2ToV3::Compat(self)
}
pub fn version(&self) -> crate::Version {
self.from.version()
}
pub fn date(&self) -> Option<time::OffsetDateTime> {
self.from.date()
}
pub fn index_uuid(&self) -> Vec<v2::meta::IndexUuid> {
self.from
.index_uuid()
.into_iter()
.enumerate()
// we use the index of the index 😬 as UUID for the index, so that we can link the v2::Task to their index
.map(|(index, index_uuid)| v2::meta::IndexUuid {
uid: index_uuid.uid,
uuid: uuid::Uuid::from_u128(index as u128),
})
.collect()
}
pub fn indexes(&self) -> Result<impl Iterator<Item = Result<CompatIndexV1ToV2>> + '_> {
Ok(self.from.indexes()?.map(|index_reader| Ok(CompatIndexV1ToV2 { from: index_reader? })))
}
pub fn tasks(
&mut self,
) -> Box<dyn Iterator<Item = Result<(v2::Task, Option<v2::UpdateFile>)>> + '_> {
// Convert an error here to an iterator yielding the error
let indexes = match self.from.indexes() {
Ok(indexes) => indexes,
Err(err) => return Box::new(std::iter::once(Err(err))),
};
let it = indexes.enumerate().flat_map(
move |(index, index_reader)| -> Box<dyn Iterator<Item = _>> {
let index_reader = match index_reader {
Ok(index_reader) => index_reader,
Err(err) => return Box::new(std::iter::once(Err(err))),
};
Box::new(
index_reader
.tasks()
// Filter out the UpdateStatus::Customs variant that is not supported in v2
// and enqueued tasks, that don't contain the necessary update file in v1
.filter_map(move |task| -> Option<_> {
let task = match task {
Ok(task) => task,
Err(err) => return Some(Err(err)),
};
Some(Ok((
v2::Task {
uuid: uuid::Uuid::from_u128(index as u128),
update: Option::from(task)?,
},
None,
)))
}),
)
},
);
Box::new(it)
}
}
pub struct CompatIndexV1ToV2 {
pub from: v1::V1IndexReader,
}
impl CompatIndexV1ToV2 {
pub fn metadata(&self) -> &crate::IndexMetadata {
self.from.metadata()
}
pub fn documents(&mut self) -> Result<Box<dyn Iterator<Item = Result<Document>> + '_>> {
self.from.documents().map(|it| Box::new(it) as Box<dyn Iterator<Item = _>>)
}
pub fn settings(&mut self) -> Result<v2::settings::Settings<v2::settings::Checked>> {
Ok(v2::settings::Settings::<v2::settings::Unchecked>::from(self.from.settings()?).check())
}
}
impl From<v1::settings::Settings> for v2::Settings<v2::Unchecked> {
fn from(source: v1::settings::Settings) -> Self {
Self {
displayed_attributes: option_to_setting(source.displayed_attributes)
.map(|displayed| displayed.into_iter().collect()),
searchable_attributes: option_to_setting(source.searchable_attributes),
filterable_attributes: option_to_setting(source.attributes_for_faceting.clone())
.map(|filterable| filterable.into_iter().collect()),
sortable_attributes: option_to_setting(source.attributes_for_faceting)
.map(|sortable| sortable.into_iter().collect()),
ranking_rules: option_to_setting(source.ranking_rules).map(|ranking_rules| {
ranking_rules
.into_iter()
.filter_map(|ranking_rule| {
match v1::settings::RankingRule::from_str(&ranking_rule) {
Ok(ranking_rule) => {
let criterion: Option<v2::settings::Criterion> =
ranking_rule.into();
criterion.as_ref().map(ToString::to_string)
}
Err(()) => {
tracing::warn!(
"Could not import the following ranking rule: `{}`.",
ranking_rule
);
None
}
}
})
.collect()
}),
stop_words: option_to_setting(source.stop_words),
synonyms: option_to_setting(source.synonyms),
distinct_attribute: option_to_setting(source.distinct_attribute),
_kind: std::marker::PhantomData,
}
}
}
fn option_to_setting<T>(opt: Option<Option<T>>) -> v2::Setting<T> {
match opt {
Some(Some(t)) => v2::Setting::Set(t),
None => v2::Setting::NotSet,
Some(None) => v2::Setting::Reset,
}
}
impl From<v1::update::UpdateStatus> for Option<v2::updates::UpdateStatus> {
fn from(source: v1::update::UpdateStatus) -> Self {
use v1::update::UpdateStatus as UpdateStatusV1;
use v2::updates::UpdateStatus as UpdateStatusV2;
Some(match source {
UpdateStatusV1::Enqueued { content } => {
tracing::warn!(
"Cannot import task {} (importing enqueued tasks from v1 dumps is unsupported)",
content.update_id
);
tracing::warn!("Task will be skipped in the queue of imported tasks.");
return None;
}
UpdateStatusV1::Failed { content } => UpdateStatusV2::Failed(v2::updates::Failed {
from: v2::updates::Processing {
from: v2::updates::Enqueued {
update_id: content.update_id,
meta: Option::from(content.update_type)?,
enqueued_at: content.enqueued_at,
content: None,
},
started_processing_at: content.processed_at
- std::time::Duration::from_secs_f64(content.duration),
},
error: v2::ResponseError {
// error code is ignored by serialization, and so always default in deserialized v2 dumps
// that's a good thing, because we don't have them in v1 dump 😅
code: http::StatusCode::default(),
message: content.error.unwrap_or_default(),
// error codes are unchanged between v1 and v2
error_code: content.error_code.unwrap_or_default(),
// error types are unchanged between v1 and v2
error_type: content.error_type.unwrap_or_default(),
// error links are unchanged between v1 and v2
error_link: content.error_link.unwrap_or_default(),
},
failed_at: content.processed_at,
}),
UpdateStatusV1::Processed { content } => {
UpdateStatusV2::Processed(v2::updates::Processed {
success: match &content.update_type {
v1::update::UpdateType::ClearAll => {
v2::updates::UpdateResult::DocumentDeletion { deleted: u64::MAX }
}
v1::update::UpdateType::Customs => v2::updates::UpdateResult::Other,
v1::update::UpdateType::DocumentsAddition { number } => {
v2::updates::UpdateResult::DocumentsAddition(
v2::updates::DocumentAdditionResult { nb_documents: *number },
)
}
v1::update::UpdateType::DocumentsPartial { number } => {
v2::updates::UpdateResult::DocumentsAddition(
v2::updates::DocumentAdditionResult { nb_documents: *number },
)
}
v1::update::UpdateType::DocumentsDeletion { number } => {
v2::updates::UpdateResult::DocumentDeletion { deleted: *number as u64 }
}
v1::update::UpdateType::Settings { .. } => v2::updates::UpdateResult::Other,
},
processed_at: content.processed_at,
from: v2::updates::Processing {
from: v2::updates::Enqueued {
update_id: content.update_id,
meta: Option::from(content.update_type)?,
enqueued_at: content.enqueued_at,
content: None,
},
started_processing_at: content.processed_at
- std::time::Duration::from_secs_f64(content.duration),
},
})
}
})
}
}
impl From<v1::update::UpdateType> for Option<v2::updates::UpdateMeta> {
fn from(source: v1::update::UpdateType) -> Self {
Some(match source {
v1::update::UpdateType::ClearAll => v2::updates::UpdateMeta::ClearDocuments,
v1::update::UpdateType::Customs => {
tracing::warn!("Ignoring task with type 'Customs' that is no longer supported");
return None;
}
v1::update::UpdateType::DocumentsAddition { .. } => {
v2::updates::UpdateMeta::DocumentsAddition {
method: v2::updates::IndexDocumentsMethod::ReplaceDocuments,
format: v2::updates::UpdateFormat::Json,
primary_key: None,
}
}
v1::update::UpdateType::DocumentsPartial { .. } => {
v2::updates::UpdateMeta::DocumentsAddition {
method: v2::updates::IndexDocumentsMethod::UpdateDocuments,
format: v2::updates::UpdateFormat::Json,
primary_key: None,
}
}
v1::update::UpdateType::DocumentsDeletion { .. } => {
v2::updates::UpdateMeta::DeleteDocuments { ids: vec![] }
}
v1::update::UpdateType::Settings { settings } => {
v2::updates::UpdateMeta::Settings((*settings).into())
}
})
}
}
impl From<v1::settings::SettingsUpdate> for v2::Settings<v2::Unchecked> {
fn from(source: v1::settings::SettingsUpdate) -> Self {
let ranking_rules = v2::Setting::from(source.ranking_rules);
// go from the concrete types of v1 (RankingRule) to the concrete type of v2 (Criterion),
// and then back to string as this is what the settings manipulate
let ranking_rules = ranking_rules.map(|ranking_rules| {
ranking_rules
.into_iter()
// filter out the WordsPosition ranking rule that exists in v1 but not v2
.filter_map(Option::<v2::settings::Criterion>::from)
.map(|criterion| criterion.to_string())
.collect()
});
Self {
displayed_attributes: v2::Setting::from(source.displayed_attributes)
.map(|displayed_attributes| displayed_attributes.into_iter().collect()),
searchable_attributes: source.searchable_attributes.into(),
filterable_attributes: v2::Setting::from(source.attributes_for_faceting.clone())
.map(|attributes_for_faceting| attributes_for_faceting.into_iter().collect()),
sortable_attributes: v2::Setting::from(source.attributes_for_faceting)
.map(|attributes_for_faceting| attributes_for_faceting.into_iter().collect()),
ranking_rules,
stop_words: source.stop_words.into(),
synonyms: source.synonyms.into(),
distinct_attribute: source.distinct_attribute.into(),
_kind: std::marker::PhantomData,
}
}
}
impl From<v1::settings::RankingRule> for Option<v2::settings::Criterion> {
fn from(source: v1::settings::RankingRule) -> Self {
match source {
v1::settings::RankingRule::Typo => Some(v2::settings::Criterion::Typo),
v1::settings::RankingRule::Words => Some(v2::settings::Criterion::Words),
v1::settings::RankingRule::Proximity => Some(v2::settings::Criterion::Proximity),
v1::settings::RankingRule::Attribute => Some(v2::settings::Criterion::Attribute),
v1::settings::RankingRule::WordsPosition => {
tracing::warn!("Removing the 'WordsPosition' ranking rule that is no longer supported, please check the resulting ranking rules of your indexes");
None
}
v1::settings::RankingRule::Exactness => Some(v2::settings::Criterion::Exactness),
v1::settings::RankingRule::Asc(field_name) => {
Some(v2::settings::Criterion::Asc(field_name))
}
v1::settings::RankingRule::Desc(field_name) => {
Some(v2::settings::Criterion::Desc(field_name))
}
}
}
}
impl<T> From<v1::settings::UpdateState<T>> for v2::Setting<T> {
fn from(source: v1::settings::UpdateState<T>) -> Self {
match source {
v1::settings::UpdateState::Update(new_value) => v2::Setting::Set(new_value),
v1::settings::UpdateState::Clear => v2::Setting::Reset,
v1::settings::UpdateState::Nothing => v2::Setting::NotSet,
}
}
}
#[cfg(test)]
pub(crate) mod test {
use std::fs::File;
use std::io::BufReader;
use flate2::bufread::GzDecoder;
use meili_snap::insta;
use tempfile::TempDir;
use super::*;
#[test]
fn compat_v1_v2() {
let dump = File::open("tests/assets/v1.dump").unwrap();
let dir = TempDir::new().unwrap();
let mut dump = BufReader::new(dump);
let gz = GzDecoder::new(&mut dump);
let mut archive = tar::Archive::new(gz);
archive.unpack(dir.path()).unwrap();
let mut dump = v1::V1Reader::open(dir).unwrap().to_v2();
// top level infos
assert_eq!(dump.date(), None);
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"2298010973ee98cf4670787314176a3a");
assert_eq!(update_files.len(), 9);
assert!(update_files[..].iter().all(|u| u.is_none())); // no update file in dumps v1
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "2022-10-02T13:23:39.976870431Z",
"updatedAt": "2022-10-02T13:27:54.353262482Z"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"b01c8371aea4c7171af0d4d846a2bdca");
// movies
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "2022-10-02T13:15:29.477512777Z",
"updatedAt": "2022-10-02T13:21:12.671204856Z"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"b63dbed5bbc059f3e32bc471ae699bf5");
// spells
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "2022-10-02T13:38:26.358882984Z",
"updatedAt": "2022-10-02T13:38:26.385609433Z"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"aa24c0cfc733d66c396237ad44263bed");
}
}

View file

@ -0,0 +1,512 @@
use std::str::FromStr;
use time::OffsetDateTime;
use uuid::Uuid;
use super::v1_to_v2::{CompatIndexV1ToV2, CompatV1ToV2};
use super::v3_to_v4::CompatV3ToV4;
use crate::reader::{v2, v3, Document};
use crate::Result;
pub enum CompatV2ToV3 {
V2(v2::V2Reader),
Compat(CompatV1ToV2),
}
impl CompatV2ToV3 {
pub fn new(v2: v2::V2Reader) -> CompatV2ToV3 {
CompatV2ToV3::V2(v2)
}
pub fn index_uuid(&self) -> Vec<v3::meta::IndexUuid> {
let v2_uuids = match self {
CompatV2ToV3::V2(from) => from.index_uuid(),
CompatV2ToV3::Compat(compat) => compat.index_uuid(),
};
v2_uuids
.into_iter()
.map(|index| v3::meta::IndexUuid { uid: index.uid, uuid: index.uuid })
.collect()
}
pub fn to_v4(self) -> CompatV3ToV4 {
CompatV3ToV4::Compat(self)
}
pub fn version(&self) -> crate::Version {
match self {
CompatV2ToV3::V2(from) => from.version(),
CompatV2ToV3::Compat(compat) => compat.version(),
}
}
pub fn date(&self) -> Option<time::OffsetDateTime> {
match self {
CompatV2ToV3::V2(from) => from.date(),
CompatV2ToV3::Compat(compat) => compat.date(),
}
}
pub fn instance_uid(&self) -> Result<Option<uuid::Uuid>> {
Ok(None)
}
pub fn indexes(&self) -> Result<impl Iterator<Item = Result<CompatIndexV2ToV3>> + '_> {
Ok(match self {
CompatV2ToV3::V2(from) => Box::new(from.indexes()?.map(|index_reader| -> Result<_> {
let compat = CompatIndexV2ToV3::new(index_reader?);
Ok(compat)
}))
as Box<dyn Iterator<Item = Result<CompatIndexV2ToV3>> + '_>,
CompatV2ToV3::Compat(compat) => Box::new(compat.indexes()?.map(|index_reader| {
let compat = CompatIndexV2ToV3::Compat(Box::new(index_reader?));
Ok(compat)
}))
as Box<dyn Iterator<Item = Result<CompatIndexV2ToV3>> + '_>,
})
}
pub fn tasks(
&mut self,
) -> Box<
dyn Iterator<Item = Result<(v3::Task, Option<Box<dyn Iterator<Item = Result<Document>>>>)>>
+ '_,
> {
let tasks = match self {
CompatV2ToV3::V2(from) => from.tasks(),
CompatV2ToV3::Compat(compat) => compat.tasks(),
};
Box::new(
tasks
.map(move |task| {
task.map(|(task, content_file)| {
let task = v3::Task { uuid: task.uuid, update: task.update.into() };
Some((
task,
content_file.map(|content_file| {
Box::new(content_file) as Box<dyn Iterator<Item = Result<Document>>>
}),
))
})
})
.filter_map(|res| res.transpose()),
)
}
}
pub enum CompatIndexV2ToV3 {
V2(v2::V2IndexReader),
Compat(Box<CompatIndexV1ToV2>),
}
impl CompatIndexV2ToV3 {
pub fn new(v2: v2::V2IndexReader) -> CompatIndexV2ToV3 {
CompatIndexV2ToV3::V2(v2)
}
pub fn metadata(&self) -> &crate::IndexMetadata {
match self {
CompatIndexV2ToV3::V2(from) => from.metadata(),
CompatIndexV2ToV3::Compat(compat) => compat.metadata(),
}
}
pub fn documents(&mut self) -> Result<Box<dyn Iterator<Item = Result<Document>> + '_>> {
match self {
CompatIndexV2ToV3::V2(from) => from
.documents()
.map(|iter| Box::new(iter) as Box<dyn Iterator<Item = Result<Document>> + '_>),
CompatIndexV2ToV3::Compat(compat) => compat.documents(),
}
}
pub fn settings(&mut self) -> Result<v3::Settings<v3::Checked>> {
let settings = match self {
CompatIndexV2ToV3::V2(from) => from.settings()?,
CompatIndexV2ToV3::Compat(compat) => compat.settings()?,
};
Ok(v3::Settings::<v3::Unchecked>::from(settings).check())
}
}
impl From<v2::updates::UpdateStatus> for v3::updates::UpdateStatus {
fn from(update: v2::updates::UpdateStatus) -> Self {
match update {
v2::updates::UpdateStatus::Processing(processing) => {
match (processing.from.meta.clone(), processing.from.content).try_into() {
Ok(meta) => v3::updates::UpdateStatus::Processing(v3::updates::Processing {
from: v3::updates::Enqueued {
update_id: processing.from.update_id,
meta,
enqueued_at: processing.from.enqueued_at,
},
started_processing_at: processing.started_processing_at,
}),
Err(e) => {
tracing::warn!("Error with task {}: {}", processing.from.update_id, e);
tracing::warn!("Task will be marked as `Failed`.");
v3::updates::UpdateStatus::Failed(v3::updates::Failed {
from: v3::updates::Processing {
from: v3::updates::Enqueued {
update_id: processing.from.update_id,
meta: update_from_unchecked_update_meta(processing.from.meta),
enqueued_at: processing.from.enqueued_at,
},
started_processing_at: processing.started_processing_at,
},
msg: e.to_string(),
code: v3::Code::MalformedDump,
failed_at: OffsetDateTime::now_utc(),
})
}
}
}
v2::updates::UpdateStatus::Enqueued(enqueued) => {
match (enqueued.meta.clone(), enqueued.content).try_into() {
Ok(meta) => v3::updates::UpdateStatus::Enqueued(v3::updates::Enqueued {
update_id: enqueued.update_id,
meta,
enqueued_at: enqueued.enqueued_at,
}),
Err(e) => {
tracing::warn!("Error with task {}: {}", enqueued.update_id, e);
tracing::warn!("Task will be marked as `Failed`.");
v3::updates::UpdateStatus::Failed(v3::updates::Failed {
from: v3::updates::Processing {
from: v3::updates::Enqueued {
update_id: enqueued.update_id,
meta: update_from_unchecked_update_meta(enqueued.meta),
enqueued_at: enqueued.enqueued_at,
},
started_processing_at: OffsetDateTime::now_utc(),
},
msg: e.to_string(),
code: v3::Code::MalformedDump,
failed_at: OffsetDateTime::now_utc(),
})
}
}
}
v2::updates::UpdateStatus::Processed(processed) => {
v3::updates::UpdateStatus::Processed(v3::updates::Processed {
success: processed.success.into(),
processed_at: processed.processed_at,
from: v3::updates::Processing {
from: v3::updates::Enqueued {
update_id: processed.from.from.update_id,
// since we're never going to read the content_file again it's ok to generate a fake one.
meta: update_from_unchecked_update_meta(processed.from.from.meta),
enqueued_at: processed.from.from.enqueued_at,
},
started_processing_at: processed.from.started_processing_at,
},
})
}
v2::updates::UpdateStatus::Aborted(aborted) => {
v3::updates::UpdateStatus::Aborted(v3::updates::Aborted {
from: v3::updates::Enqueued {
update_id: aborted.from.update_id,
// since we're never going to read the content_file again it's ok to generate a fake one.
meta: update_from_unchecked_update_meta(aborted.from.meta),
enqueued_at: aborted.from.enqueued_at,
},
aborted_at: aborted.aborted_at,
})
}
v2::updates::UpdateStatus::Failed(failed) => {
v3::updates::UpdateStatus::Failed(v3::updates::Failed {
from: v3::updates::Processing {
from: v3::updates::Enqueued {
update_id: failed.from.from.update_id,
// since we're never going to read the content_file again it's ok to generate a fake one.
meta: update_from_unchecked_update_meta(failed.from.from.meta),
enqueued_at: failed.from.from.enqueued_at,
},
started_processing_at: failed.from.started_processing_at,
},
msg: failed.error.message,
code: failed.error.error_code.into(),
failed_at: failed.failed_at,
})
}
}
}
}
impl TryFrom<(v2::updates::UpdateMeta, Option<Uuid>)> for v3::updates::Update {
type Error = crate::Error;
fn try_from((update, uuid): (v2::updates::UpdateMeta, Option<Uuid>)) -> Result<Self> {
Ok(match update {
v2::updates::UpdateMeta::DocumentsAddition { method, format: _, primary_key }
if uuid.is_some() =>
{
v3::updates::Update::DocumentAddition {
primary_key,
method: match method {
v2::updates::IndexDocumentsMethod::ReplaceDocuments => {
v3::updates::IndexDocumentsMethod::ReplaceDocuments
}
v2::updates::IndexDocumentsMethod::UpdateDocuments => {
v3::updates::IndexDocumentsMethod::UpdateDocuments
}
},
content_uuid: uuid.unwrap(),
}
}
v2::updates::UpdateMeta::DocumentsAddition { .. } => {
return Err(crate::Error::MalformedTask)
}
v2::updates::UpdateMeta::ClearDocuments => v3::updates::Update::ClearDocuments,
v2::updates::UpdateMeta::DeleteDocuments { ids } => {
v3::updates::Update::DeleteDocuments(ids)
}
v2::updates::UpdateMeta::Settings(settings) => {
v3::updates::Update::Settings(settings.into())
}
})
}
}
pub fn update_from_unchecked_update_meta(update: v2::updates::UpdateMeta) -> v3::updates::Update {
match update {
v2::updates::UpdateMeta::DocumentsAddition { method, format: _, primary_key } => {
v3::updates::Update::DocumentAddition {
primary_key,
method: match method {
v2::updates::IndexDocumentsMethod::ReplaceDocuments => {
v3::updates::IndexDocumentsMethod::ReplaceDocuments
}
v2::updates::IndexDocumentsMethod::UpdateDocuments => {
v3::updates::IndexDocumentsMethod::UpdateDocuments
}
},
// we use this special uuid so we can recognize it if one day there is a bug related to this field.
content_uuid: Uuid::from_str("00112233-4455-6677-8899-aabbccddeeff").unwrap(),
}
}
v2::updates::UpdateMeta::ClearDocuments => v3::updates::Update::ClearDocuments,
v2::updates::UpdateMeta::DeleteDocuments { ids } => {
v3::updates::Update::DeleteDocuments(ids)
}
v2::updates::UpdateMeta::Settings(settings) => {
v3::updates::Update::Settings(settings.into())
}
}
}
impl From<v2::updates::UpdateResult> for v3::updates::UpdateResult {
fn from(result: v2::updates::UpdateResult) -> Self {
match result {
v2::updates::UpdateResult::DocumentsAddition(addition) => {
v3::updates::UpdateResult::DocumentsAddition(v3::updates::DocumentAdditionResult {
nb_documents: addition.nb_documents,
})
}
v2::updates::UpdateResult::DocumentDeletion { deleted } => {
v3::updates::UpdateResult::DocumentDeletion { deleted }
}
v2::updates::UpdateResult::Other => v3::updates::UpdateResult::Other,
}
}
}
impl From<String> for v3::Code {
fn from(code: String) -> Self {
match code.as_ref() {
"create_index" => v3::Code::CreateIndex,
"index_already_exists" => v3::Code::IndexAlreadyExists,
"index_not_found" => v3::Code::IndexNotFound,
"invalid_index_uid" => v3::Code::InvalidIndexUid,
"invalid_state" => v3::Code::InvalidState,
"missing_primary_key" => v3::Code::MissingPrimaryKey,
"primary_key_already_present" => v3::Code::PrimaryKeyAlreadyPresent,
"max_fields_limit_exceeded" => v3::Code::MaxFieldsLimitExceeded,
"missing_document_id" => v3::Code::MissingDocumentId,
"invalid_document_id" => v3::Code::InvalidDocumentId,
"filter" => v3::Code::Filter,
"sort" => v3::Code::Sort,
"bad_parameter" => v3::Code::BadParameter,
"bad_request" => v3::Code::BadRequest,
"database_size_limit_reached" => v3::Code::DatabaseSizeLimitReached,
"document_not_found" => v3::Code::DocumentNotFound,
"internal" => v3::Code::Internal,
"invalid_geo_field" => v3::Code::InvalidGeoField,
"invalid_ranking_rule" => v3::Code::InvalidRankingRule,
"invalid_store" => v3::Code::InvalidStore,
"invalid_token" => v3::Code::InvalidToken,
"missing_authorization_header" => v3::Code::MissingAuthorizationHeader,
"no_space_left_on_device" => v3::Code::NoSpaceLeftOnDevice,
"dump_not_found" => v3::Code::DumpNotFound,
"task_not_found" => v3::Code::TaskNotFound,
"payload_too_large" => v3::Code::PayloadTooLarge,
"retrieve_document" => v3::Code::RetrieveDocument,
"search_documents" => v3::Code::SearchDocuments,
"unsupported_media_type" => v3::Code::UnsupportedMediaType,
"dump_already_in_progress" => v3::Code::DumpAlreadyInProgress,
"dump_process_failed" => v3::Code::DumpProcessFailed,
"invalid_content_type" => v3::Code::InvalidContentType,
"missing_content_type" => v3::Code::MissingContentType,
"malformed_payload" => v3::Code::MalformedPayload,
"missing_payload" => v3::Code::MissingPayload,
other => {
tracing::warn!("Unknown error code {}", other);
v3::Code::UnretrievableErrorCode
}
}
}
}
impl<A> From<v2::Setting<A>> for v3::Setting<A> {
fn from(setting: v2::Setting<A>) -> Self {
match setting {
v2::settings::Setting::Set(a) => v3::settings::Setting::Set(a),
v2::settings::Setting::Reset => v3::settings::Setting::Reset,
v2::settings::Setting::NotSet => v3::settings::Setting::NotSet,
}
}
}
impl<T> From<v2::Settings<T>> for v3::Settings<v3::Unchecked> {
fn from(settings: v2::Settings<T>) -> Self {
v3::Settings {
displayed_attributes: settings.displayed_attributes.into(),
searchable_attributes: settings.searchable_attributes.into(),
filterable_attributes: settings.filterable_attributes.into(),
sortable_attributes: settings.sortable_attributes.into(),
ranking_rules: v3::Setting::from(settings.ranking_rules).map(|criteria| {
criteria.into_iter().map(|criterion| patch_ranking_rules(&criterion)).collect()
}),
stop_words: settings.stop_words.into(),
synonyms: settings.synonyms.into(),
distinct_attribute: settings.distinct_attribute.into(),
_kind: std::marker::PhantomData,
}
}
}
fn patch_ranking_rules(ranking_rule: &str) -> String {
match v2::settings::Criterion::from_str(ranking_rule) {
Ok(v2::settings::Criterion::Words) => String::from("words"),
Ok(v2::settings::Criterion::Typo) => String::from("typo"),
Ok(v2::settings::Criterion::Proximity) => String::from("proximity"),
Ok(v2::settings::Criterion::Attribute) => String::from("attribute"),
Ok(v2::settings::Criterion::Sort) => String::from("sort"),
Ok(v2::settings::Criterion::Exactness) => String::from("exactness"),
Ok(v2::settings::Criterion::Asc(name)) => format!("{name}:asc"),
Ok(v2::settings::Criterion::Desc(name)) => format!("{name}:desc"),
// we want to forward the error to the current version of meilisearch
Err(_) => ranking_rule.to_string(),
}
}
#[cfg(test)]
pub(crate) mod test {
use std::fs::File;
use std::io::BufReader;
use flate2::bufread::GzDecoder;
use meili_snap::insta;
use tempfile::TempDir;
use super::*;
#[test]
fn compat_v2_v3() {
let dump = File::open("tests/assets/v2.dump").unwrap();
let dir = TempDir::new().unwrap();
let mut dump = BufReader::new(dump);
let gz = GzDecoder::new(&mut dump);
let mut archive = tar::Archive::new(gz);
archive.unpack(dir.path()).unwrap();
let mut dump = v2::V2Reader::open(dir).unwrap().to_v3();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();
let (tasks, mut update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"9507711db47c7171c79bc6d57d0bed79");
assert_eq!(update_files.len(), 9);
assert!(update_files[0].is_some()); // the enqueued document addition
assert!(update_files[1..].iter().all(|u| u.is_none())); // everything already processed
let update_file = update_files.remove(0).unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(update_file), @"7b8889539b669c7b9ddba448bafa385d");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies2 = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 110);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d153b5a81d8b3cdcbe1dec270b574022");
// movies2
insta::assert_json_snapshot!(movies2.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "movies_2",
"primaryKey": null,
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(movies2.settings().unwrap());
let documents = movies2.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 0);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d751713988987e9331980363e24189ce");
// spells
insta::assert_json_snapshot!(spells.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
}

View file

@ -0,0 +1,449 @@
use super::v2_to_v3::{CompatIndexV2ToV3, CompatV2ToV3};
use super::v4_to_v5::CompatV4ToV5;
use crate::reader::{v3, v4, UpdateFile};
use crate::Result;
pub enum CompatV3ToV4 {
V3(v3::V3Reader),
Compat(CompatV2ToV3),
}
impl CompatV3ToV4 {
pub fn new(v3: v3::V3Reader) -> CompatV3ToV4 {
CompatV3ToV4::V3(v3)
}
pub fn to_v5(self) -> CompatV4ToV5 {
CompatV4ToV5::Compat(self)
}
pub fn version(&self) -> crate::Version {
match self {
CompatV3ToV4::V3(v3) => v3.version(),
CompatV3ToV4::Compat(compat) => compat.version(),
}
}
pub fn date(&self) -> Option<time::OffsetDateTime> {
match self {
CompatV3ToV4::V3(v3) => v3.date(),
CompatV3ToV4::Compat(compat) => compat.date(),
}
}
pub fn instance_uid(&self) -> Result<Option<uuid::Uuid>> {
Ok(None)
}
pub fn indexes(&self) -> Result<impl Iterator<Item = Result<CompatIndexV3ToV4>> + '_> {
Ok(match self {
CompatV3ToV4::V3(v3) => {
Box::new(v3.indexes()?.map(|index| index.map(CompatIndexV3ToV4::from)))
as Box<dyn Iterator<Item = Result<CompatIndexV3ToV4>> + '_>
}
CompatV3ToV4::Compat(compat) => {
Box::new(compat.indexes()?.map(|index| index.map(CompatIndexV3ToV4::from)))
as Box<dyn Iterator<Item = Result<CompatIndexV3ToV4>> + '_>
}
})
}
pub fn tasks(
&mut self,
) -> Box<dyn Iterator<Item = Result<(v4::Task, Option<Box<UpdateFile>>)>> + '_> {
let indexes = match self {
CompatV3ToV4::V3(v3) => v3.index_uuid(),
CompatV3ToV4::Compat(compat) => compat.index_uuid(),
};
let tasks = match self {
CompatV3ToV4::V3(v3) => v3.tasks(),
CompatV3ToV4::Compat(compat) => compat.tasks(),
};
Box::new(
tasks
// we need to override the old task ids that were generated
// by index in favor of a global unique incremental ID.
.enumerate()
.map(move |(task_id, task)| {
task.map(|(task, content_file)| {
let index_uid = indexes
.iter()
.find(|index| index.uuid == task.uuid)
.map(|index| index.uid.clone());
let index_uid = match index_uid {
Some(uid) => uid,
None => {
tracing::warn!(
"Error while importing the update {}.",
task.update.id()
);
tracing::warn!(
"The index associated to the uuid `{}` could not be retrieved.",
task.uuid.to_string()
);
if task.update.is_finished() {
// we're fucking with his history but not his data, that's ok-ish.
tracing::warn!("The index-uuid will be set as `unknown`.");
String::from("unknown")
} else {
tracing::warn!("The task will be ignored.");
return None;
}
}
};
let task = v4::Task {
id: task_id as u32,
index_uid: v4::meta::IndexUid(index_uid),
content: match task.update.meta() {
v3::Kind::DeleteDocuments(documents) => {
v4::tasks::TaskContent::DocumentDeletion(
v4::tasks::DocumentDeletion::Ids(documents.clone()),
)
}
v3::Kind::DocumentAddition {
primary_key,
method,
content_uuid,
} => v4::tasks::TaskContent::DocumentAddition {
merge_strategy: match method {
v3::updates::IndexDocumentsMethod::ReplaceDocuments => {
v4::tasks::IndexDocumentsMethod::ReplaceDocuments
}
v3::updates::IndexDocumentsMethod::UpdateDocuments => {
v4::tasks::IndexDocumentsMethod::UpdateDocuments
}
},
primary_key: primary_key.clone(),
documents_count: 0, // we don't have this info
allow_index_creation: true, // there was no API-key in the v3
content_uuid: *content_uuid,
},
v3::Kind::Settings(settings) => {
v4::tasks::TaskContent::SettingsUpdate {
settings: v4::Settings::from(settings.clone()),
is_deletion: false, // that didn't exist at this time
allow_index_creation: true, // there was no API-key in the v3
}
}
v3::Kind::ClearDocuments => {
v4::tasks::TaskContent::DocumentDeletion(
v4::tasks::DocumentDeletion::Clear,
)
}
},
events: match task.update {
v3::Status::Processing(processing) => {
vec![v4::tasks::TaskEvent::Created(processing.from.enqueued_at)]
}
v3::Status::Enqueued(enqueued) => {
vec![v4::tasks::TaskEvent::Created(enqueued.enqueued_at)]
}
v3::Status::Processed(processed) => {
vec![
v4::tasks::TaskEvent::Created(
processed.from.from.enqueued_at,
),
v4::tasks::TaskEvent::Processing(
processed.from.started_processing_at,
),
v4::tasks::TaskEvent::Succeded {
result: match processed.success {
v3::updates::UpdateResult::DocumentsAddition(
document_addition,
) => v4::tasks::TaskResult::DocumentAddition {
indexed_documents: document_addition
.nb_documents
as u64,
},
v3::updates::UpdateResult::DocumentDeletion {
deleted,
} => v4::tasks::TaskResult::DocumentDeletion {
deleted_documents: deleted,
},
v3::updates::UpdateResult::Other => {
v4::tasks::TaskResult::Other
}
},
timestamp: processed.processed_at,
},
]
}
v3::Status::Failed(failed) => vec![
v4::tasks::TaskEvent::Created(failed.from.from.enqueued_at),
v4::tasks::TaskEvent::Processing(
failed.from.started_processing_at,
),
v4::tasks::TaskEvent::Failed {
error: v4::ResponseError::from_msg(
failed.msg.to_string(),
failed.code.into(),
),
timestamp: failed.failed_at,
},
],
v3::Status::Aborted(aborted) => vec![
v4::tasks::TaskEvent::Created(aborted.from.enqueued_at),
v4::tasks::TaskEvent::Failed {
error: v4::ResponseError::from_msg(
"Task was aborted in a previous version of meilisearch."
.to_string(),
v4::errors::Code::UnretrievableErrorCode,
),
timestamp: aborted.aborted_at,
},
],
},
};
Some((task, content_file))
})
})
.filter_map(|res| res.transpose()),
)
}
pub fn keys(&mut self) -> Box<dyn Iterator<Item = Result<v4::Key>> + '_> {
Box::new(std::iter::empty())
}
}
pub enum CompatIndexV3ToV4 {
V3(v3::V3IndexReader),
Compat(CompatIndexV2ToV3),
}
impl From<v3::V3IndexReader> for CompatIndexV3ToV4 {
fn from(index_reader: v3::V3IndexReader) -> Self {
Self::V3(index_reader)
}
}
impl From<CompatIndexV2ToV3> for CompatIndexV3ToV4 {
fn from(index_reader: CompatIndexV2ToV3) -> Self {
Self::Compat(index_reader)
}
}
impl CompatIndexV3ToV4 {
pub fn new(v3: v3::V3IndexReader) -> CompatIndexV3ToV4 {
CompatIndexV3ToV4::V3(v3)
}
pub fn metadata(&self) -> &crate::IndexMetadata {
match self {
CompatIndexV3ToV4::V3(v3) => v3.metadata(),
CompatIndexV3ToV4::Compat(compat) => compat.metadata(),
}
}
pub fn documents(&mut self) -> Result<Box<dyn Iterator<Item = Result<v4::Document>> + '_>> {
match self {
CompatIndexV3ToV4::V3(v3) => v3
.documents()
.map(|iter| Box::new(iter) as Box<dyn Iterator<Item = Result<v4::Document>> + '_>),
CompatIndexV3ToV4::Compat(compat) => compat
.documents()
.map(|iter| Box::new(iter) as Box<dyn Iterator<Item = Result<v4::Document>> + '_>),
}
}
pub fn settings(&mut self) -> Result<v4::Settings<v4::Checked>> {
Ok(match self {
CompatIndexV3ToV4::V3(v3) => {
v4::Settings::<v4::Unchecked>::from(v3.settings()?).check()
}
CompatIndexV3ToV4::Compat(compat) => {
v4::Settings::<v4::Unchecked>::from(compat.settings()?).check()
}
})
}
}
impl<T> From<v3::Setting<T>> for v4::Setting<T> {
fn from(setting: v3::Setting<T>) -> Self {
match setting {
v3::Setting::Set(t) => v4::Setting::Set(t),
v3::Setting::Reset => v4::Setting::Reset,
v3::Setting::NotSet => v4::Setting::NotSet,
}
}
}
impl From<v3::Code> for v4::Code {
fn from(code: v3::Code) -> Self {
match code {
v3::Code::CreateIndex => v4::Code::CreateIndex,
v3::Code::IndexAlreadyExists => v4::Code::IndexAlreadyExists,
v3::Code::IndexNotFound => v4::Code::IndexNotFound,
v3::Code::InvalidIndexUid => v4::Code::InvalidIndexUid,
v3::Code::InvalidState => v4::Code::InvalidState,
v3::Code::MissingPrimaryKey => v4::Code::MissingPrimaryKey,
v3::Code::PrimaryKeyAlreadyPresent => v4::Code::PrimaryKeyAlreadyPresent,
v3::Code::MaxFieldsLimitExceeded => v4::Code::MaxFieldsLimitExceeded,
v3::Code::MissingDocumentId => v4::Code::MissingDocumentId,
v3::Code::InvalidDocumentId => v4::Code::InvalidDocumentId,
v3::Code::Filter => v4::Code::Filter,
v3::Code::Sort => v4::Code::Sort,
v3::Code::BadParameter => v4::Code::BadParameter,
v3::Code::BadRequest => v4::Code::BadRequest,
v3::Code::DatabaseSizeLimitReached => v4::Code::DatabaseSizeLimitReached,
v3::Code::DocumentNotFound => v4::Code::DocumentNotFound,
v3::Code::Internal => v4::Code::Internal,
v3::Code::InvalidGeoField => v4::Code::InvalidGeoField,
v3::Code::InvalidRankingRule => v4::Code::InvalidRankingRule,
v3::Code::InvalidStore => v4::Code::InvalidStore,
v3::Code::InvalidToken => v4::Code::InvalidToken,
v3::Code::MissingAuthorizationHeader => v4::Code::MissingAuthorizationHeader,
v3::Code::NoSpaceLeftOnDevice => v4::Code::NoSpaceLeftOnDevice,
v3::Code::DumpNotFound => v4::Code::DumpNotFound,
v3::Code::TaskNotFound => v4::Code::TaskNotFound,
v3::Code::PayloadTooLarge => v4::Code::PayloadTooLarge,
v3::Code::RetrieveDocument => v4::Code::RetrieveDocument,
v3::Code::SearchDocuments => v4::Code::SearchDocuments,
v3::Code::UnsupportedMediaType => v4::Code::UnsupportedMediaType,
v3::Code::DumpAlreadyInProgress => v4::Code::DumpAlreadyInProgress,
v3::Code::DumpProcessFailed => v4::Code::DumpProcessFailed,
v3::Code::InvalidContentType => v4::Code::InvalidContentType,
v3::Code::MissingContentType => v4::Code::MissingContentType,
v3::Code::MalformedPayload => v4::Code::MalformedPayload,
v3::Code::MissingPayload => v4::Code::MissingPayload,
v3::Code::UnretrievableErrorCode => v4::Code::UnretrievableErrorCode,
v3::Code::MalformedDump => v4::Code::MalformedDump,
}
}
}
impl<T> From<v3::Settings<T>> for v4::Settings<v4::Unchecked> {
fn from(settings: v3::Settings<T>) -> Self {
v4::Settings {
displayed_attributes: settings.displayed_attributes.into(),
searchable_attributes: settings.searchable_attributes.into(),
filterable_attributes: settings.filterable_attributes.into(),
sortable_attributes: settings.sortable_attributes.into(),
ranking_rules: settings.ranking_rules.into(),
stop_words: settings.stop_words.into(),
synonyms: settings.synonyms.into(),
distinct_attribute: settings.distinct_attribute.into(),
typo_tolerance: v4::Setting::NotSet,
_kind: std::marker::PhantomData,
}
}
}
#[cfg(test)]
pub(crate) mod test {
use std::fs::File;
use std::io::BufReader;
use flate2::bufread::GzDecoder;
use meili_snap::insta;
use tempfile::TempDir;
use super::*;
#[test]
fn compat_v3_v4() {
let dump = File::open("tests/assets/v3.dump").unwrap();
let dir = TempDir::new().unwrap();
let mut dump = BufReader::new(dump);
let gz = GzDecoder::new(&mut dump);
let mut archive = tar::Archive::new(gz);
archive.unpack(dir.path()).unwrap();
let mut dump = v3::V3Reader::open(dir).unwrap().to_v4();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();
let (tasks, mut update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"79bc053583a1a7172bbaaafb1edaeb78");
assert_eq!(update_files.len(), 10);
assert!(update_files[0].is_some()); // the enqueued document addition
assert!(update_files[1..].iter().all(|u| u.is_none())); // everything already processed
let update_file = update_files.remove(0).unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(update_file), @"7b8889539b669c7b9ddba448bafa385d");
// keys
let keys = dump.keys().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(keys, { "[].uid" => "[uuid]" }), @"d751713988987e9331980363e24189ce");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies2 = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 110);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d153b5a81d8b3cdcbe1dec270b574022");
// movies2
insta::assert_json_snapshot!(movies2.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "movies_2",
"primaryKey": null,
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(movies2.settings().unwrap());
let documents = movies2.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 0);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d751713988987e9331980363e24189ce");
// spells
insta::assert_json_snapshot!(spells.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
}

View file

@ -0,0 +1,467 @@
use super::v3_to_v4::{CompatIndexV3ToV4, CompatV3ToV4};
use super::v5_to_v6::CompatV5ToV6;
use crate::reader::{v4, v5, Document};
use crate::Result;
pub enum CompatV4ToV5 {
V4(v4::V4Reader),
Compat(CompatV3ToV4),
}
impl CompatV4ToV5 {
pub fn new(v4: v4::V4Reader) -> CompatV4ToV5 {
CompatV4ToV5::V4(v4)
}
pub fn to_v6(self) -> CompatV5ToV6 {
CompatV5ToV6::Compat(self)
}
pub fn version(&self) -> crate::Version {
match self {
CompatV4ToV5::V4(v4) => v4.version(),
CompatV4ToV5::Compat(compat) => compat.version(),
}
}
pub fn date(&self) -> Option<time::OffsetDateTime> {
match self {
CompatV4ToV5::V4(v4) => v4.date(),
CompatV4ToV5::Compat(compat) => compat.date(),
}
}
pub fn instance_uid(&self) -> Result<Option<uuid::Uuid>> {
match self {
CompatV4ToV5::V4(v4) => v4.instance_uid(),
CompatV4ToV5::Compat(compat) => compat.instance_uid(),
}
}
pub fn indexes(&self) -> Result<Box<dyn Iterator<Item = Result<CompatIndexV4ToV5>> + '_>> {
Ok(match self {
CompatV4ToV5::V4(v4) => {
Box::new(v4.indexes()?.map(|index| index.map(CompatIndexV4ToV5::from)))
as Box<dyn Iterator<Item = Result<CompatIndexV4ToV5>> + '_>
}
CompatV4ToV5::Compat(compat) => {
Box::new(compat.indexes()?.map(|index| index.map(CompatIndexV4ToV5::from)))
as Box<dyn Iterator<Item = Result<CompatIndexV4ToV5>> + '_>
}
})
}
pub fn tasks(
&mut self,
) -> Box<dyn Iterator<Item = Result<(v5::Task, Option<Box<crate::reader::UpdateFile>>)>> + '_>
{
let tasks = match self {
CompatV4ToV5::V4(v4) => v4.tasks(),
CompatV4ToV5::Compat(compat) => compat.tasks(),
};
Box::new(tasks.map(|task| {
task.map(|(task, content_file)| {
let task = v5::Task {
id: task.id,
content: match task.content {
v4::tasks::TaskContent::DocumentAddition {
content_uuid,
merge_strategy,
primary_key,
documents_count,
allow_index_creation,
} => v5::tasks::TaskContent::DocumentAddition {
index_uid: v5::meta::IndexUid(task.index_uid.0),
content_uuid,
merge_strategy: match merge_strategy {
v4::tasks::IndexDocumentsMethod::ReplaceDocuments => {
v5::tasks::IndexDocumentsMethod::ReplaceDocuments
}
v4::tasks::IndexDocumentsMethod::UpdateDocuments => {
v5::tasks::IndexDocumentsMethod::UpdateDocuments
}
},
primary_key,
documents_count,
allow_index_creation,
},
v4::tasks::TaskContent::DocumentDeletion(deletion) => {
v5::tasks::TaskContent::DocumentDeletion {
index_uid: v5::meta::IndexUid(task.index_uid.0),
deletion: match deletion {
v4::tasks::DocumentDeletion::Clear => {
v5::tasks::DocumentDeletion::Clear
}
v4::tasks::DocumentDeletion::Ids(ids) => {
v5::tasks::DocumentDeletion::Ids(ids)
}
},
}
}
v4::tasks::TaskContent::SettingsUpdate {
settings,
is_deletion,
allow_index_creation,
} => v5::tasks::TaskContent::SettingsUpdate {
index_uid: v5::meta::IndexUid(task.index_uid.0),
settings: settings.into(),
is_deletion,
allow_index_creation,
},
v4::tasks::TaskContent::IndexDeletion => {
v5::tasks::TaskContent::IndexDeletion {
index_uid: v5::meta::IndexUid(task.index_uid.0),
}
}
v4::tasks::TaskContent::IndexCreation { primary_key } => {
v5::tasks::TaskContent::IndexCreation {
index_uid: v5::meta::IndexUid(task.index_uid.0),
primary_key,
}
}
v4::tasks::TaskContent::IndexUpdate { primary_key } => {
v5::tasks::TaskContent::IndexUpdate {
index_uid: v5::meta::IndexUid(task.index_uid.0),
primary_key,
}
}
},
events: task
.events
.into_iter()
.map(|event| match event {
v4::tasks::TaskEvent::Created(date) => {
v5::tasks::TaskEvent::Created(date)
}
v4::tasks::TaskEvent::Batched { timestamp, batch_id } => {
v5::tasks::TaskEvent::Batched { timestamp, batch_id }
}
v4::tasks::TaskEvent::Processing(date) => {
v5::tasks::TaskEvent::Processing(date)
}
v4::tasks::TaskEvent::Succeded { result, timestamp } => {
v5::tasks::TaskEvent::Succeeded {
result: match result {
v4::tasks::TaskResult::DocumentAddition {
indexed_documents,
} => v5::tasks::TaskResult::DocumentAddition {
indexed_documents,
},
v4::tasks::TaskResult::DocumentDeletion {
deleted_documents,
} => v5::tasks::TaskResult::DocumentDeletion {
deleted_documents,
},
v4::tasks::TaskResult::ClearAll { deleted_documents } => {
v5::tasks::TaskResult::ClearAll { deleted_documents }
}
v4::tasks::TaskResult::Other => {
v5::tasks::TaskResult::Other
}
},
timestamp,
}
}
v4::tasks::TaskEvent::Failed { error, timestamp } => {
v5::tasks::TaskEvent::Failed {
error: v5::ResponseError::from(error),
timestamp,
}
}
})
.collect(),
};
(task, content_file)
})
}))
}
pub fn keys(&mut self) -> Box<dyn Iterator<Item = Result<v5::Key>> + '_> {
let keys = match self {
CompatV4ToV5::V4(v4) => v4.keys(),
CompatV4ToV5::Compat(compat) => compat.keys(),
};
Box::new(keys.map(|key| {
key.map(|key| v5::Key {
description: key.description,
name: None,
uid: v5::keys::KeyId::new_v4(),
actions: key.actions.into_iter().filter_map(|action| action.into()).collect(),
indexes: key
.indexes
.into_iter()
.map(|index| match index.as_str() {
"*" => v5::StarOr::Star,
_ => v5::StarOr::Other(v5::meta::IndexUid(index)),
})
.collect(),
expires_at: key.expires_at,
created_at: key.created_at,
updated_at: key.updated_at,
})
}))
}
}
pub enum CompatIndexV4ToV5 {
V4(v4::V4IndexReader),
Compat(CompatIndexV3ToV4),
}
impl From<v4::V4IndexReader> for CompatIndexV4ToV5 {
fn from(index_reader: v4::V4IndexReader) -> Self {
Self::V4(index_reader)
}
}
impl From<CompatIndexV3ToV4> for CompatIndexV4ToV5 {
fn from(index_reader: CompatIndexV3ToV4) -> Self {
Self::Compat(index_reader)
}
}
impl CompatIndexV4ToV5 {
pub fn metadata(&self) -> &crate::IndexMetadata {
match self {
CompatIndexV4ToV5::V4(v4) => v4.metadata(),
CompatIndexV4ToV5::Compat(compat) => compat.metadata(),
}
}
pub fn documents(&mut self) -> Result<Box<dyn Iterator<Item = Result<Document>> + '_>> {
match self {
CompatIndexV4ToV5::V4(v4) => v4
.documents()
.map(|iter| Box::new(iter) as Box<dyn Iterator<Item = Result<Document>> + '_>),
CompatIndexV4ToV5::Compat(compat) => compat
.documents()
.map(|iter| Box::new(iter) as Box<dyn Iterator<Item = Result<Document>> + '_>),
}
}
pub fn settings(&mut self) -> Result<v5::Settings<v5::Checked>> {
match self {
CompatIndexV4ToV5::V4(v4) => Ok(v5::Settings::from(v4.settings()?).check()),
CompatIndexV4ToV5::Compat(compat) => Ok(v5::Settings::from(compat.settings()?).check()),
}
}
}
impl<T> From<v4::Setting<T>> for v5::Setting<T> {
fn from(setting: v4::Setting<T>) -> Self {
match setting {
v4::Setting::Set(t) => v5::Setting::Set(t),
v4::Setting::Reset => v5::Setting::Reset,
v4::Setting::NotSet => v5::Setting::NotSet,
}
}
}
impl From<v4::ResponseError> for v5::ResponseError {
fn from(error: v4::ResponseError) -> Self {
let code = match error.error_code.as_ref() {
"index_creation_failed" => v5::Code::CreateIndex,
"index_already_exists" => v5::Code::IndexAlreadyExists,
"index_not_found" => v5::Code::IndexNotFound,
"invalid_index_uid" => v5::Code::InvalidIndexUid,
"invalid_min_word_length_for_typo" => v5::Code::InvalidMinWordLengthForTypo,
"invalid_state" => v5::Code::InvalidState,
"primary_key_inference_failed" => v5::Code::MissingPrimaryKey,
"index_primary_key_already_exists" => v5::Code::PrimaryKeyAlreadyPresent,
"max_fields_limit_exceeded" => v5::Code::MaxFieldsLimitExceeded,
"missing_document_id" => v5::Code::MissingDocumentId,
"invalid_document_id" => v5::Code::InvalidDocumentId,
"invalid_filter" => v5::Code::Filter,
"invalid_sort" => v5::Code::Sort,
"bad_parameter" => v5::Code::BadParameter,
"bad_request" => v5::Code::BadRequest,
"database_size_limit_reached" => v5::Code::DatabaseSizeLimitReached,
"document_not_found" => v5::Code::DocumentNotFound,
"internal" => v5::Code::Internal,
"invalid_geo_field" => v5::Code::InvalidGeoField,
"invalid_ranking_rule" => v5::Code::InvalidRankingRule,
"invalid_store_file" => v5::Code::InvalidStore,
"invalid_api_key" => v5::Code::InvalidToken,
"missing_authorization_header" => v5::Code::MissingAuthorizationHeader,
"no_space_left_on_device" => v5::Code::NoSpaceLeftOnDevice,
"dump_not_found" => v5::Code::DumpNotFound,
"task_not_found" => v5::Code::TaskNotFound,
"payload_too_large" => v5::Code::PayloadTooLarge,
"unretrievable_document" => v5::Code::RetrieveDocument,
"search_error" => v5::Code::SearchDocuments,
"unsupported_media_type" => v5::Code::UnsupportedMediaType,
"dump_already_processing" => v5::Code::DumpAlreadyInProgress,
"dump_process_failed" => v5::Code::DumpProcessFailed,
"invalid_content_type" => v5::Code::InvalidContentType,
"missing_content_type" => v5::Code::MissingContentType,
"malformed_payload" => v5::Code::MalformedPayload,
"missing_payload" => v5::Code::MissingPayload,
"api_key_not_found" => v5::Code::ApiKeyNotFound,
"missing_parameter" => v5::Code::MissingParameter,
"invalid_api_key_actions" => v5::Code::InvalidApiKeyActions,
"invalid_api_key_indexes" => v5::Code::InvalidApiKeyIndexes,
"invalid_api_key_expires_at" => v5::Code::InvalidApiKeyExpiresAt,
"invalid_api_key_description" => v5::Code::InvalidApiKeyDescription,
other => {
tracing::warn!("Unknown error code {}", other);
v5::Code::UnretrievableErrorCode
}
};
v5::ResponseError::from_msg(error.message, code)
}
}
impl<T> From<v4::Settings<T>> for v5::Settings<v5::Unchecked> {
fn from(settings: v4::Settings<T>) -> Self {
v5::Settings {
displayed_attributes: settings.displayed_attributes.into(),
searchable_attributes: settings.searchable_attributes.into(),
filterable_attributes: settings.filterable_attributes.into(),
sortable_attributes: settings.sortable_attributes.into(),
ranking_rules: settings.ranking_rules.into(),
stop_words: settings.stop_words.into(),
synonyms: settings.synonyms.into(),
distinct_attribute: settings.distinct_attribute.into(),
typo_tolerance: match settings.typo_tolerance {
v4::Setting::Set(typo) => v5::Setting::Set(v5::TypoTolerance {
enabled: typo.enabled.into(),
min_word_size_for_typos: match typo.min_word_size_for_typos {
v4::Setting::Set(t) => v5::Setting::Set(v5::MinWordSizeForTypos {
one_typo: t.one_typo.into(),
two_typos: t.two_typos.into(),
}),
v4::Setting::Reset => v5::Setting::Reset,
v4::Setting::NotSet => v5::Setting::NotSet,
},
disable_on_words: typo.disable_on_words.into(),
disable_on_attributes: typo.disable_on_attributes.into(),
}),
v4::Setting::Reset => v5::Setting::Reset,
v4::Setting::NotSet => v5::Setting::NotSet,
},
faceting: v5::Setting::NotSet,
pagination: v5::Setting::NotSet,
_kind: std::marker::PhantomData,
}
}
}
impl From<v4::Action> for Option<v5::Action> {
fn from(key: v4::Action) -> Self {
match key {
v4::Action::All => Some(v5::Action::All),
v4::Action::Search => Some(v5::Action::Search),
v4::Action::DocumentsAdd => Some(v5::Action::DocumentsAdd),
v4::Action::DocumentsGet => Some(v5::Action::DocumentsGet),
v4::Action::DocumentsDelete => Some(v5::Action::DocumentsDelete),
v4::Action::IndexesAdd => Some(v5::Action::IndexesAdd),
v4::Action::IndexesGet => Some(v5::Action::IndexesGet),
v4::Action::IndexesUpdate => Some(v5::Action::IndexesUpdate),
v4::Action::IndexesDelete => Some(v5::Action::IndexesDelete),
v4::Action::TasksGet => Some(v5::Action::TasksGet),
v4::Action::SettingsGet => Some(v5::Action::SettingsGet),
v4::Action::SettingsUpdate => Some(v5::Action::SettingsUpdate),
v4::Action::StatsGet => Some(v5::Action::StatsGet),
v4::Action::DumpsCreate => Some(v5::Action::DumpsCreate),
v4::Action::DumpsGet => None,
v4::Action::Version => Some(v5::Action::Version),
}
}
}
#[cfg(test)]
pub(crate) mod test {
use std::fs::File;
use std::io::BufReader;
use flate2::bufread::GzDecoder;
use meili_snap::insta;
use tempfile::TempDir;
use super::*;
#[test]
fn compat_v4_v5() {
let dump = File::open("tests/assets/v4.dump").unwrap();
let dir = TempDir::new().unwrap();
let mut dump = BufReader::new(dump);
let gz = GzDecoder::new(&mut dump);
let mut archive = tar::Archive::new(gz);
archive.unpack(dir.path()).unwrap();
let mut dump = v4::V4Reader::open(dir).unwrap().to_v5();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"ed9a30cded4c046ef46f7cff7450347e");
assert_eq!(update_files.len(), 10);
assert!(update_files[0].is_some()); // the enqueued document addition
assert!(update_files[1..].iter().all(|u| u.is_none())); // everything already processed
// keys
let keys = dump.keys().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(keys, { "[].uid" => "[uuid]" }), @"1384361d734fd77c23804c9696228660");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"b01c8371aea4c7171af0d4d846a2bdca");
// movies
insta::assert_json_snapshot!(movies.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 110);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"786022a66ecb992c8a2a60fee070a5ab");
// spells
insta::assert_json_snapshot!(spells.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
}

View file

@ -0,0 +1,517 @@
use std::str::FromStr;
use super::v4_to_v5::{CompatIndexV4ToV5, CompatV4ToV5};
use crate::reader::{v5, v6, Document, UpdateFile};
use crate::Result;
pub enum CompatV5ToV6 {
V5(v5::V5Reader),
Compat(CompatV4ToV5),
}
impl CompatV5ToV6 {
pub fn new_v5(v5: v5::V5Reader) -> CompatV5ToV6 {
CompatV5ToV6::V5(v5)
}
pub fn version(&self) -> crate::Version {
match self {
CompatV5ToV6::V5(v5) => v5.version(),
CompatV5ToV6::Compat(compat) => compat.version(),
}
}
pub fn date(&self) -> Option<time::OffsetDateTime> {
match self {
CompatV5ToV6::V5(v5) => v5.date(),
CompatV5ToV6::Compat(compat) => compat.date(),
}
}
pub fn instance_uid(&self) -> Result<Option<uuid::Uuid>> {
match self {
CompatV5ToV6::V5(v5) => v5.instance_uid(),
CompatV5ToV6::Compat(compat) => compat.instance_uid(),
}
}
pub fn indexes(&self) -> Result<Box<dyn Iterator<Item = Result<CompatIndexV5ToV6>> + '_>> {
let indexes = match self {
CompatV5ToV6::V5(v5) => {
Box::new(v5.indexes()?.map(|index| index.map(CompatIndexV5ToV6::from)))
as Box<dyn Iterator<Item = Result<CompatIndexV5ToV6>> + '_>
}
CompatV5ToV6::Compat(compat) => {
Box::new(compat.indexes()?.map(|index| index.map(CompatIndexV5ToV6::from)))
as Box<dyn Iterator<Item = Result<CompatIndexV5ToV6>> + '_>
}
};
Ok(indexes)
}
pub fn tasks(
&mut self,
) -> Result<Box<dyn Iterator<Item = Result<(v6::Task, Option<Box<UpdateFile>>)>> + '_>> {
let instance_uid = self.instance_uid().ok().flatten();
let keys = self.keys()?.collect::<Result<Vec<_>>>()?;
let tasks = match self {
CompatV5ToV6::V5(v5) => v5.tasks(),
CompatV5ToV6::Compat(compat) => compat.tasks(),
};
Ok(Box::new(tasks.map(move |task| {
task.map(|(task, content_file)| {
let mut task_view: v5::tasks::TaskView = task.clone().into();
if task_view.status == v5::Status::Processing {
task_view.started_at = None;
}
let task = v6::Task {
uid: task_view.uid,
index_uid: task_view.index_uid,
status: match task_view.status {
v5::Status::Enqueued => v6::Status::Enqueued,
v5::Status::Processing => v6::Status::Enqueued,
v5::Status::Succeeded => v6::Status::Succeeded,
v5::Status::Failed => v6::Status::Failed,
},
kind: match task.content {
v5::tasks::TaskContent::IndexCreation { primary_key, .. } => {
v6::Kind::IndexCreation { primary_key }
}
v5::tasks::TaskContent::IndexUpdate { primary_key, .. } => {
v6::Kind::IndexUpdate { primary_key }
}
v5::tasks::TaskContent::IndexDeletion { .. } => v6::Kind::IndexDeletion,
v5::tasks::TaskContent::DocumentAddition {
merge_strategy,
allow_index_creation,
primary_key,
documents_count,
..
} => v6::Kind::DocumentImport {
primary_key,
documents_count: documents_count as u64,
method: match merge_strategy {
v5::tasks::IndexDocumentsMethod::ReplaceDocuments => {
v6::milli::update::IndexDocumentsMethod::ReplaceDocuments
}
v5::tasks::IndexDocumentsMethod::UpdateDocuments => {
v6::milli::update::IndexDocumentsMethod::UpdateDocuments
}
},
allow_index_creation,
},
v5::tasks::TaskContent::DocumentDeletion { deletion, .. } => match deletion
{
v5::tasks::DocumentDeletion::Clear => v6::Kind::DocumentClear,
v5::tasks::DocumentDeletion::Ids(documents_ids) => {
v6::Kind::DocumentDeletion { documents_ids }
}
},
v5::tasks::TaskContent::SettingsUpdate {
allow_index_creation,
is_deletion,
settings,
..
} => v6::Kind::Settings {
is_deletion,
allow_index_creation,
settings: Box::new(settings.into()),
},
v5::tasks::TaskContent::Dump { uid: _ } => {
// in v6 we compute the dump_uid from the started_at processing time
v6::Kind::DumpCreation { keys: keys.clone(), instance_uid }
}
},
canceled_by: None,
details: task_view.details.map(|details| match details {
v5::Details::DocumentAddition { received_documents, indexed_documents } => {
v6::Details::DocumentAdditionOrUpdate {
received_documents: received_documents as u64,
indexed_documents,
}
}
v5::Details::Settings { settings } => {
v6::Details::SettingsUpdate { settings: Box::new(settings.into()) }
}
v5::Details::IndexInfo { primary_key } => {
v6::Details::IndexInfo { primary_key }
}
v5::Details::DocumentDeletion {
received_document_ids,
deleted_documents,
} => v6::Details::DocumentDeletion {
provided_ids: received_document_ids,
deleted_documents,
},
v5::Details::ClearAll { deleted_documents } => {
v6::Details::ClearAll { deleted_documents }
}
v5::Details::Dump { dump_uid } => {
v6::Details::Dump { dump_uid: Some(dump_uid) }
}
}),
error: task_view.error.map(|e| e.into()),
enqueued_at: task_view.enqueued_at,
started_at: task_view.started_at,
finished_at: task_view.finished_at,
};
(task, content_file)
})
})))
}
pub fn keys(&mut self) -> Result<Box<dyn Iterator<Item = Result<v6::Key>> + '_>> {
let keys = match self {
CompatV5ToV6::V5(v5) => v5.keys()?,
CompatV5ToV6::Compat(compat) => compat.keys(),
};
Ok(Box::new(keys.map(|key| {
key.map(|key| v6::Key {
description: key.description,
name: key.name,
uid: key.uid,
actions: key.actions.into_iter().map(|action| action.into()).collect(),
indexes: key
.indexes
.into_iter()
.map(|index| match index {
v5::StarOr::Star => v6::IndexUidPattern::all(),
v5::StarOr::Other(uid) => v6::IndexUidPattern::new_unchecked(uid.as_str()),
})
.collect(),
expires_at: key.expires_at,
created_at: key.created_at,
updated_at: key.updated_at,
})
})))
}
pub fn features(&self) -> Result<Option<v6::RuntimeTogglableFeatures>> {
Ok(None)
}
}
pub enum CompatIndexV5ToV6 {
V5(v5::V5IndexReader),
Compat(CompatIndexV4ToV5),
}
impl From<v5::V5IndexReader> for CompatIndexV5ToV6 {
fn from(index_reader: v5::V5IndexReader) -> Self {
Self::V5(index_reader)
}
}
impl From<CompatIndexV4ToV5> for CompatIndexV5ToV6 {
fn from(index_reader: CompatIndexV4ToV5) -> Self {
Self::Compat(index_reader)
}
}
impl CompatIndexV5ToV6 {
pub fn new_v5(v5: v5::V5IndexReader) -> CompatIndexV5ToV6 {
CompatIndexV5ToV6::V5(v5)
}
pub fn metadata(&self) -> &crate::IndexMetadata {
match self {
CompatIndexV5ToV6::V5(v5) => v5.metadata(),
CompatIndexV5ToV6::Compat(compat) => compat.metadata(),
}
}
pub fn documents(&mut self) -> Result<Box<dyn Iterator<Item = Result<Document>> + '_>> {
match self {
CompatIndexV5ToV6::V5(v5) => v5
.documents()
.map(|iter| Box::new(iter) as Box<dyn Iterator<Item = Result<Document>> + '_>),
CompatIndexV5ToV6::Compat(compat) => compat
.documents()
.map(|iter| Box::new(iter) as Box<dyn Iterator<Item = Result<Document>> + '_>),
}
}
pub fn settings(&mut self) -> Result<v6::Settings<v6::Checked>> {
match self {
CompatIndexV5ToV6::V5(v5) => Ok(v6::Settings::from(v5.settings()?).check()),
CompatIndexV5ToV6::Compat(compat) => Ok(v6::Settings::from(compat.settings()?).check()),
}
}
}
impl<T> From<v5::Setting<T>> for v6::Setting<T> {
fn from(setting: v5::Setting<T>) -> Self {
match setting {
v5::Setting::Set(t) => v6::Setting::Set(t),
v5::Setting::Reset => v6::Setting::Reset,
v5::Setting::NotSet => v6::Setting::NotSet,
}
}
}
impl From<v5::ResponseError> for v6::ResponseError {
fn from(error: v5::ResponseError) -> Self {
let code = match error.error_code.as_ref() {
"index_creation_failed" => v6::Code::IndexCreationFailed,
"index_already_exists" => v6::Code::IndexAlreadyExists,
"index_not_found" => v6::Code::IndexNotFound,
"invalid_index_uid" => v6::Code::InvalidIndexUid,
"invalid_min_word_length_for_typo" => v6::Code::InvalidSettingsTypoTolerance,
"invalid_state" => v6::Code::InvalidState,
"primary_key_inference_failed" => v6::Code::IndexPrimaryKeyNoCandidateFound,
"index_primary_key_already_exists" => v6::Code::IndexPrimaryKeyAlreadyExists,
"max_fields_limit_exceeded" => v6::Code::MaxFieldsLimitExceeded,
"missing_document_id" => v6::Code::MissingDocumentId,
"invalid_document_id" => v6::Code::InvalidDocumentId,
"invalid_filter" => v6::Code::InvalidSettingsFilterableAttributes,
"invalid_sort" => v6::Code::InvalidSettingsSortableAttributes,
"bad_parameter" => v6::Code::BadParameter,
"bad_request" => v6::Code::BadRequest,
"database_size_limit_reached" => v6::Code::DatabaseSizeLimitReached,
"document_not_found" => v6::Code::DocumentNotFound,
"internal" => v6::Code::Internal,
"invalid_geo_field" => v6::Code::InvalidDocumentGeoField,
"invalid_ranking_rule" => v6::Code::InvalidSettingsRankingRules,
"invalid_store_file" => v6::Code::InvalidStoreFile,
"invalid_api_key" => v6::Code::InvalidApiKey,
"missing_authorization_header" => v6::Code::MissingAuthorizationHeader,
"no_space_left_on_device" => v6::Code::NoSpaceLeftOnDevice,
"dump_not_found" => v6::Code::DumpNotFound,
"task_not_found" => v6::Code::TaskNotFound,
"payload_too_large" => v6::Code::PayloadTooLarge,
"unretrievable_document" => v6::Code::UnretrievableDocument,
"unsupported_media_type" => v6::Code::UnsupportedMediaType,
"dump_already_processing" => v6::Code::DumpAlreadyProcessing,
"dump_process_failed" => v6::Code::DumpProcessFailed,
"invalid_content_type" => v6::Code::InvalidContentType,
"missing_content_type" => v6::Code::MissingContentType,
"malformed_payload" => v6::Code::MalformedPayload,
"missing_payload" => v6::Code::MissingPayload,
"api_key_not_found" => v6::Code::ApiKeyNotFound,
"missing_parameter" => v6::Code::BadRequest,
"invalid_api_key_actions" => v6::Code::InvalidApiKeyActions,
"invalid_api_key_indexes" => v6::Code::InvalidApiKeyIndexes,
"invalid_api_key_expires_at" => v6::Code::InvalidApiKeyExpiresAt,
"invalid_api_key_description" => v6::Code::InvalidApiKeyDescription,
"invalid_api_key_name" => v6::Code::InvalidApiKeyName,
"invalid_api_key_uid" => v6::Code::InvalidApiKeyUid,
"immutable_field" => v6::Code::BadRequest,
"api_key_already_exists" => v6::Code::ApiKeyAlreadyExists,
other => {
tracing::warn!("Unknown error code {}", other);
v6::Code::UnretrievableErrorCode
}
};
v6::ResponseError::from_msg(error.message, code)
}
}
impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
fn from(settings: v5::Settings<T>) -> Self {
v6::Settings {
displayed_attributes: v6::Setting::from(settings.displayed_attributes).into(),
searchable_attributes: v6::Setting::from(settings.searchable_attributes).into(),
filterable_attributes: settings.filterable_attributes.into(),
sortable_attributes: settings.sortable_attributes.into(),
ranking_rules: {
match settings.ranking_rules {
v5::settings::Setting::Set(ranking_rules) => {
let mut new_ranking_rules = vec![];
for rule in ranking_rules {
match v6::RankingRuleView::from_str(&rule) {
Ok(new_rule) => {
new_ranking_rules.push(new_rule);
}
Err(_) => {
tracing::warn!("Error while importing settings. The ranking rule `{rule}` does not exist anymore.")
}
}
}
v6::Setting::Set(new_ranking_rules)
}
v5::settings::Setting::Reset => v6::Setting::Reset,
v5::settings::Setting::NotSet => v6::Setting::NotSet,
}
},
stop_words: settings.stop_words.into(),
non_separator_tokens: v6::Setting::NotSet,
separator_tokens: v6::Setting::NotSet,
dictionary: v6::Setting::NotSet,
synonyms: settings.synonyms.into(),
distinct_attribute: settings.distinct_attribute.into(),
proximity_precision: v6::Setting::NotSet,
typo_tolerance: match settings.typo_tolerance {
v5::Setting::Set(typo) => v6::Setting::Set(v6::TypoTolerance {
enabled: typo.enabled.into(),
min_word_size_for_typos: match typo.min_word_size_for_typos {
v5::Setting::Set(t) => v6::Setting::Set(v6::MinWordSizeForTypos {
one_typo: t.one_typo.into(),
two_typos: t.two_typos.into(),
}),
v5::Setting::Reset => v6::Setting::Reset,
v5::Setting::NotSet => v6::Setting::NotSet,
},
disable_on_words: typo.disable_on_words.into(),
disable_on_attributes: typo.disable_on_attributes.into(),
}),
v5::Setting::Reset => v6::Setting::Reset,
v5::Setting::NotSet => v6::Setting::NotSet,
},
faceting: match settings.faceting {
v5::Setting::Set(faceting) => v6::Setting::Set(v6::FacetingSettings {
max_values_per_facet: faceting.max_values_per_facet.into(),
sort_facet_values_by: v6::Setting::NotSet,
}),
v5::Setting::Reset => v6::Setting::Reset,
v5::Setting::NotSet => v6::Setting::NotSet,
},
pagination: match settings.pagination {
v5::Setting::Set(pagination) => v6::Setting::Set(v6::PaginationSettings {
max_total_hits: pagination.max_total_hits.into(),
}),
v5::Setting::Reset => v6::Setting::Reset,
v5::Setting::NotSet => v6::Setting::NotSet,
},
embedders: v6::Setting::NotSet,
localized_attributes: v6::Setting::NotSet,
search_cutoff_ms: v6::Setting::NotSet,
_kind: std::marker::PhantomData,
}
}
}
impl From<v5::Action> for v6::Action {
fn from(key: v5::Action) -> Self {
match key {
v5::Action::All => v6::Action::All,
v5::Action::Search => v6::Action::Search,
v5::Action::DocumentsAll => v6::Action::DocumentsAll,
v5::Action::DocumentsAdd => v6::Action::DocumentsAdd,
v5::Action::DocumentsGet => v6::Action::DocumentsGet,
v5::Action::DocumentsDelete => v6::Action::DocumentsDelete,
v5::Action::IndexesAll => v6::Action::IndexesAll,
v5::Action::IndexesAdd => v6::Action::IndexesAdd,
v5::Action::IndexesGet => v6::Action::IndexesGet,
v5::Action::IndexesUpdate => v6::Action::IndexesUpdate,
v5::Action::IndexesDelete => v6::Action::IndexesDelete,
v5::Action::TasksAll => v6::Action::TasksAll,
v5::Action::TasksGet => v6::Action::TasksGet,
v5::Action::SettingsAll => v6::Action::SettingsAll,
v5::Action::SettingsGet => v6::Action::SettingsGet,
v5::Action::SettingsUpdate => v6::Action::SettingsUpdate,
v5::Action::StatsAll => v6::Action::StatsAll,
v5::Action::StatsGet => v6::Action::StatsGet,
v5::Action::MetricsAll => v6::Action::MetricsAll,
v5::Action::MetricsGet => v6::Action::MetricsGet,
v5::Action::DumpsAll => v6::Action::DumpsAll,
v5::Action::DumpsCreate => v6::Action::DumpsCreate,
v5::Action::Version => v6::Action::Version,
v5::Action::KeysAdd => v6::Action::KeysAdd,
v5::Action::KeysGet => v6::Action::KeysGet,
v5::Action::KeysUpdate => v6::Action::KeysUpdate,
v5::Action::KeysDelete => v6::Action::KeysDelete,
}
}
}
#[cfg(test)]
pub(crate) mod test {
use std::fs::File;
use std::io::BufReader;
use flate2::bufread::GzDecoder;
use meili_snap::insta;
use tempfile::TempDir;
use super::*;
#[test]
fn compat_v5_v6() {
let dump = File::open("tests/assets/v5.dump").unwrap();
let dir = TempDir::new().unwrap();
let mut dump = BufReader::new(dump);
let gz = GzDecoder::new(&mut dump);
let mut archive = tar::Archive::new(gz);
archive.unpack(dir.path()).unwrap();
let mut dump = v5::V5Reader::open(dir).unwrap().to_v6();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-04 15:55:10.344982459 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"41f91d3a94911b2735ec41b07540df5c");
assert_eq!(update_files.len(), 22);
assert!(update_files[0].is_none()); // the dump creation
assert!(update_files[1].is_some()); // the enqueued document addition
assert!(update_files[2..].iter().all(|u| u.is_none())); // everything already processed
// keys
let keys = dump.keys().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(keys), @"c9d2b467fe2fca0b35580d8a999808fb");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"b01c8371aea4c7171af0d4d846a2bdca");
// movies
insta::assert_json_snapshot!(movies.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 200);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"e962baafd2fbae4cdd14e876053b0c5a");
// spells
insta::assert_json_snapshot!(spells.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
}

View file

@ -0,0 +1,877 @@
use std::fs::File;
use std::io::{BufReader, Read};
use flate2::bufread::GzDecoder;
use serde::Deserialize;
use tempfile::TempDir;
use self::compat::v4_to_v5::CompatV4ToV5;
use self::compat::v5_to_v6::{CompatIndexV5ToV6, CompatV5ToV6};
use self::v5::V5Reader;
use self::v6::{V6IndexReader, V6Reader};
use crate::{Result, Version};
mod compat;
mod v1;
mod v2;
mod v3;
mod v4;
mod v5;
mod v6;
pub type Document = serde_json::Map<String, serde_json::Value>;
pub type UpdateFile = dyn Iterator<Item = Result<Document>>;
pub enum DumpReader {
Current(V6Reader),
Compat(CompatV5ToV6),
}
impl DumpReader {
pub fn open(dump: impl Read) -> Result<DumpReader> {
let path = TempDir::new()?;
let mut dump = BufReader::new(dump);
let gz = GzDecoder::new(&mut dump);
let mut archive = tar::Archive::new(gz);
archive.unpack(path.path())?;
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct MetadataVersion {
pub dump_version: Version,
}
let mut meta_file = File::open(path.path().join("metadata.json"))?;
let MetadataVersion { dump_version } = serde_json::from_reader(&mut meta_file)?;
match dump_version {
Version::V1 => {
Ok(v1::V1Reader::open(path)?.to_v2().to_v3().to_v4().to_v5().to_v6().into())
}
Version::V2 => Ok(v2::V2Reader::open(path)?.to_v3().to_v4().to_v5().to_v6().into()),
Version::V3 => Ok(v3::V3Reader::open(path)?.to_v4().to_v5().to_v6().into()),
Version::V4 => Ok(v4::V4Reader::open(path)?.to_v5().to_v6().into()),
Version::V5 => Ok(v5::V5Reader::open(path)?.to_v6().into()),
Version::V6 => Ok(v6::V6Reader::open(path)?.into()),
}
}
pub fn version(&self) -> crate::Version {
match self {
DumpReader::Current(current) => current.version(),
DumpReader::Compat(compat) => compat.version(),
}
}
pub fn date(&self) -> Option<time::OffsetDateTime> {
match self {
DumpReader::Current(current) => current.date(),
DumpReader::Compat(compat) => compat.date(),
}
}
pub fn instance_uid(&self) -> Result<Option<uuid::Uuid>> {
match self {
DumpReader::Current(current) => current.instance_uid(),
DumpReader::Compat(compat) => compat.instance_uid(),
}
}
pub fn indexes(&self) -> Result<Box<dyn Iterator<Item = Result<DumpIndexReader>> + '_>> {
match self {
DumpReader::Current(current) => {
let indexes = Box::new(current.indexes()?.map(|res| res.map(DumpIndexReader::from)))
as Box<dyn Iterator<Item = Result<DumpIndexReader>> + '_>;
Ok(indexes)
}
DumpReader::Compat(compat) => {
let indexes = Box::new(compat.indexes()?.map(|res| res.map(DumpIndexReader::from)))
as Box<dyn Iterator<Item = Result<DumpIndexReader>> + '_>;
Ok(indexes)
}
}
}
pub fn tasks(
&mut self,
) -> Result<Box<dyn Iterator<Item = Result<(v6::Task, Option<Box<UpdateFile>>)>> + '_>> {
match self {
DumpReader::Current(current) => Ok(current.tasks()),
DumpReader::Compat(compat) => compat.tasks(),
}
}
pub fn keys(&mut self) -> Result<Box<dyn Iterator<Item = Result<v6::Key>> + '_>> {
match self {
DumpReader::Current(current) => Ok(current.keys()),
DumpReader::Compat(compat) => compat.keys(),
}
}
pub fn features(&self) -> Result<Option<v6::RuntimeTogglableFeatures>> {
match self {
DumpReader::Current(current) => Ok(current.features()),
DumpReader::Compat(compat) => compat.features(),
}
}
}
impl From<V6Reader> for DumpReader {
fn from(value: V6Reader) -> Self {
DumpReader::Current(value)
}
}
impl From<CompatV5ToV6> for DumpReader {
fn from(value: CompatV5ToV6) -> Self {
DumpReader::Compat(value)
}
}
impl From<V5Reader> for DumpReader {
fn from(value: V5Reader) -> Self {
DumpReader::Compat(value.to_v6())
}
}
impl From<CompatV4ToV5> for DumpReader {
fn from(value: CompatV4ToV5) -> Self {
DumpReader::Compat(value.to_v6())
}
}
pub enum DumpIndexReader {
Current(v6::V6IndexReader),
Compat(Box<CompatIndexV5ToV6>),
}
impl DumpIndexReader {
pub fn new_v6(v6: v6::V6IndexReader) -> DumpIndexReader {
DumpIndexReader::Current(v6)
}
pub fn metadata(&self) -> &crate::IndexMetadata {
match self {
DumpIndexReader::Current(v6) => v6.metadata(),
DumpIndexReader::Compat(compat) => compat.metadata(),
}
}
pub fn documents(&mut self) -> Result<Box<dyn Iterator<Item = Result<Document>> + '_>> {
match self {
DumpIndexReader::Current(v6) => v6
.documents()
.map(|iter| Box::new(iter) as Box<dyn Iterator<Item = Result<Document>> + '_>),
DumpIndexReader::Compat(compat) => compat
.documents()
.map(|iter| Box::new(iter) as Box<dyn Iterator<Item = Result<Document>> + '_>),
}
}
pub fn settings(&mut self) -> Result<v6::Settings<v6::Checked>> {
match self {
DumpIndexReader::Current(v6) => v6.settings(),
DumpIndexReader::Compat(compat) => compat.settings(),
}
}
}
impl From<V6IndexReader> for DumpIndexReader {
fn from(value: V6IndexReader) -> Self {
DumpIndexReader::Current(value)
}
}
impl From<CompatIndexV5ToV6> for DumpIndexReader {
fn from(value: CompatIndexV5ToV6) -> Self {
DumpIndexReader::Compat(Box::new(value))
}
}
#[cfg(test)]
pub(crate) mod test {
use std::fs::File;
use meili_snap::insta;
use super::*;
use crate::reader::v6::RuntimeTogglableFeatures;
#[test]
fn import_dump_v6_with_vectors() {
// dump containing two indexes
//
// "vector", configured with an embedder
// contains:
// - one document with an overriden vector,
// - one document with a natural vector
// - one document with a _vectors map containing one additional embedder name and a natural vector
// - one document with a _vectors map containing one additional embedder name and an overriden vector
//
// "novector", no embedder
// contains:
// - a document without vector
// - a document with a random _vectors field
let dump = File::open("tests/assets/v6-with-vectors.dump").unwrap();
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2024-05-16 15:51:34.151044 +00:00:00");
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None");
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"278f63325ef06ca04d01df98d8207b94");
assert_eq!(update_files.len(), 10);
assert!(update_files[0].is_none()); // the dump creation
assert!(update_files[1].is_none());
assert!(update_files[2].is_none());
assert!(update_files[3].is_none());
assert!(update_files[4].is_none());
assert!(update_files[5].is_none());
assert!(update_files[6].is_none());
assert!(update_files[7].is_none());
assert!(update_files[8].is_none());
assert!(update_files[9].is_none());
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut vector_index = indexes.pop().unwrap();
let mut novector_index = indexes.pop().unwrap();
assert!(indexes.is_empty());
// vector
insta::assert_json_snapshot!(vector_index.metadata(), @r###"
{
"uid": "vector",
"primaryKey": "id",
"createdAt": "2024-05-16T15:33:17.240962Z",
"updatedAt": "2024-05-16T15:40:55.723052Z"
}
"###);
insta::assert_json_snapshot!(vector_index.settings().unwrap());
{
let documents: Result<Vec<_>> = vector_index.documents().unwrap().collect();
let mut documents = documents.unwrap();
assert_eq!(documents.len(), 4);
documents.sort_by_key(|doc| doc.get("id").unwrap().to_string());
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document);
}
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document);
}
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document);
}
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document);
}
}
// novector
insta::assert_json_snapshot!(novector_index.metadata(), @r###"
{
"uid": "novector",
"primaryKey": "id",
"createdAt": "2024-05-16T15:33:03.568055Z",
"updatedAt": "2024-05-16T15:33:07.530217Z"
}
"###);
insta::assert_json_snapshot!(novector_index.settings().unwrap().embedders, @"null");
{
let documents: Result<Vec<_>> = novector_index.documents().unwrap().collect();
let mut documents = documents.unwrap();
assert_eq!(documents.len(), 2);
documents.sort_by_key(|doc| doc.get("id").unwrap().to_string());
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document, @r###"
{
"id": "e1",
"other": "random1",
"_vectors": "toto"
}
"###);
}
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document, @r###"
{
"id": "e0",
"other": "random0"
}
"###);
}
}
assert_eq!(
dump.features().unwrap().unwrap(),
RuntimeTogglableFeatures { vector_store: true, ..Default::default() }
);
}
#[test]
fn import_dump_v6_experimental() {
let dump = File::open("tests/assets/v6-with-experimental.dump").unwrap();
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2023-07-06 7:10:27.21958 +00:00:00");
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None");
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"d45cd8571703e58ae53c7bd7ce3f5c22");
assert_eq!(update_files.len(), 2);
assert!(update_files[0].is_none()); // the dump creation
assert!(update_files[1].is_none()); // the processed document addition
// keys
let keys = dump.keys().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(keys), @"13c2da155e9729c2344688cab29af71d");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut test = indexes.pop().unwrap();
assert!(indexes.is_empty());
insta::assert_json_snapshot!(test.metadata(), @r###"
{
"uid": "test",
"primaryKey": "id",
"createdAt": "2023-07-06T07:07:41.364694Z",
"updatedAt": "2023-07-06T07:07:41.396114Z"
}
"###);
assert_eq!(test.documents().unwrap().count(), 1);
assert_eq!(
dump.features().unwrap().unwrap(),
RuntimeTogglableFeatures { vector_store: true, ..Default::default() }
);
}
#[test]
fn import_dump_v5() {
let dump = File::open("tests/assets/v5.dump").unwrap();
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-04 15:55:10.344982459 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"41f91d3a94911b2735ec41b07540df5c");
assert_eq!(update_files.len(), 22);
assert!(update_files[0].is_none()); // the dump creation
assert!(update_files[1].is_some()); // the enqueued document addition
assert!(update_files[2..].iter().all(|u| u.is_none())); // everything already processed
// keys
let keys = dump.keys().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(keys), @"c9d2b467fe2fca0b35580d8a999808fb");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "2022-10-04T15:51:35.939396731Z",
"updatedAt": "2022-10-04T15:55:01.897325373Z"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"b01c8371aea4c7171af0d4d846a2bdca");
// movies
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "2022-10-04T15:51:35.291992167Z",
"updatedAt": "2022-10-04T15:55:10.33561842Z"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 200);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"e962baafd2fbae4cdd14e876053b0c5a");
// spells
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "2022-10-04T15:51:37.381094632Z",
"updatedAt": "2022-10-04T15:55:02.394503431Z"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
assert_eq!(dump.features().unwrap(), None);
}
#[test]
fn import_dump_v4() {
let dump = File::open("tests/assets/v4.dump").unwrap();
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"c2445ddd1785528b80f2ba534d3bd00c");
assert_eq!(update_files.len(), 10);
assert!(update_files[0].is_some()); // the enqueued document addition
assert!(update_files[1..].iter().all(|u| u.is_none())); // everything already processed
// keys
let keys = dump.keys().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(keys, { "[].uid" => "[uuid]" }), @"d751713988987e9331980363e24189ce");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "2022-10-06T12:53:39.360187055Z",
"updatedAt": "2022-10-06T12:53:40.603035979Z"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"b01c8371aea4c7171af0d4d846a2bdca");
// movies
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "2022-10-06T12:53:38.710611568Z",
"updatedAt": "2022-10-06T12:53:49.785862546Z"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 110);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"786022a66ecb992c8a2a60fee070a5ab");
// spells
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "2022-10-06T12:53:40.831649057Z",
"updatedAt": "2022-10-06T12:53:41.116036186Z"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
#[test]
fn import_dump_v3() {
let dump = File::open("tests/assets/v3.dump").unwrap();
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00");
assert_eq!(dump.instance_uid().unwrap(), None);
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"cd12efd308fe3ed226356a727ab42ed3");
assert_eq!(update_files.len(), 10);
assert!(update_files[0].is_some()); // the enqueued document addition
assert!(update_files[1..].iter().all(|u| u.is_none())); // everything already processed
// keys
let keys = dump.keys().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(keys), @"d751713988987e9331980363e24189ce");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies2 = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 110);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d153b5a81d8b3cdcbe1dec270b574022");
// movies2
insta::assert_json_snapshot!(movies2.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "movies_2",
"primaryKey": null,
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(movies2.settings().unwrap());
let documents = movies2.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 0);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d751713988987e9331980363e24189ce");
// spells
insta::assert_json_snapshot!(spells.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
#[test]
fn import_dump_v2() {
let dump = File::open("tests/assets/v2.dump").unwrap();
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00");
assert_eq!(dump.instance_uid().unwrap(), None);
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"bc616290adfe7d09a624cf6065ca9069");
assert_eq!(update_files.len(), 9);
assert!(update_files[0].is_some()); // the enqueued document addition
assert!(update_files[1..].iter().all(|u| u.is_none())); // everything already processed
// keys
let keys = dump.keys().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(keys), @"d751713988987e9331980363e24189ce");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies2 = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "2022-10-09T20:27:22.688964637Z",
"updatedAt": "2022-10-09T20:27:23.951017769Z"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "2022-10-09T20:27:22.197788495Z",
"updatedAt": "2022-10-09T20:28:01.93111053Z"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 110);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d153b5a81d8b3cdcbe1dec270b574022");
// movies2
insta::assert_json_snapshot!(movies2.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "movies_2",
"primaryKey": null,
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(movies2.settings().unwrap());
let documents = movies2.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 0);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d751713988987e9331980363e24189ce");
// spells
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "2022-10-09T20:27:24.242683494Z",
"updatedAt": "2022-10-09T20:27:24.312809641Z"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
#[test]
fn import_dump_v2_from_meilisearch_v0_22_0_issue_3435() {
let dump = File::open("tests/assets/v2-v0.22.0.dump").unwrap();
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2023-01-30 16:26:09.247261 +00:00:00");
assert_eq!(dump.instance_uid().unwrap(), None);
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"2db37756d8af1fb7623436b76e8956a6");
assert_eq!(update_files.len(), 8);
assert!(update_files[0..].iter().all(|u| u.is_none())); // everything already processed
// keys
let keys = dump.keys().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(keys), @"d751713988987e9331980363e24189ce");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "2023-01-30T16:25:56.595257Z",
"updatedAt": "2023-01-30T16:25:58.70348Z"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "2023-01-30T16:25:56.192178Z",
"updatedAt": "2023-01-30T16:25:56.455714Z"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"0227598af846e574139ee0b80e03a720");
// spells
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "2023-01-30T16:25:58.876405Z",
"updatedAt": "2023-01-30T16:25:59.079906Z"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
#[test]
fn import_dump_v1() {
let dump = File::open("tests/assets/v1.dump").unwrap();
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
assert_eq!(dump.date(), None);
assert_eq!(dump.instance_uid().unwrap(), None);
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"8df6eab075a44b3c1af6b726f9fd9a43");
assert_eq!(update_files.len(), 9);
assert!(update_files[..].iter().all(|u| u.is_none())); // no update file in dump v1
// keys
let keys = dump.keys().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot!(meili_snap::json_string!(keys), @"[]");
meili_snap::snapshot_hash!(meili_snap::json_string!(keys), @"d751713988987e9331980363e24189ce");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "2022-10-02T13:23:39.976870431Z",
"updatedAt": "2022-10-02T13:27:54.353262482Z"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"b01c8371aea4c7171af0d4d846a2bdca");
// movies
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "2022-10-02T13:15:29.477512777Z",
"updatedAt": "2022-10-02T13:21:12.671204856Z"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"b63dbed5bbc059f3e32bc471ae699bf5");
// spells
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "2022-10-02T13:38:26.358882984Z",
"updatedAt": "2022-10-02T13:38:26.385609433Z"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"aa24c0cfc733d66c396237ad44263bed");
}
}

View file

@ -0,0 +1,24 @@
---
source: dump/src/reader/mod.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,38 @@
---
source: dump/src/reader/mod.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View file

@ -0,0 +1,31 @@
---
source: dump/src/reader/mod.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"genres",
"id"
],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,23 @@
---
source: dump/src/reader/mod.rs
expression: movies2.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,23 @@
---
source: dump/src/reader/mod.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,37 @@
---
source: dump/src/reader/mod.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View file

@ -0,0 +1,24 @@
---
source: dump/src/reader/mod.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,25 @@
---
source: dump/src/reader/mod.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,39 @@
---
source: dump/src/reader/mod.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View file

@ -0,0 +1,30 @@
---
source: dump/src/reader/mod.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"release_date"
],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,25 @@
---
source: dump/src/reader/mod.rs
expression: movies2.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,25 @@
---
source: dump/src/reader/mod.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,39 @@
---
source: dump/src/reader/mod.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View file

@ -0,0 +1,31 @@
---
source: dump/src/reader/mod.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"release_date"
],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,34 @@
---
source: dump/src/reader/mod.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null,
"typoTolerance": {
"enabled": true,
"minWordSizeForTypos": {
"oneTypo": 5,
"twoTypos": 9
},
"disableOnWords": [],
"disableOnAttributes": []
}
}

View file

@ -0,0 +1,48 @@
---
source: dump/src/reader/mod.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null,
"typoTolerance": {
"enabled": true,
"minWordSizeForTypos": {
"oneTypo": 5,
"twoTypos": 9
},
"disableOnWords": [],
"disableOnAttributes": []
}
}

View file

@ -0,0 +1,40 @@
---
source: dump/src/reader/mod.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"release_date"
],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null,
"typoTolerance": {
"enabled": true,
"minWordSizeForTypos": {
"oneTypo": 5,
"twoTypos": 9
},
"disableOnWords": [],
"disableOnAttributes": []
}
}

View file

@ -0,0 +1,40 @@
---
source: dump/src/reader/mod.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null,
"typoTolerance": {
"enabled": true,
"minWordSizeForTypos": {
"oneTypo": 5,
"twoTypos": 9
},
"disableOnWords": [],
"disableOnAttributes": []
},
"faceting": {
"maxValuesPerFacet": 100
},
"pagination": {
"maxTotalHits": 1000
}
}

View file

@ -0,0 +1,54 @@
---
source: dump/src/reader/mod.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null,
"typoTolerance": {
"enabled": true,
"minWordSizeForTypos": {
"oneTypo": 5,
"twoTypos": 9
},
"disableOnWords": [],
"disableOnAttributes": []
},
"faceting": {
"maxValuesPerFacet": 100
},
"pagination": {
"maxTotalHits": 1000
}
}

View file

@ -0,0 +1,46 @@
---
source: dump/src/reader/mod.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"release_date"
],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null,
"typoTolerance": {
"enabled": true,
"minWordSizeForTypos": {
"oneTypo": 5,
"twoTypos": 9
},
"disableOnWords": [],
"disableOnAttributes": []
},
"faceting": {
"maxValuesPerFacet": 100
},
"pagination": {
"maxTotalHits": 1000
}
}

View file

@ -0,0 +1,56 @@
---
source: dump/src/reader/mod.rs
expression: vector_index.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"nonSeparatorTokens": [],
"separatorTokens": [],
"dictionary": [],
"synonyms": {},
"distinctAttribute": null,
"proximityPrecision": "byWord",
"typoTolerance": {
"enabled": true,
"minWordSizeForTypos": {
"oneTypo": 5,
"twoTypos": 9
},
"disableOnWords": [],
"disableOnAttributes": []
},
"faceting": {
"maxValuesPerFacet": 100,
"sortFacetValuesBy": {
"*": "alpha"
}
},
"pagination": {
"maxTotalHits": 1000
},
"embedders": {
"default": {
"source": "huggingFace",
"model": "BAAI/bge-base-en-v1.5",
"revision": "617ca489d9e86b49b8167676d8220688b99db36e",
"documentTemplate": "{% for field in fields %} {{ field.name }}: {{ field.value }}\n{% endfor %}"
}
},
"searchCutoffMs": null
}

View file

@ -0,0 +1,783 @@
---
source: dump/src/reader/mod.rs
expression: document
---
{
"id": "e3",
"desc": "overriden vector + map",
"_vectors": {
"default": [
0.2,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1
],
"toto": [
0.1
]
}
}

View file

@ -0,0 +1,786 @@
---
source: dump/src/reader/mod.rs
expression: document
---
{
"id": "e2",
"desc": "natural vector + map",
"_vectors": {
"toto": [],
"default": {
"embeddings": [
[
-0.05189208313822746,
-0.9273212552070618,
0.1443813145160675,
0.0932632014155388,
0.2665371894836426,
0.36266782879829407,
0.6402910947799683,
0.32014018297195435,
0.030915971845388412,
-0.9312191605567932,
-0.3718109726905823,
-0.2700554132461548,
-1.1014580726623535,
0.9154956936836244,
-0.3406888246536255,
1.0077725648880005,
0.6577560901641846,
-0.3955195546150207,
-0.4148270785808563,
0.1855088472366333,
0.5062315464019775,
-0.3632686734199524,
-0.2277890294790268,
0.2560805082321167,
-0.3853609561920166,
-0.1604762226343155,
-0.13947471976280212,
-0.20147813856601715,
-0.4466346800327301,
-0.3761846721172333,
0.1443382054567337,
0.18205296993255615,
0.49359792470932007,
-0.22538000345230105,
-0.4996317625045776,
-0.22734887897968292,
-0.6034309267997742,
-0.7857939600944519,
-0.34923747181892395,
-0.3466345965862274,
0.21176661550998688,
-0.5101462006568909,
-0.3403083384037018,
0.000315118464641273,
0.236465722322464,
-0.10246097296476364,
-1.3013339042663574,
0.3419138789176941,
-0.32963496446609497,
-0.0901619717478752,
-0.5426247119903564,
0.22656650841236117,
-0.44758284091949463,
0.14151698350906372,
-0.1089438870549202,
0.5500766634941101,
-0.670711100101471,
-0.6227269768714905,
0.3894464075565338,
-0.27609574794769287,
0.7028202414512634,
-0.19697771966457367,
0.328511506319046,
0.5063360929489136,
0.4065195322036743,
0.2614171802997589,
-0.30274391174316406,
1.0393824577331543,
-0.7742937207221985,
-0.7874112129211426,
-0.6749666929244995,
0.5190866589546204,
0.004123548045754433,
-0.28312963247299194,
-0.038731709122657776,
-1.0142987966537476,
-0.09519586712121964,
0.8755272626876831,
0.4876938760280609,
0.7811151742935181,
0.85174959897995,
0.11826585978269576,
0.5373436808586121,
0.3649002015590668,
0.19064077734947205,
-0.00287026260048151,
-0.7305403351783752,
-0.015206154435873032,
-0.7899249196052551,
0.19407285749912265,
0.08596625179052353,
-0.28976231813430786,
-0.1525907665491104,
0.3798313438892365,
0.050306469202041626,
-0.5697937607765198,
0.4219021201133728,
0.276252806186676,
0.1559903472661972,
0.10030482709407806,
-0.4043720066547394,
-0.1969818025827408,
0.5739826560020447,
0.2116064727306366,
-1.4620544910430908,
-0.7802462577819824,
-0.24739810824394223,
-0.09791352599859238,
-0.4413802027702331,
0.21549351513385773,
-0.9520436525344848,
-0.08762510865926743,
0.08154498040676117,
-0.6154940724372864,
-1.01079523563385,
0.885427713394165,
0.6967288851737976,
0.27186504006385803,
-0.43194177746772766,
-0.11248451471328735,
0.7576630711555481,
0.4998855590820313,
0.0264343973249197,
0.9872855544090272,
0.5634694695472717,
0.053698331117630005,
0.19410227239131927,
0.3570743501186371,
-0.23670297861099243,
-0.9114483594894408,
0.07884842902421951,
0.7318344116210938,
0.44630110263824463,
0.08745364099740982,
-0.347101628780365,
-0.4314247667789459,
-0.5060274004936218,
0.003706763498485088,
0.44320008158683777,
-0.00788921769708395,
-0.1368623524904251,
-0.17391923069953918,
0.14473655819892883,
0.10927865654230118,
0.6974599361419678,
0.005052129738032818,
-0.016953065991401672,
-0.1256176233291626,
-0.036742497235536575,
0.5591985583305359,
-0.37619709968566895,
0.22429119050502777,
0.5403043031692505,
-0.8603790998458862,
-0.3456307053565979,
0.9292937517166138,
0.5074859261512756,
0.6310645937919617,
-0.3091641068458557,
0.46902573108673096,
0.7891915440559387,
0.4499550759792328,
0.2744995653629303,
0.2712305784225464,
-0.04349074140191078,
-0.3638863265514374,
0.7839881777763367,
0.7352104783058167,
-0.19457511603832245,
-0.5957832932472229,
-0.43704694509506226,
-1.084769368171692,
0.4904985725879669,
0.5385226011276245,
0.1891629993915558,
0.12338479608297348,
0.8315675258636475,
-0.07830192148685455,
1.0916285514831543,
-0.28066861629486084,
-1.3585069179534912,
0.5203898549079895,
0.08678033947944641,
-0.2566044330596924,
0.09484415501356123,
-0.0180208683013916,
1.0264745950698853,
-0.023572135716676712,
0.5864979028701782,
0.7625196576118469,
-0.2543414533138275,
-0.8877770900726318,
0.7611982822418213,
-0.06220436468720436,
0.937336564064026,
0.2704363465309143,
-0.37733694911003113,
0.5076137781143188,
-0.30641937255859375,
0.6252772808074951,
-0.0823579877614975,
-0.03736555948853493,
0.4131673276424408,
-0.6514252424240112,
0.12918265163898468,
-0.4483584463596344,
0.6750786304473877,
-0.37008383870124817,
-0.02324833907186985,
0.38027650117874146,
-0.26374951004981995,
0.4346931278705597,
0.42882832884788513,
-0.48798441886901855,
1.1882442235946655,
0.5132288336753845,
0.5284568667411804,
-0.03538886830210686,
0.29620853066444397,
-1.0683696269989014,
0.25936177372932434,
0.10404160618782043,
-0.25796034932136536,
0.027896970510482788,
-0.09225251525640488,
1.4811025857925415,
0.641173779964447,
-0.13838383555412292,
-0.3437179923057556,
0.5667019486427307,
-0.5400741696357727,
0.31090837717056274,
0.6470608115196228,
-0.3747067153453827,
-0.7364534735679626,
-0.07431528717279434,
0.5173454880714417,
-0.6578747034072876,
0.7107478976249695,
-0.7918999791145325,
-0.0648345872759819,
0.609937846660614,
-0.7329513430595398,
0.9741371870040894,
0.17912346124649048,
-0.02658769302070141,
0.5162150859832764,
-0.3978803157806397,
-0.7833885550498962,
-0.6497276425361633,
-0.3898126780986786,
-0.0952848568558693,
0.2663288116455078,
-0.1604052186012268,
0.373076468706131,
-0.8357769250869751,
-0.05217683315277099,
-0.2680160701274872,
0.8389158248901367,
0.6833611130714417,
-0.6712407469749451,
0.7406917214393616,
-0.44522786140441895,
-0.34645363688468933,
-0.27384576201438904,
-0.9878405928611756,
-0.8166060447692871,
0.06268279999494553,
0.38567957282066345,
-0.3274703919887543,
0.5296315550804138,
-0.11810623109340668,
0.23029841482639313,
0.08616159111261368,
-0.2195747196674347,
0.09430307894945145,
0.4057176411151886,
0.4892159104347229,
-0.1636916548013687,
-0.6071445345878601,
0.41256585717201233,
0.622254490852356,
-0.41223976016044617,
-0.6686707139015198,
-0.7474371790885925,
-0.8509522080421448,
-0.16754287481307983,
-0.9078601002693176,
-0.29653599858283997,
-0.5020652413368225,
0.4692700505256653,
0.01281109917908907,
-0.16071580350399017,
0.03388889133930206,
-0.020511148497462273,
0.5027827024459839,
-0.20729811489582065,
0.48107290267944336,
0.33669769763946533,
-0.5275911688804626,
0.48271527886390686,
0.2738940715789795,
-0.033152539283037186,
-0.13629786670207977,
-0.05965912342071533,
-0.26200807094573975,
0.04002794995903969,
-0.34095603227615356,
-3.986898899078369,
-0.46819332242012024,
-0.422744482755661,
-0.169097900390625,
0.6008929014205933,
0.058016058057546616,
-0.11401277780532836,
-0.3077819049358368,
-0.09595538675785063,
0.6723822355270386,
0.19367831945419312,
0.28304359316825867,
0.1609862744808197,
0.7567598819732666,
0.6889985799789429,
0.06907720118761063,
-0.04188092052936554,
-0.7434936165809631,
0.13321782648563385,
0.8456063270568848,
-0.10364038497209548,
-0.45084846019744873,
-0.4758241474628449,
0.43882066011428833,
-0.6432598829269409,
0.7217311859130859,
-0.24189773201942444,
0.12737572193145752,
-1.1008601188659668,
-0.3305315673351288,
0.14614742994308472,
-0.7819333076477051,
0.5287120342254639,
-0.055538054555654526,
0.1877404749393463,
-0.6907662153244019,
0.5616975426673889,
-0.4611121714115143,
-0.26109233498573303,
-0.12898315489292145,
-0.3724522292613983,
-0.7191406488418579,
-0.4425233602523804,
-0.644108235836029,
0.8424481153488159,
0.17532426118850708,
-0.5121750235557556,
-0.6467239260673523,
-0.0008507720194756985,
0.7866212129592896,
-0.02644744887948036,
-0.005045140627771616,
0.015782782807946205,
0.16334445774555206,
-0.1913367658853531,
-0.13697923719882965,
-0.6684983372688293,
0.18346354365348816,
-0.341105580329895,
0.5427411198616028,
0.3779832422733307,
-0.6778115034103394,
-0.2931850254535675,
-0.8805161714553833,
-0.4212774932384491,
-0.5368952751159668,
-1.3937891721725464,
-1.225494146347046,
0.4276703894138336,
1.1205668449401855,
-0.6005299687385559,
0.15732505917549133,
-0.3914784789085388,
-1.357046604156494,
-0.4707142114639282,
-0.1497287154197693,
-0.25035548210144043,
-0.34328439831733704,
0.39083412289619446,
0.1623048633337021,
-0.9275814294815063,
-0.6430015563964844,
0.2973862886428833,
0.5580436587333679,
-0.6232585310935974,
-0.6611042022705078,
0.4015969038009643,
-1.0232892036437988,
-0.2585645020008087,
-0.5431421399116516,
0.5021264553070068,
-0.48601630330085754,
-0.010242084041237833,
0.5862035155296326,
0.7316920161247253,
0.4036808013916016,
0.4269520044326782,
-0.705938458442688,
0.7747307419776917,
0.10164368897676468,
0.7887958884239197,
-0.9612497091293336,
0.12755516171455383,
0.06812842190265656,
-0.022603651508688927,
0.14722754061222076,
-0.5588505268096924,
-0.20689940452575684,
0.3557641804218292,
-0.6812759637832642,
0.2860803008079529,
-0.38954633474349976,
0.1759403496980667,
-0.5678874850273132,
-0.1692986786365509,
-0.14578519761562347,
0.5711379051208496,
1.0208125114440918,
0.7759483456611633,
-0.372348427772522,
-0.5460885763168335,
0.7190321683883667,
-0.6914990544319153,
0.13365162909030914,
-0.4854792356491089,
0.4054908752441406,
0.4502798914909363,
-0.3041122555732727,
-0.06726965308189392,
-0.05570871382951737,
-0.0455719493329525,
0.4785125255584717,
0.8867972493171692,
0.4107886850833893,
0.6121342182159424,
-0.20477132499217987,
-0.5598517656326294,
-0.6443566679954529,
-0.5905212759971619,
-0.5571200251579285,
0.17573799192905426,
-0.28621870279312134,
0.1685224026441574,
0.09719007462263109,
-0.04223639518022537,
-0.28623101115226746,
-0.1449810117483139,
-0.3789580464363098,
-0.5227636098861694,
-0.049728814512491226,
0.7849089503288269,
0.16792525351047516,
0.9849340915679932,
-0.6559549570083618,
0.35723909735679626,
-0.6822739243507385,
1.2873116731643677,
0.19993330538272855,
0.03512010723352432,
-0.6972134113311768,
0.18453484773635864,
-0.2437680810689926,
0.2156416028738022,
0.5230382680892944,
0.22020135819911957,
0.8314080238342285,
0.15627102553844452,
-0.7330264449119568,
0.3888184726238251,
-0.22034703195095065,
0.5457669496536255,
-0.48084837198257446,
-0.45576658844947815,
-0.09287727624177931,
-0.06968110054731369,
0.35125672817230225,
-0.4278119504451752,
0.2038476765155792,
0.11392722278833388,
0.9433983564376832,
-0.4097744226455689,
0.035297419875860214,
-0.4274404048919678,
-0.25100165605545044,
1.0943366289138794,
-0.07634022831916809,
-0.2925529479980469,
-0.7512530088424683,
0.2649727463722229,
-0.4078235328197479,
-0.3372223973274231,
0.05190162733197212,
0.005654910113662481,
-0.0001571219472680241,
-0.35445958375930786,
-0.7837416529655457,
0.1500556766986847,
0.4383024573326111,
0.6099548935890198,
0.05951934307813645,
-0.21325334906578064,
0.0199207104742527,
-0.22704418003559113,
-0.6481077671051025,
0.37442275881767273,
-1.015955924987793,
0.38637226819992065,
-0.06489371508359909,
-0.494120329618454,
0.3469836115837097,
0.15402406454086304,
-0.7660972476005554,
-0.7053225040435791,
-0.25964751839637756,
0.014004424214363098,
-0.2860170006752014,
-0.17565494775772095,
-0.45117494463920593,
-0.0031954257283359766,
0.09676837921142578,
-0.514464259147644,
0.41698193550109863,
-0.21642713248729703,
-0.5398141145706177,
-0.3647628426551819,
0.37005379796028137,
0.239425927400589,
-0.08833975344896317,
0.934946596622467,
-0.48340797424316406,
0.6241437792778015,
-0.7253676652908325,
-0.04303571209311485,
1.1125205755233765,
-0.15692919492721558,
-0.2914651036262512,
-0.5117168426513672,
0.21365483105182648,
0.4924402534961701,
0.5269662141799927,
0.0352792888879776,
-0.149167999625206,
-0.6019760370254517,
0.08245442807674408,
0.4900692105293274,
0.518824577331543,
-0.00005570516441366635,
-0.553304135799408,
0.22217543423175812,
0.5047767758369446,
0.135724738240242,
1.1511540412902832,
-0.3541218340396881,
-0.9712511897087096,
0.8353699445724487,
-0.39227569103240967,
-0.9117669463157654,
-0.26349931955337524,
0.05597023293375969,
0.20695461332798004,
0.3178807199001312,
1.0663238763809204,
0.5062212347984314,
0.7288597822189331,
0.09899299591779707,
0.553720235824585,
0.675009548664093,
-0.20067055523395536,
0.3138423264026642,
-0.6886593103408813,
-0.2910398542881012,
-1.3186300992965698,
-0.4684459865093231,
-0.095743365585804,
-0.1257995069026947,
-0.4858281314373016,
-0.4935407340526581,
-0.3266896903514862,
-0.3928797245025635,
-0.40803104639053345,
-0.9975396394729614,
0.4229583740234375,
0.37309643626213074,
0.4431034922599793,
0.30364808440208435,
-0.3765178918838501,
0.5616499185562134,
0.16904796659946442,
-0.7343707084655762,
0.2560209631919861,
0.6166825294494629,
0.3200829327106476,
-0.4483652710914612,
0.16224201023578644,
-0.31495288014411926,
-0.42713335156440735,
0.7270734906196594,
0.7049484848976135,
-0.0571461021900177,
0.04477125033736229,
-0.6647796034812927,
1.183672308921814,
0.36199676990509033,
0.046881116926670074,
0.4515796303749085,
0.9278061985969543,
0.31471705436706543,
-0.7073333859443665,
-0.3443860113620758,
0.5440067052841187,
-0.15020819008350372,
-0.541202962398529,
0.5203295946121216,
1.2192286252975464,
-0.9983593225479126,
-0.18758884072303772,
0.2758221924304962,
-0.6511523723602295,
-0.1584404855966568,
-0.236241415143013,
0.2692437767982483,
-0.4941152036190033,
0.4987454116344452,
-0.3331359028816223,
0.3163745701313019,
0.745529294013977,
-0.2905873656272888,
0.13602906465530396,
0.4679684340953827,
1.0555986166000366,
1.075700044631958,
0.5368486046791077,
-0.5118206739425659,
0.8668332099914551,
-0.5726966857910156,
-0.7811751961708069,
0.1938626915216446,
-0.1929349899291992,
0.1757766306400299,
0.6384295225143433,
0.26462844014167786,
0.9542630314826964,
0.19313029944896695,
1.264248013496399,
-0.6304428577423096,
0.0487106591463089,
-0.16211535036563873,
-0.7894763350486755,
0.3582514822483063,
-0.04153040423989296,
0.635784387588501,
0.6554391980171204,
-0.47010496258735657,
-0.8302040696144104,
-0.1350124627351761,
0.2568812072277069,
0.13614831864833832,
-0.2563649117946625,
-1.0434694290161133,
0.3232482671737671,
0.47882452607154846,
0.4298652410507202,
1.0563770532608032,
-0.28917592763900757,
-0.8533256649971008,
0.10648339986801147,
0.6376127004623413,
-0.20832888782024384,
0.2370245456695557,
0.0018312990432605147,
-0.2034837007522583,
0.01051164511591196,
-1.105310082435608,
0.29724350571632385,
0.15604574978351593,
0.1973688006401062,
0.44394731521606445,
0.3974513411521912,
-0.13625948131084442,
0.9571986198425292,
0.2257384955883026,
0.2323588728904724,
-0.5583669543266296,
-0.7854922413825989,
0.1647188365459442,
-1.6098142862319946,
0.318587988615036,
-0.13399995863437653,
-0.2172701060771942,
-0.767514705657959,
-0.5813586711883545,
-0.3195130527019501,
-0.04894036799669266,
0.2929930090904236,
-0.8213384747505188,
0.07181350141763687,
0.7469993829727173,
0.6407455801963806,
0.16365697979927063,
0.7870153188705444,
0.6524736881256104,
0.6399973630905151,
-0.04992736503481865,
-0.03959266096353531,
-0.2512352466583252,
0.8448855876922607,
-0.1422702670097351,
0.1216789186000824,
-1.2647287845611572,
0.5931149125099182,
0.7186052203178406,
-0.06118432432413101,
-1.1942816972732544,
-0.17677085101604462,
0.31543800234794617,
-0.32252824306488037,
0.8255583047866821,
-0.14529970288276672,
-0.2695446312427521,
-0.33378756046295166,
-0.1653425395488739,
0.1454019844532013,
-0.3920115828514099,
0.912214994430542,
-0.7279734015464783,
0.7374742031097412,
0.933980405330658,
0.13429680466651917,
-0.514870285987854,
0.3989711999893189,
-0.11613689363002776,
0.4022413492202759,
-0.9990655779838562,
-0.33749932050704956,
-0.4334589838981629,
-1.376373291015625,
-0.2993924915790558,
-0.09454808384180068,
-0.01314175222069025,
-0.001090060803107917,
0.2137461006641388,
0.2938512861728668,
0.17508235573768616,
0.8260607123374939,
-0.7218498587608337,
0.2414487451314926,
-0.47296759486198425,
-0.3002610504627228,
-1.238540768623352,
0.08663805574178696,
0.6805586218833923,
0.5909030437469482,
-0.42807504534721375,
-0.22887496650218964,
0.47537800669670105,
-1.0474627017974854,
0.6338009238243103,
0.06548397243022919,
0.4971011281013489,
1.3484878540039063
]
],
"regenerate": true
}
}
}

View file

@ -0,0 +1,785 @@
---
source: dump/src/reader/mod.rs
expression: document
---
{
"id": "e1",
"desc": "natural vector",
"_vectors": {
"default": {
"embeddings": [
[
-0.2979458272457123,
-0.5288640856742859,
-0.019957859069108963,
-0.18495318293571472,
0.7429973483085632,
0.5238497257232666,
0.432366281747818,
0.32744166254997253,
0.0020762972999364138,
-0.9507834911346436,
-0.35097137093544006,
0.08469701558351517,
-1.4176613092422483,
0.4647577106952667,
-0.69340580701828,
1.0372896194458008,
0.3716741800308227,
0.06031008064746857,
-0.6152024269104004,
0.007914665155112743,
0.7954924702644348,
-0.20773003995418549,
0.09376765787601472,
0.04508133605122566,
-0.2084471583366394,
-0.1518009901046753,
0.018195509910583496,
-0.07044368237257004,
-0.18119366466999057,
-0.4480230510234833,
0.3822529911994934,
0.1911812424659729,
0.4674372375011444,
0.06963984668254852,
-0.09341949224472046,
0.005675444379448891,
-0.6774799227714539,
-0.7066726684570313,
-0.39256376028060913,
0.04005039855837822,
0.2084812968969345,
-0.7872875928878784,
-0.8205880522727966,
0.2919981777667999,
-0.06004738807678223,
-0.4907574355602264,
-1.5937862396240234,
0.24249385297298431,
-0.14709846675395966,
-0.11860740929841997,
-0.8299489617347717,
0.472964346408844,
-0.497518390417099,
-0.22205302119255063,
-0.4196169078350067,
0.32697558403015137,
-0.360930860042572,
-0.9789686799049376,
0.1887447088956833,
-0.403737336397171,
0.18524253368377688,
0.3768732249736786,
0.3666233420372009,
0.3511938452720642,
0.6985810995101929,
0.41721710562705994,
0.09754953533411026,
0.6204307079315186,
-1.0762996673583984,
-0.06263761967420578,
-0.7376511693000793,
0.6849768161773682,
-0.1745152473449707,
-0.40449759364128113,
0.20757411420345304,
-0.8424443006515503,
0.330015629529953,
0.3489064872264862,
1.0954371690750122,
0.8487558960914612,
1.1076823472976685,
0.61430823802948,
0.4155903458595276,
0.4111340939998626,
0.05753209814429283,
-0.06429877132177353,
-0.765606164932251,
-0.41703930497169495,
-0.508820652961731,
0.19859947264194489,
-0.16607828438282013,
-0.28112146258354187,
0.11032675206661224,
0.38809511065483093,
-0.36498191952705383,
-0.48671194911003113,
0.6755134463310242,
0.03958442434668541,
0.4478721618652344,
-0.10335399955511092,
-0.9546685814857484,
-0.6087718605995178,
0.17498846352100372,
0.08320838958024979,
-1.4478336572647097,
-0.605027437210083,
-0.5867993235588074,
-0.14711688458919525,
-0.5447602272033691,
-0.026259321719408035,
-0.6997418403625488,
-0.07349082082509995,
0.10638900846242905,
-0.7133527398109436,
-0.9396815299987792,
1.087092399597168,
1.1885089874267578,
0.4011896848678589,
-0.4089202582836151,
-0.10938972979784012,
0.6726722121238708,
0.24576938152313232,
-0.24247920513153076,
1.1499971151351929,
0.47813335061073303,
-0.05331678315997124,
0.32338133454322815,
0.4870913326740265,
-0.23144258558750153,
-1.2023426294326782,
0.2349330335855484,
1.080536961555481,
0.29334118962287903,
0.391574501991272,
-0.15818795561790466,
-0.2948290705680847,
-0.024689948186278343,
0.06602869182825089,
0.5937030911445618,
-0.047901444137096405,
-0.512734591960907,
-0.35780075192451477,
0.28751692175865173,
0.4298716187477112,
0.9242428541183472,
-0.17208744585514069,
0.11515070497989656,
-0.0335976779460907,
-0.3422986567020416,
0.5344581604003906,
0.19895796477794647,
0.33001241087913513,
0.6390730142593384,
-0.6074934005737305,
-0.2553696632385254,
0.9644920229911804,
0.2699219584465027,
0.6403993368148804,
-0.6380003690719604,
-0.027310986071825027,
0.638815701007843,
0.27719101309776306,
-0.13553589582443237,
0.750195324420929,
0.1224869191646576,
-0.20613941550254825,
0.8444448709487915,
0.16200250387191772,
-0.24750925600528717,
-0.739950954914093,
-0.28443849086761475,
-1.176282525062561,
0.516107976436615,
0.3774825632572174,
0.10906043648719788,
0.07962015271186829,
0.7384604215621948,
-0.051241904497146606,
1.1730090379714966,
-0.4828610122203827,
-1.404372215270996,
0.8811132311820984,
-0.3839482367038727,
0.022516896948218346,
-0.0491158664226532,
-0.43027013540267944,
1.2049334049224854,
-0.27309560775756836,
0.6883630752563477,
0.8264574408531189,
-0.5020735263824463,
-0.4874092042446137,
0.6007202863693237,
-0.4965405762195587,
1.1302915811538696,
0.032572727650403976,
-0.3731859028339386,
0.658271849155426,
-0.9023059010505676,
0.7400162220001221,
0.014550759457051754,
-0.19699542224407196,
0.2319706380367279,
-0.789058268070221,
-0.14905710518360138,
-0.5826214551925659,
0.207652747631073,
-0.4507439732551574,
-0.3163885474205017,
0.3604124188423157,
-0.45119962096214294,
0.3428427278995514,
0.3005594313144684,
-0.36026081442832947,
1.1014249324798584,
0.40884315967559814,
0.34991952776908875,
-0.1806638240814209,
0.27440476417541504,
-0.7118373513221741,
0.4645499587059021,
0.214790478348732,
-0.2343102991580963,
0.10500429570674896,
-0.28034430742263794,
1.2267805337905884,
1.0561333894729614,
-0.497364342212677,
-0.6143305897712708,
0.24963727593421936,
-0.33136463165283203,
-0.01473914459347725,
0.495918869972229,
-0.6985538005828857,
-1.0033197402954102,
0.35937801003456116,
0.6325868368148804,
-0.6808838844299316,
1.0354058742523191,
-0.7214401960372925,
-0.33318862318992615,
0.874398410320282,
-0.6594992280006409,
0.6830640435218811,
-0.18534131348133087,
0.024834271520376205,
0.19901277124881744,
-0.5992477536201477,
-1.2126628160476685,
-0.9245557188987732,
-0.3898217976093292,
-0.1286519467830658,
0.4217943847179413,
-0.1143646091222763,
0.5630772709846497,
-0.5240639448165894,
0.21152715384960177,
-0.3792001008987427,
0.8266305327415466,
1.170984387397766,
-0.8072142004966736,
0.11382893472909927,
-0.17953898012638092,
-0.1789460331201553,
-0.15078622102737427,
-1.2082908153533936,
-0.7812382578849792,
-0.10903695970773696,
0.7303897142410278,
-0.39054441452026367,
0.19511254131793976,
-0.09121843427419662,
0.22400228679180145,
0.30143046379089355,
0.1141919493675232,
0.48112115263938904,
0.7307931780815125,
0.09701362252235413,
-0.2795647978782654,
-0.3997688889503479,
0.5540812611579895,
0.564578115940094,
-0.40065160393714905,
-0.3629159033298493,
-0.3789091110229492,
-0.7298538088798523,
-0.6996853351593018,
-0.4477842152118683,
-0.289089560508728,
-0.6430277824401855,
0.2344944179058075,
0.3742927014827728,
-0.5079357028007507,
0.28841453790664673,
0.06515737622976303,
0.707315981388092,
0.09498685598373412,
0.8365515470504761,
0.10002726316452026,
-0.7695478200912476,
0.6264724135398865,
0.7562043070793152,
-0.23112858831882477,
-0.2871039807796478,
-0.25010058283805847,
0.2783474028110504,
-0.03224996477365494,
-0.9119359850883484,
-3.6940200328826904,
-0.5099936127662659,
-0.1604711413383484,
0.17453284561634064,
0.41759559512138367,
0.1419190913438797,
-0.11362407356500626,
-0.33312007784843445,
0.11511333286762238,
0.4667884409427643,
-0.0031647447030991316,
0.15879854559898376,
0.3042248487472534,
0.5404849052429199,
0.8515422344207764,
0.06286454200744629,
0.43790125846862793,
-0.8682025074958801,
-0.06363756954669952,
0.5547921657562256,
-0.01483887154608965,
-0.07361344993114471,
-0.929947018623352,
0.3502565622329712,
-0.5080993175506592,
1.0380364656448364,
-0.2017953395843506,
0.21319580078125,
-1.0763001441955566,
-0.556368887424469,
0.1949922740459442,
-0.6445739269256592,
0.6791343688964844,
0.21188358962535855,
0.3736183941364288,
-0.21800459921360016,
0.7597446441650391,
-0.3732394874095917,
-0.4710160195827484,
0.025146087631583217,
0.05341297015547752,
-0.9522109627723694,
-0.6000866889953613,
-0.08469046652317047,
0.5966026186943054,
0.3444081246852875,
-0.461188405752182,
-0.5279349088668823,
0.10296865552663804,
0.5175143480300903,
-0.20671147108078003,
0.13392412662506104,
0.4812754988670349,
0.2993808686733246,
-0.3005635440349579,
0.5141698122024536,
-0.6239235401153564,
0.2877119481563568,
-0.4452739953994751,
0.5621107816696167,
0.5047508478164673,
-0.4226335883140564,
-0.18578553199768064,
-1.1967322826385498,
0.28178197145462036,
-0.8692031502723694,
-1.1812998056411743,
-1.4526212215423584,
0.4645712077617645,
0.9327932000160216,
-0.6560136675834656,
0.461549699306488,
-0.5621527433395386,
-1.328449010848999,
-0.08676894754171371,
0.00021918353741057217,
-0.18864136934280396,
0.1259666532278061,
0.18240638077259064,
-0.14919660985469818,
-0.8965857625007629,
-0.7539900541305542,
0.013973715715110302,
0.504276692867279,
-0.704748272895813,
-0.6428424119949341,
0.6303996443748474,
-0.5404738187789917,
-0.31176653504371643,
-0.21262824535369873,
0.18736739456653595,
-0.7998970746994019,
0.039946746081113815,
0.7390344738960266,
0.4283199906349182,
0.3795057237148285,
0.07204607129096985,
-0.9230587482452391,
0.9440426230430604,
0.26272690296173096,
0.5598306655883789,
-1.0520871877670288,
-0.2677186131477356,
-0.1888762265443802,
0.30426350235939026,
0.4746131896972656,
-0.5746733546257019,
-0.4197768568992615,
0.8565112948417664,
-0.6767723560333252,
0.23448683321475983,
-0.2010004222393036,
0.4112907350063324,
-0.6497949957847595,
-0.418667733669281,
-0.4950824975967407,
0.44438859820365906,
1.026281714439392,
0.482397586107254,
-0.26220494508743286,
-0.3640787005424499,
0.5907743573188782,
-0.8771642446517944,
0.09708411991596222,
-0.3671700060367584,
0.4331349730491638,
0.619417667388916,
-0.2684665620326996,
-0.5123821496963501,
-0.1502324342727661,
-0.012190685607492924,
0.3580845892429352,
0.8617186546325684,
0.3493645489215851,
1.0270192623138428,
0.18297909200191495,
-0.5881339311599731,
-0.1733516901731491,
-0.5040576457977295,
-0.340370237827301,
-0.26767754554748535,
-0.28570041060447693,
-0.032928116619586945,
0.6029254794120789,
0.17397655546665192,
0.09346921741962431,
0.27815181016921997,
-0.46699589490890503,
-0.8148876428604126,
-0.3964351713657379,
0.3812595009803772,
0.13547226786613464,
0.7126688361167908,
-0.3473474085330963,
-0.06573959439992905,
-0.6483767032623291,
1.4808889627456665,
0.30924928188323975,
-0.5085946917533875,
-0.8613000512123108,
0.3048902451992035,
-0.4241599142551422,
0.15909206867218018,
0.5764641761779785,
-0.07879110425710678,
1.015336513519287,
0.07599356025457382,
-0.7025855779647827,
0.30047643184661865,
-0.35094937682151794,
0.2522146999835968,
-0.2338722199201584,
-0.8326804637908936,
-0.13695412874221802,
-0.03452421352267265,
0.47974953055381775,
-0.18385636806488037,
0.32438594102859497,
0.1797013282775879,
0.787494957447052,
-0.12579888105392456,
-0.07507286965847015,
-0.4389670491218567,
0.2720070779323578,
0.8138866424560547,
0.01974171027541161,
-0.3057698905467987,
-0.6709924936294556,
0.0885881632566452,
-0.2862754464149475,
0.03475658595561981,
-0.1285519152879715,
0.3838353455066681,
-0.2944154739379883,
-0.4204859137535095,
-0.4416137933731079,
0.13426260650157928,
0.36733248829841614,
0.573428750038147,
-0.14928072690963745,
-0.026076916605234143,
0.33286052942276,
-0.5340145826339722,
-0.17279052734375,
-0.01154550164937973,
-0.6620771884918213,
0.18390542268753052,
-0.08265615254640579,
-0.2489682286977768,
0.2429984211921692,
-0.044153645634651184,
-0.986578404903412,
-0.33574509620666504,
-0.5387663841247559,
0.19767941534519196,
0.12540718913078308,
-0.3403128981590271,
-0.4154576361179352,
0.17275673151016235,
0.09407442808151244,
-0.5414086580276489,
0.4393929839134216,
0.1725579798221588,
-0.4998118281364441,
-0.6926208138465881,
0.16552448272705078,
0.6659538149833679,
-0.10949844866991044,
0.986426830291748,
0.01748848147690296,
0.4003709554672241,
-0.5430638194084167,
0.35347291827201843,
0.6887399554252625,
0.08274628221988678,
0.13407137989997864,
-0.591465950012207,
0.3446292281150818,
0.6069018244743347,
0.1935492902994156,
-0.0989871397614479,
0.07008486241102219,
-0.8503749370574951,
-0.09507356584072112,
0.6259510517120361,
0.13934025168418884,
0.06392545253038406,
-0.4112265408039093,
-0.08475656062364578,
0.4974113404750824,
-0.30606114864349365,
1.111435890197754,
-0.018766529858112335,
-0.8422622680664063,
0.4325508773326874,
-0.2832120656967163,
-0.4859798848628998,
-0.41498348116874695,
0.015977520495653152,
0.5292825698852539,
0.4538311660289765,
1.1328668594360352,
0.22632671892642975,
0.7918671369552612,
0.33401933312416077,
0.7306135296821594,
0.3548600673675537,
0.12506209313869476,
0.8573207855224609,
-0.5818327069282532,
-0.6953738927841187,
-1.6171947717666626,
-0.1699674427509308,
0.6318262815475464,
-0.05671752244234085,
-0.28145185112953186,
-0.3976689279079437,
-0.2041076272726059,
-0.5495951175689697,
-0.5152917504310608,
-0.9309796094894408,
0.101932130753994,
0.1367802917957306,
0.1490798443555832,
0.5304336547851563,
-0.5082434415817261,
0.06688683480024338,
0.14657628536224365,
-0.782435953617096,
0.2962816655635834,
0.6965363621711731,
0.8496337532997131,
-0.3042965829372406,
0.04343798756599426,
0.0330701619386673,
-0.5662598013877869,
1.1086925268173218,
0.756072998046875,
-0.204134538769722,
0.2404300570487976,
-0.47848284244537354,
1.3659011125564575,
0.5645433068275452,
-0.15836156904697418,
0.43395575881004333,
0.5944653749465942,
1.0043466091156006,
-0.49446743726730347,
-0.5954391360282898,
0.5341240763664246,
0.020598189905285835,
-0.4036853015422821,
0.4473709762096405,
1.1998231410980225,
-0.9317775368690492,
-0.23321466147899628,
0.2052552700042725,
-0.7423108816146851,
-0.19917210936546328,
-0.1722569614648819,
-0.034072667360305786,
-0.00671181408688426,
0.46396249532699585,
-0.1372445821762085,
0.053376372903585434,
0.7392690777778625,
-0.38447609543800354,
0.07497968524694443,
0.5197252631187439,
1.3746477365493774,
0.9060075879096984,
0.20000585913658145,
-0.4053704142570496,
0.7497360110282898,
-0.34087055921554565,
-1.101803183555603,
0.273650586605072,
-0.5125769376754761,
0.22472351789474487,
0.480757474899292,
-0.19845178723335263,
0.8857700824737549,
0.30752456188201904,
1.1109285354614258,
-0.6768012642860413,
0.524367094039917,
-0.22495046257972717,
-0.4224412739276886,
0.40753406286239624,
-0.23133376240730288,
0.3297771215438843,
0.4905449151992798,
-0.6813114285469055,
-0.7543983459472656,
-0.5599071383476257,
0.14351597428321838,
-0.029278717935085297,
-0.3970443606376648,
-0.303079217672348,
0.24161772429943085,
0.008353390730917454,
-0.0062365154735744,
1.0824860334396362,
-0.3704061508178711,
-1.0337258577346802,
0.04638749733567238,
1.163011074066162,
-0.31737643480300903,
0.013986887410283089,
0.19223114848136905,
-0.2260770797729492,
-0.210910826921463,
-1.0191949605941772,
0.22356095910072327,
0.09353553503751756,
0.18096882104873657,
0.14867214858531952,
0.43408671021461487,
-0.33312076330184937,
0.8173948526382446,
0.6428242921829224,
0.20215003192424777,
-0.6634518504142761,
-0.4132290482521057,
0.29815030097961426,
-1.579406976699829,
-0.0981958732008934,
-0.03941014781594277,
0.1709178239107132,
-0.5481140613555908,
-0.5338194966316223,
-0.3528362512588501,
-0.11561278253793716,
-0.21793591976165771,
-1.1570470333099363,
0.2157980799674988,
0.42083489894866943,
0.9639263153076172,
0.09747201204299928,
0.15671424567699432,
0.4034591615200043,
0.6728067994117737,
-0.5216875672340393,
0.09657668322324751,
-0.2416689097881317,
0.747975766658783,
0.1021689772605896,
0.11652665585279463,
-1.0484966039657593,
0.8489304780960083,
0.7169828414916992,
-0.09012343734502792,
-1.3173753023147583,
0.057890523225069046,
-0.006231260951608419,
-0.1018214002251625,
0.936040461063385,
-0.0502331368625164,
-0.4284322261810303,
-0.38209280371665955,
-0.22668412327766416,
0.0782942995429039,
-0.4881664514541626,
0.9268959760665894,
0.001867273123934865,
0.42261114716529846,
0.8283362984657288,
0.4256294071674347,
-0.7965338826179504,
0.4840078353881836,
-0.19861412048339844,
0.33977967500686646,
-0.4604192078113556,
-0.3107339143753052,
-0.2839638590812683,
-1.5734281539916992,
0.005220232997089624,
0.09239906817674635,
-0.7828494906425476,
-0.1397123783826828,
0.2576255202293396,
0.21372435986995697,
-0.23169949650764465,
0.4016408920288086,
-0.462497353553772,
-0.2186472862958908,
-0.5617868900299072,
-0.3649831712245941,
-1.1585862636566162,
-0.08222806453704834,
0.931126832962036,
0.4327389597892761,
-0.46451422572135925,
-0.5430706143379211,
-0.27434298396110535,
-0.9479129314422609,
0.1845661848783493,
0.3972720205783844,
0.4883299469947815,
1.04031240940094
]
],
"regenerate": true
}
}
}

View file

@ -0,0 +1,780 @@
---
source: dump/src/reader/mod.rs
expression: document
---
{
"id": "e0",
"desc": "overriden vector",
"_vectors": {
"default": [
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1
]
}
}

View file

@ -0,0 +1,262 @@
use std::fs::{self, File};
use std::io::{BufRead, BufReader};
use std::path::{Path, PathBuf};
use serde::Deserialize;
use tempfile::TempDir;
use time::OffsetDateTime;
use super::compat::v1_to_v2::CompatV1ToV2;
use super::Document;
use crate::{IndexMetadata, Result, Version};
pub mod settings;
pub mod update;
pub struct V1Reader {
pub dump: TempDir,
pub db_version: String,
pub dump_version: crate::Version,
indexes: Vec<V1Index>,
}
pub struct IndexUuid {
pub name: String,
pub uid: String,
}
pub type Task = self::update::UpdateStatus;
struct V1Index {
metadata: IndexMetadataV1,
path: PathBuf,
}
impl V1Index {
pub fn new(path: PathBuf, metadata: Index) -> Self {
Self { metadata: metadata.into(), path }
}
pub fn open(&self) -> Result<V1IndexReader> {
V1IndexReader::new(&self.path, self.metadata.clone())
}
pub fn metadata(&self) -> &IndexMetadata {
&self.metadata.metadata
}
}
pub struct V1IndexReader {
metadata: IndexMetadataV1,
documents: BufReader<File>,
settings: BufReader<File>,
updates: BufReader<File>,
}
impl V1IndexReader {
pub fn new(path: &Path, metadata: IndexMetadataV1) -> Result<Self> {
Ok(V1IndexReader {
metadata,
documents: BufReader::new(File::open(path.join("documents.jsonl"))?),
settings: BufReader::new(File::open(path.join("settings.json"))?),
updates: BufReader::new(File::open(path.join("updates.jsonl"))?),
})
}
pub fn metadata(&self) -> &IndexMetadata {
&self.metadata.metadata
}
pub fn documents(&mut self) -> Result<impl Iterator<Item = Result<Document>> + '_> {
Ok((&mut self.documents)
.lines()
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
}
pub fn settings(&mut self) -> Result<self::settings::Settings> {
Ok(serde_json::from_reader(&mut self.settings)?)
}
pub fn tasks(self) -> impl Iterator<Item = Result<Task>> {
self.updates.lines().map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) })
}
}
impl V1Reader {
pub fn open(dump: TempDir) -> Result<Self> {
let meta_file = fs::read(dump.path().join("metadata.json"))?;
let metadata: Metadata = serde_json::from_reader(&*meta_file)?;
let mut indexes = Vec::new();
for index in metadata.indexes.into_iter() {
let index_path = dump.path().join(&index.uid);
indexes.push(V1Index::new(index_path, index));
}
Ok(V1Reader {
dump,
indexes,
db_version: metadata.db_version,
dump_version: metadata.dump_version,
})
}
pub fn to_v2(self) -> CompatV1ToV2 {
CompatV1ToV2 { from: self }
}
pub fn index_uuid(&self) -> Vec<IndexUuid> {
self.indexes
.iter()
.map(|index| IndexUuid {
name: index.metadata.name.to_owned(),
uid: index.metadata().uid.to_owned(),
})
.collect()
}
pub fn version(&self) -> Version {
Version::V1
}
pub fn date(&self) -> Option<OffsetDateTime> {
None
}
pub fn indexes(&self) -> Result<impl Iterator<Item = Result<V1IndexReader>> + '_> {
Ok(self.indexes.iter().map(|index| index.open()))
}
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct Index {
pub name: String,
pub uid: String,
#[serde(with = "time::serde::rfc3339")]
created_at: OffsetDateTime,
#[serde(with = "time::serde::rfc3339")]
updated_at: OffsetDateTime,
pub primary_key: Option<String>,
}
#[derive(Clone)]
pub struct IndexMetadataV1 {
pub name: String,
pub metadata: crate::IndexMetadata,
}
impl From<Index> for IndexMetadataV1 {
fn from(index: Index) -> Self {
IndexMetadataV1 {
name: index.name,
metadata: crate::IndexMetadata {
uid: index.uid,
primary_key: index.primary_key,
created_at: index.created_at,
updated_at: index.updated_at,
},
}
}
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct Metadata {
pub indexes: Vec<Index>,
pub db_version: String,
pub dump_version: crate::Version,
}
#[cfg(test)]
pub(crate) mod test {
use std::fs::File;
use std::io::BufReader;
use flate2::bufread::GzDecoder;
use meili_snap::insta;
use tempfile::TempDir;
use super::*;
#[test]
fn read_dump_v1() {
let dump = File::open("tests/assets/v1.dump").unwrap();
let dir = TempDir::new().unwrap();
let mut dump = BufReader::new(dump);
let gz = GzDecoder::new(&mut dump);
let mut archive = tar::Archive::new(gz);
archive.unpack(dir.path()).unwrap();
let dump = V1Reader::open(dir).unwrap();
// top level infos
assert_eq!(dump.date(), None);
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
let mut products = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut dnd_spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "2022-10-02T13:23:39.976870431Z",
"updatedAt": "2022-10-02T13:27:54.353262482Z"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"b01c8371aea4c7171af0d4d846a2bdca");
// products tasks
let tasks = products.tasks().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"91de507f206ad21964584021932ba7a7");
// movies
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "2022-10-02T13:15:29.477512777Z",
"updatedAt": "2022-10-02T13:21:12.671204856Z"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"b63dbed5bbc059f3e32bc471ae699bf5");
// movies tasks
let tasks = movies.tasks().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"55eef4de2bef7e84c5ce0bee47488f56");
// spells
insta::assert_json_snapshot!(dnd_spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "2022-10-02T13:38:26.358882984Z",
"updatedAt": "2022-10-02T13:38:26.385609433Z"
}
"###);
insta::assert_json_snapshot!(dnd_spells.settings().unwrap());
let documents = dnd_spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"aa24c0cfc733d66c396237ad44263bed");
// spells tasks
let tasks = dnd_spells.tasks().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"836dd7d64d5ad20ad901c44b1b161a4c");
}
}

View file

@ -0,0 +1,93 @@
use std::collections::{BTreeMap, BTreeSet};
use std::result::Result as StdResult;
use std::str::FromStr;
use once_cell::sync::Lazy;
use regex::Regex;
use serde::{Deserialize, Deserializer, Serialize};
#[derive(Default, Clone, Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
pub struct Settings {
#[serde(default, deserialize_with = "deserialize_some")]
pub ranking_rules: Option<Option<Vec<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub distinct_attribute: Option<Option<String>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub searchable_attributes: Option<Option<Vec<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub displayed_attributes: Option<Option<BTreeSet<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub stop_words: Option<Option<BTreeSet<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub synonyms: Option<Option<BTreeMap<String, Vec<String>>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub attributes_for_faceting: Option<Option<Vec<String>>>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SettingsUpdate {
pub ranking_rules: UpdateState<Vec<RankingRule>>,
pub distinct_attribute: UpdateState<String>,
pub primary_key: UpdateState<String>,
pub searchable_attributes: UpdateState<Vec<String>>,
pub displayed_attributes: UpdateState<BTreeSet<String>>,
pub stop_words: UpdateState<BTreeSet<String>>,
pub synonyms: UpdateState<BTreeMap<String, Vec<String>>>,
pub attributes_for_faceting: UpdateState<Vec<String>>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum UpdateState<T> {
Update(T),
Clear,
Nothing,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum RankingRule {
Typo,
Words,
Proximity,
Attribute,
WordsPosition,
Exactness,
Asc(String),
Desc(String),
}
static ASC_DESC_REGEX: Lazy<Regex> = Lazy::new(|| Regex::new(r"(asc|desc)\(([\w_-]+)\)").unwrap());
impl FromStr for RankingRule {
type Err = ();
fn from_str(s: &str) -> Result<Self, Self::Err> {
Ok(match s {
"typo" => Self::Typo,
"words" => Self::Words,
"proximity" => Self::Proximity,
"attribute" => Self::Attribute,
"wordsPosition" => Self::WordsPosition,
"exactness" => Self::Exactness,
text => {
let caps = ASC_DESC_REGEX.captures(text).ok_or(())?;
let order = caps.get(1).unwrap().as_str();
let field_name = caps.get(2).unwrap().as_str();
match order {
"asc" => Self::Asc(field_name.to_string()),
"desc" => Self::Desc(field_name.to_string()),
_ => return Err(()),
}
}
})
}
}
// Any value that is present is considered Some value, including null.
fn deserialize_some<'de, T, D>(deserializer: D) -> StdResult<Option<T>, D::Error>
where
T: Deserialize<'de>,
D: Deserializer<'de>,
{
Deserialize::deserialize(deserializer).map(Some)
}

View file

@ -0,0 +1,24 @@
---
source: dump/src/reader/v1/mod.rs
expression: dnd_spells.settings().unwrap()
---
{
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"wordsPosition",
"exactness"
],
"distinctAttribute": null,
"searchableAttributes": [
"*"
],
"displayedAttributes": [
"*"
],
"stopWords": [],
"synonyms": {},
"attributesForFaceting": []
}

View file

@ -0,0 +1,38 @@
---
source: dump/src/reader/v1/mod.rs
expression: products.settings().unwrap()
---
{
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"wordsPosition",
"exactness"
],
"distinctAttribute": null,
"searchableAttributes": [
"*"
],
"displayedAttributes": [
"*"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"attributesForFaceting": []
}

View file

@ -0,0 +1,28 @@
---
source: dump/src/reader/v1/mod.rs
expression: movies.settings().unwrap()
---
{
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"wordsPosition",
"exactness",
"asc(release_date)"
],
"distinctAttribute": null,
"searchableAttributes": [
"*"
],
"displayedAttributes": [
"*"
],
"stopWords": [],
"synonyms": {},
"attributesForFaceting": [
"id",
"genres"
]
}

View file

@ -0,0 +1,74 @@
use serde::{Deserialize, Serialize};
use time::OffsetDateTime;
use super::settings::SettingsUpdate;
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "name")]
pub enum UpdateType {
ClearAll,
Customs,
DocumentsAddition { number: usize },
DocumentsPartial { number: usize },
DocumentsDeletion { number: usize },
Settings { settings: Box<SettingsUpdate> },
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct ProcessedUpdateResult {
pub update_id: u64,
#[serde(rename = "type")]
pub update_type: UpdateType,
#[serde(skip_serializing_if = "Option::is_none")]
pub error: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub error_type: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub error_code: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub error_link: Option<String>,
pub duration: f64, // in seconds
#[serde(with = "time::serde::rfc3339")]
pub enqueued_at: OffsetDateTime,
#[serde(with = "time::serde::rfc3339")]
pub processed_at: OffsetDateTime,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct EnqueuedUpdateResult {
pub update_id: u64,
#[serde(rename = "type")]
pub update_type: UpdateType,
#[serde(with = "time::serde::rfc3339")]
pub enqueued_at: OffsetDateTime,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase", tag = "status")]
pub enum UpdateStatus {
Enqueued {
#[serde(flatten)]
content: EnqueuedUpdateResult,
},
Failed {
#[serde(flatten)]
content: ProcessedUpdateResult,
},
Processed {
#[serde(flatten)]
content: ProcessedUpdateResult,
},
}
impl UpdateStatus {
pub fn enqueued_at(&self) -> &OffsetDateTime {
match self {
UpdateStatus::Enqueued { content } => &content.enqueued_at,
UpdateStatus::Failed { content } | UpdateStatus::Processed { content } => {
&content.enqueued_at
}
}
}
}

View file

@ -0,0 +1,14 @@
use http::StatusCode;
use serde::Deserialize;
#[derive(Debug, Deserialize, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(rename_all = "camelCase")]
pub struct ResponseError {
#[serde(skip)]
pub code: StatusCode,
pub message: String,
pub error_code: String,
pub error_type: String,
pub error_link: String,
}

View file

@ -0,0 +1,18 @@
use serde::Deserialize;
use uuid::Uuid;
use super::Settings;
#[derive(Deserialize, Debug, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct IndexUuid {
pub uid: String,
pub uuid: Uuid,
}
#[derive(Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct DumpMeta {
pub settings: Settings<super::Unchecked>,
pub primary_key: Option<String>,
}

View file

@ -0,0 +1,416 @@
//! ```text
//! .
//! ├── indexes
//! │   ├── index-40d14c5f-37ae-4873-9d51-b69e014a0d30
//! │   │   ├── documents.jsonl
//! │   │   └── meta.json
//! │   ├── index-88202369-4524-4410-9b3d-3e924c867fec
//! │   │   ├── documents.jsonl
//! │   │   └── meta.json
//! │   ├── index-b7f2d03b-bf9b-40d9-a25b-94dc5ec60c32
//! │   │   ├── documents.jsonl
//! │   │   └── meta.json
//! │   └── index-dc9070b3-572d-4f30-ab45-d4903ab71708
//! │   ├── documents.jsonl
//! │   └── meta.json
//! ├── index_uuids
//! │   └── data.jsonl
//! ├── metadata.json
//! └── updates
//! ├── data.jsonl
//! └── update_files
//! └── update_202573df-718b-4d80-9a65-2ee397c23dc3
//! ```
use std::fs::{self, File};
use std::io::{BufRead, BufReader};
use std::path::Path;
use serde::{Deserialize, Serialize};
use tempfile::TempDir;
use time::OffsetDateTime;
pub mod errors;
pub mod meta;
pub mod settings;
pub mod updates;
use self::meta::{DumpMeta, IndexUuid};
use super::compat::v2_to_v3::CompatV2ToV3;
use super::Document;
use crate::{IndexMetadata, Result, Version};
pub type Settings<T> = settings::Settings<T>;
pub type Setting<T> = settings::Setting<T>;
pub type Checked = settings::Checked;
pub type Unchecked = settings::Unchecked;
pub type Task = updates::UpdateEntry;
pub type Kind = updates::UpdateMeta;
// everything related to the errors
pub type ResponseError = errors::ResponseError;
#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Metadata {
db_version: String,
index_db_size: usize,
update_db_size: usize,
#[serde(with = "time::serde::rfc3339")]
dump_date: OffsetDateTime,
}
pub struct V2Reader {
dump: TempDir,
metadata: Metadata,
tasks: BufReader<File>,
pub index_uuid: Vec<IndexUuid>,
}
impl V2Reader {
pub fn open(dump: TempDir) -> Result<Self> {
let meta_file = fs::read(dump.path().join("metadata.json"))?;
let metadata = serde_json::from_reader(&*meta_file)?;
let index_uuid = File::open(dump.path().join("index_uuids/data.jsonl"))?;
let index_uuid = BufReader::new(index_uuid);
let index_uuid = index_uuid
.lines()
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) })
.collect::<Result<Vec<_>>>()?;
Ok(V2Reader {
metadata,
tasks: BufReader::new(
File::open(dump.path().join("updates").join("data.jsonl")).unwrap(),
),
index_uuid,
dump,
})
}
pub fn to_v3(self) -> CompatV2ToV3 {
CompatV2ToV3::new(self)
}
pub fn index_uuid(&self) -> Vec<IndexUuid> {
self.index_uuid.clone()
}
pub fn version(&self) -> Version {
Version::V2
}
pub fn date(&self) -> Option<OffsetDateTime> {
Some(self.metadata.dump_date)
}
pub fn indexes(&self) -> Result<impl Iterator<Item = Result<V2IndexReader>> + '_> {
Ok(self.index_uuid.iter().map(|index| -> Result<_> {
V2IndexReader::new(
&self.dump.path().join("indexes").join(format!("index-{}", index.uuid)),
index,
BufReader::new(
File::open(self.dump.path().join("updates").join("data.jsonl")).unwrap(),
),
)
}))
}
pub fn tasks(&mut self) -> Box<dyn Iterator<Item = Result<(Task, Option<UpdateFile>)>> + '_> {
Box::new((&mut self.tasks).lines().map(|line| -> Result<_> {
let task: Task = serde_json::from_str(&line?)?;
if !task.is_finished() {
if let Some(uuid) = task.get_content_uuid() {
let update_file_path = self
.dump
.path()
.join("updates")
.join("update_files")
.join(format!("update_{}", uuid));
Ok((task, Some(UpdateFile::new(&update_file_path)?)))
} else {
Ok((task, None))
}
} else {
Ok((task, None))
}
}))
}
}
pub struct V2IndexReader {
metadata: IndexMetadata,
settings: Settings<Checked>,
documents: BufReader<File>,
}
impl V2IndexReader {
pub fn new(path: &Path, index_uuid: &IndexUuid, tasks: BufReader<File>) -> Result<Self> {
let meta = File::open(path.join("meta.json"))?;
let meta: DumpMeta = serde_json::from_reader(meta)?;
let mut created_at = None;
let mut updated_at = None;
for line in tasks.lines() {
let task: Task = serde_json::from_str(&line?)?;
if !(task.uuid == index_uuid.uuid && task.is_finished()) {
continue;
}
let new_created_at = match task.update.meta() {
Kind::DocumentsAddition { .. } | Kind::Settings(_) => task.update.finished_at(),
_ => None,
};
let new_updated_at = task.update.finished_at();
if created_at.is_none() || created_at > new_created_at {
created_at = new_created_at;
}
if updated_at.is_none() || updated_at < new_updated_at {
updated_at = new_updated_at;
}
}
let current_time = OffsetDateTime::now_utc();
let metadata = IndexMetadata {
uid: index_uuid.uid.clone(),
primary_key: meta.primary_key,
created_at: created_at.unwrap_or(current_time),
updated_at: updated_at.unwrap_or(current_time),
};
let ret = V2IndexReader {
metadata,
settings: meta.settings.check(),
documents: BufReader::new(File::open(path.join("documents.jsonl"))?),
};
Ok(ret)
}
pub fn metadata(&self) -> &IndexMetadata {
&self.metadata
}
pub fn documents(&mut self) -> Result<impl Iterator<Item = Result<Document>> + '_> {
Ok((&mut self.documents)
.lines()
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
}
pub fn settings(&mut self) -> Result<Settings<Checked>> {
Ok(self.settings.clone())
}
}
pub struct UpdateFile {
documents: Vec<Document>,
index: usize,
}
impl UpdateFile {
fn new(path: &Path) -> Result<Self> {
let reader = BufReader::new(File::open(path)?);
Ok(UpdateFile { documents: serde_json::from_reader(reader)?, index: 0 })
}
}
impl Iterator for UpdateFile {
type Item = Result<Document>;
fn next(&mut self) -> Option<Self::Item> {
self.index += 1;
self.documents.get(self.index - 1).cloned().map(Ok)
}
}
#[cfg(test)]
pub(crate) mod test {
use std::fs::File;
use std::io::BufReader;
use flate2::bufread::GzDecoder;
use meili_snap::insta;
use tempfile::TempDir;
use super::*;
#[test]
fn read_dump_v2() {
let dump = File::open("tests/assets/v2.dump").unwrap();
let dir = TempDir::new().unwrap();
let mut dump = BufReader::new(dump);
let gz = GzDecoder::new(&mut dump);
let mut archive = tar::Archive::new(gz);
archive.unpack(dir.path()).unwrap();
let mut dump = V2Reader::open(dir).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();
let (tasks, mut update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"ec5fc0a14bf735ad4e361d5aa8a89ac6");
assert_eq!(update_files.len(), 9);
assert!(update_files[0].is_some()); // the enqueued document addition
assert!(update_files[1..].iter().all(|u| u.is_none())); // everything already processed
let update_file = update_files.remove(0).unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(update_file), @"7b8889539b669c7b9ddba448bafa385d");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies2 = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "2022-10-09T20:27:22.688964637Z",
"updatedAt": "2022-10-09T20:27:23.951017769Z"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "2022-10-09T20:27:22.197788495Z",
"updatedAt": "2022-10-09T20:28:01.93111053Z"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 110);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d153b5a81d8b3cdcbe1dec270b574022");
// movies2
insta::assert_json_snapshot!(movies2.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "movies_2",
"primaryKey": null,
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(movies2.settings().unwrap());
let documents = movies2.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 0);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d751713988987e9331980363e24189ce");
// spells
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "2022-10-09T20:27:24.242683494Z",
"updatedAt": "2022-10-09T20:27:24.312809641Z"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
#[test]
fn read_dump_v2_from_meilisearch_v0_22_0_issue_3435() {
let dump = File::open("tests/assets/v2-v0.22.0.dump").unwrap();
let dir = TempDir::new().unwrap();
let mut dump = BufReader::new(dump);
let gz = GzDecoder::new(&mut dump);
let mut archive = tar::Archive::new(gz);
archive.unpack(dir.path()).unwrap();
let mut dump = V2Reader::open(dir).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2023-01-30 16:26:09.247261 +00:00:00");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"aca8ba13046272664eb3ea2da3031633");
assert_eq!(update_files.len(), 8);
assert!(update_files[0..].iter().all(|u| u.is_none())); // everything has already been processed
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "2023-01-30T16:25:56.595257Z",
"updatedAt": "2023-01-30T16:25:58.70348Z"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "2023-01-30T16:25:56.192178Z",
"updatedAt": "2023-01-30T16:25:56.455714Z"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"0227598af846e574139ee0b80e03a720");
// spells
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "2023-01-30T16:25:58.876405Z",
"updatedAt": "2023-01-30T16:25:59.079906Z"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
}

View file

@ -0,0 +1,269 @@
use std::collections::{BTreeMap, BTreeSet};
use std::fmt;
use std::marker::PhantomData;
use std::str::FromStr;
use serde::{Deserialize, Deserializer};
#[cfg(test)]
fn serialize_with_wildcard<S>(
field: &Setting<Vec<String>>,
s: S,
) -> std::result::Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
use serde::Serialize;
let wildcard = vec!["*".to_string()];
match field {
Setting::Set(value) => Some(value),
Setting::Reset => Some(&wildcard),
Setting::NotSet => None,
}
.serialize(s)
}
#[derive(Clone, Default, Debug)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct Checked;
#[derive(Clone, Default, Debug, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct Unchecked;
#[derive(Debug, Clone, Default, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(deny_unknown_fields)]
#[serde(rename_all = "camelCase")]
#[serde(bound(serialize = "T: serde::Serialize", deserialize = "T: Deserialize<'static>"))]
pub struct Settings<T> {
#[serde(
default,
serialize_with = "serialize_with_wildcard",
skip_serializing_if = "Setting::is_not_set"
)]
pub displayed_attributes: Setting<Vec<String>>,
#[serde(
default,
serialize_with = "serialize_with_wildcard",
skip_serializing_if = "Setting::is_not_set"
)]
pub searchable_attributes: Setting<Vec<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub filterable_attributes: Setting<BTreeSet<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub sortable_attributes: Setting<BTreeSet<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub ranking_rules: Setting<Vec<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub stop_words: Setting<BTreeSet<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub synonyms: Setting<BTreeMap<String, Vec<String>>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub distinct_attribute: Setting<String>,
#[serde(skip)]
pub _kind: PhantomData<T>,
}
impl Settings<Unchecked> {
pub fn check(self) -> Settings<Checked> {
let displayed_attributes = match self.displayed_attributes {
Setting::Set(fields) => {
if fields.iter().any(|f| f == "*") {
Setting::Reset
} else {
Setting::Set(fields)
}
}
otherwise => otherwise,
};
let searchable_attributes = match self.searchable_attributes {
Setting::Set(fields) => {
if fields.iter().any(|f| f == "*") {
Setting::Reset
} else {
Setting::Set(fields)
}
}
otherwise => otherwise,
};
Settings {
displayed_attributes,
searchable_attributes,
filterable_attributes: self.filterable_attributes,
sortable_attributes: self.sortable_attributes,
ranking_rules: self.ranking_rules,
stop_words: self.stop_words,
synonyms: self.synonyms,
distinct_attribute: self.distinct_attribute,
_kind: PhantomData,
}
}
}
#[derive(Debug, Clone, PartialEq)]
pub enum Setting<T> {
Set(T),
Reset,
NotSet,
}
impl<T> Default for Setting<T> {
fn default() -> Self {
Self::NotSet
}
}
impl<T> Setting<T> {
pub const fn is_not_set(&self) -> bool {
matches!(self, Self::NotSet)
}
pub fn map<A>(self, f: fn(T) -> A) -> Setting<A> {
match self {
Setting::Set(a) => Setting::Set(f(a)),
Setting::Reset => Setting::Reset,
Setting::NotSet => Setting::NotSet,
}
}
}
#[cfg(test)]
impl<T: serde::Serialize> serde::Serialize for Setting<T> {
fn serialize<S>(&self, serializer: S) -> std::result::Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
match self {
Self::Set(value) => Some(value),
// Usually not_set isn't serialized by setting skip_serializing_if field attribute
Self::NotSet | Self::Reset => None,
}
.serialize(serializer)
}
}
impl<'de, T: Deserialize<'de>> Deserialize<'de> for Setting<T> {
fn deserialize<D>(deserializer: D) -> std::result::Result<Self, D::Error>
where
D: Deserializer<'de>,
{
Deserialize::deserialize(deserializer).map(|x| match x {
Some(x) => Self::Set(x),
None => Self::Reset, // Reset is forced by sending null value
})
}
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Criterion {
/// Sorted by decreasing number of matched query terms.
/// Query words at the front of an attribute is considered better than if it was at the back.
Words,
/// Sorted by increasing number of typos.
Typo,
/// Sorted by increasing distance between matched query terms.
Proximity,
/// Documents with quey words contained in more important
/// attributes are considered better.
Attribute,
/// Dynamically sort at query time the documents. None, one or multiple Asc/Desc sortable
/// attributes can be used in place of this criterion at query time.
Sort,
/// Sorted by the similarity of the matched words with the query words.
Exactness,
/// Sorted by the increasing value of the field specified.
Asc(String),
/// Sorted by the decreasing value of the field specified.
Desc(String),
}
impl Criterion {
/// Returns the field name parameter of this criterion.
pub fn field_name(&self) -> Option<&str> {
match self {
Criterion::Asc(name) | Criterion::Desc(name) => Some(name),
_otherwise => None,
}
}
}
impl FromStr for Criterion {
// since we're not going to show the custom error message we can override the
// error type.
type Err = ();
fn from_str(text: &str) -> Result<Criterion, Self::Err> {
match text {
"words" => Ok(Criterion::Words),
"typo" => Ok(Criterion::Typo),
"proximity" => Ok(Criterion::Proximity),
"attribute" => Ok(Criterion::Attribute),
"sort" => Ok(Criterion::Sort),
"exactness" => Ok(Criterion::Exactness),
text => match AscDesc::from_str(text) {
Ok(AscDesc::Asc(field)) => Ok(Criterion::Asc(field)),
Ok(AscDesc::Desc(field)) => Ok(Criterion::Desc(field)),
Err(_) => Err(()),
},
}
}
}
#[derive(Debug, Deserialize, Clone, PartialEq, Eq)]
pub enum AscDesc {
Asc(String),
Desc(String),
}
impl FromStr for AscDesc {
type Err = ();
// since we don't know if this comes from the old or new syntax we need to check
// for both syntax.
// WARN: this code doesn't come from the original meilisearch v0.22.0 but was
// written specifically to be able to import the dump of meilisearch v0.21.0 AND
// meilisearch v0.22.0.
fn from_str(text: &str) -> Result<AscDesc, Self::Err> {
if let Some((field_name, asc_desc)) = text.rsplit_once(':') {
match asc_desc {
"asc" => Ok(AscDesc::Asc(field_name.to_string())),
"desc" => Ok(AscDesc::Desc(field_name.to_string())),
_ => Err(()),
}
} else if text.starts_with("asc(") && text.ends_with(')') {
Ok(AscDesc::Asc(
text.strip_prefix("asc(").unwrap().strip_suffix(')').unwrap().to_string(),
))
} else if text.starts_with("desc(") && text.ends_with(')') {
Ok(AscDesc::Desc(
text.strip_prefix("desc(").unwrap().strip_suffix(')').unwrap().to_string(),
))
} else {
Err(())
}
}
}
impl fmt::Display for Criterion {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
use Criterion::*;
match self {
Words => f.write_str("words"),
Typo => f.write_str("typo"),
Proximity => f.write_str("proximity"),
Attribute => f.write_str("attribute"),
Sort => f.write_str("sort"),
Exactness => f.write_str("exactness"),
Asc(attr) => write!(f, "{}:asc", attr),
Desc(attr) => write!(f, "{}:desc", attr),
}
}
}

View file

@ -0,0 +1,23 @@
---
source: dump/src/reader/v2/mod.rs
expression: movies2.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,23 @@
---
source: dump/src/reader/v2/mod.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,37 @@
---
source: dump/src/reader/v2/mod.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View file

@ -0,0 +1,24 @@
---
source: dump/src/reader/v2/mod.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness",
"asc(release_date)"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,25 @@
---
source: dump/src/reader/v2/mod.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,39 @@
---
source: dump/src/reader/v2/mod.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View file

@ -0,0 +1,30 @@
---
source: dump/src/reader/v2/mod.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"release_date"
],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,240 @@
use serde::Deserialize;
use time::OffsetDateTime;
use uuid::Uuid;
use super::{ResponseError, Settings, Unchecked};
#[derive(Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct UpdateEntry {
pub uuid: Uuid,
pub update: UpdateStatus,
}
impl UpdateEntry {
pub fn is_finished(&self) -> bool {
match self.update {
UpdateStatus::Processing(_) | UpdateStatus::Enqueued(_) => false,
UpdateStatus::Processed(_) | UpdateStatus::Aborted(_) | UpdateStatus::Failed(_) => true,
}
}
pub fn get_content_uuid(&self) -> Option<&Uuid> {
match &self.update {
UpdateStatus::Enqueued(enqueued) => enqueued.content.as_ref(),
UpdateStatus::Processing(processing) => processing.from.content.as_ref(),
UpdateStatus::Processed(processed) => processed.from.from.content.as_ref(),
UpdateStatus::Aborted(aborted) => aborted.from.content.as_ref(),
UpdateStatus::Failed(failed) => failed.from.from.content.as_ref(),
}
}
}
#[derive(Debug, Clone, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
pub enum UpdateResult {
DocumentsAddition(DocumentAdditionResult),
DocumentDeletion { deleted: u64 },
Other,
}
#[derive(Debug, Deserialize, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct DocumentAdditionResult {
pub nb_documents: usize,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[non_exhaustive]
pub enum IndexDocumentsMethod {
/// Replace the previous document with the new one,
/// removing all the already known attributes.
ReplaceDocuments,
/// Merge the previous version of the document with the new version,
/// replacing old attributes values with the new ones and add the new attributes.
UpdateDocuments,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[non_exhaustive]
pub enum UpdateFormat {
/// The given update is a real **comma separated** CSV with headers on the first line.
Csv,
/// The given update is a JSON array with documents inside.
Json,
/// The given update is a JSON stream with a document on each line.
JsonStream,
}
#[allow(clippy::large_enum_variant)]
#[derive(Debug, Clone, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(tag = "type")]
pub enum UpdateMeta {
DocumentsAddition {
method: IndexDocumentsMethod,
format: UpdateFormat,
primary_key: Option<String>,
},
ClearDocuments,
DeleteDocuments {
ids: Vec<String>,
},
Settings(Settings<Unchecked>),
}
#[derive(Debug, Deserialize, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(rename_all = "camelCase")]
pub struct Enqueued {
pub update_id: u64,
pub meta: UpdateMeta,
#[serde(with = "time::serde::rfc3339")]
pub enqueued_at: OffsetDateTime,
pub content: Option<Uuid>,
}
impl Enqueued {
pub fn meta(&self) -> &UpdateMeta {
&self.meta
}
pub fn id(&self) -> u64 {
self.update_id
}
}
#[derive(Debug, Deserialize, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(rename_all = "camelCase")]
pub struct Processed {
pub success: UpdateResult,
#[serde(with = "time::serde::rfc3339")]
pub processed_at: OffsetDateTime,
#[serde(flatten)]
pub from: Processing,
}
impl Processed {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
}
#[derive(Debug, Deserialize, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(rename_all = "camelCase")]
pub struct Processing {
#[serde(flatten)]
pub from: Enqueued,
#[serde(with = "time::serde::rfc3339")]
pub started_processing_at: OffsetDateTime,
}
impl Processing {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
}
#[derive(Debug, Deserialize, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(rename_all = "camelCase")]
pub struct Aborted {
#[serde(flatten)]
pub from: Enqueued,
#[serde(with = "time::serde::rfc3339")]
pub aborted_at: OffsetDateTime,
}
impl Aborted {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
}
#[derive(Debug, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(rename_all = "camelCase")]
pub struct Failed {
#[serde(flatten)]
pub from: Processing,
pub error: ResponseError,
#[serde(with = "time::serde::rfc3339")]
pub failed_at: OffsetDateTime,
}
impl Failed {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
}
#[derive(Debug, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(tag = "status", rename_all = "camelCase")]
pub enum UpdateStatus {
Processing(Processing),
Enqueued(Enqueued),
Processed(Processed),
Aborted(Aborted),
Failed(Failed),
}
impl UpdateStatus {
pub fn id(&self) -> u64 {
match self {
UpdateStatus::Processing(u) => u.id(),
UpdateStatus::Enqueued(u) => u.id(),
UpdateStatus::Processed(u) => u.id(),
UpdateStatus::Aborted(u) => u.id(),
UpdateStatus::Failed(u) => u.id(),
}
}
pub fn meta(&self) -> &UpdateMeta {
match self {
UpdateStatus::Processing(u) => u.meta(),
UpdateStatus::Enqueued(u) => u.meta(),
UpdateStatus::Processed(u) => u.meta(),
UpdateStatus::Aborted(u) => u.meta(),
UpdateStatus::Failed(u) => u.meta(),
}
}
pub fn processed(&self) -> Option<&Processed> {
match self {
UpdateStatus::Processed(p) => Some(p),
_ => None,
}
}
pub fn finished_at(&self) -> Option<OffsetDateTime> {
match self {
UpdateStatus::Processing(_) => None,
UpdateStatus::Enqueued(_) => None,
UpdateStatus::Processed(u) => Some(u.processed_at),
UpdateStatus::Aborted(_) => None,
UpdateStatus::Failed(u) => Some(u.failed_at),
}
}
}

View file

@ -0,0 +1,52 @@
use serde::{Deserialize, Serialize};
#[allow(clippy::enum_variant_names)]
#[derive(Serialize, Deserialize, Debug, Clone, Copy)]
pub enum Code {
// index related error
CreateIndex,
IndexAlreadyExists,
IndexNotFound,
InvalidIndexUid,
// invalid state error
InvalidState,
MissingPrimaryKey,
PrimaryKeyAlreadyPresent,
MaxFieldsLimitExceeded,
MissingDocumentId,
InvalidDocumentId,
Filter,
Sort,
BadParameter,
BadRequest,
DatabaseSizeLimitReached,
DocumentNotFound,
Internal,
InvalidGeoField,
InvalidRankingRule,
InvalidStore,
InvalidToken,
MissingAuthorizationHeader,
NoSpaceLeftOnDevice,
DumpNotFound,
TaskNotFound,
PayloadTooLarge,
RetrieveDocument,
SearchDocuments,
UnsupportedMediaType,
DumpAlreadyInProgress,
DumpProcessFailed,
InvalidContentType,
MissingContentType,
MalformedPayload,
MissingPayload,
MalformedDump,
UnretrievableErrorCode,
}

View file

@ -0,0 +1,18 @@
use serde::Deserialize;
use uuid::Uuid;
use super::Settings;
#[derive(Deserialize, Debug, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct IndexUuid {
pub uid: String,
pub uuid: Uuid,
}
#[derive(Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct DumpMeta {
pub settings: Settings<super::Unchecked>,
pub primary_key: Option<String>,
}

View file

@ -0,0 +1,354 @@
//! ```text
//! .
//! ├── indexes
//! │   ├── 01d7dd17-8241-4f1f-a7d1-2d1cb255f5b0
//! │   │   ├── documents.jsonl
//! │   │   └── meta.json
//! │   ├── 78be64a3-cae1-449e-b7ed-13e77c9a8a0c
//! │   │   ├── documents.jsonl
//! │   │   └── meta.json
//! │   ├── ba553439-18fe-4733-ba53-44eed898280c
//! │   │   ├── documents.jsonl
//! │   │   └── meta.json
//! │   └── c408bc22-5859-49d1-8e9f-c88e2fa95cb0
//! │   ├── documents.jsonl
//! │   └── meta.json
//! ├── index_uuids
//! │   └── data.jsonl
//! ├── metadata.json
//! └── updates
//! ├── data.jsonl
//! └── updates_files
//! └── 66d3f12d-fcf3-4b53-88cb-407017373de7
//! ```
use std::fs::{self, File};
use std::io::{BufRead, BufReader};
use std::path::Path;
use serde::{Deserialize, Serialize};
use tempfile::TempDir;
use time::OffsetDateTime;
pub mod errors;
pub mod meta;
pub mod settings;
pub mod updates;
use self::meta::{DumpMeta, IndexUuid};
use super::compat::v3_to_v4::CompatV3ToV4;
use super::Document;
use crate::{Error, IndexMetadata, Result, Version};
pub type Settings<T> = settings::Settings<T>;
pub type Checked = settings::Checked;
pub type Unchecked = settings::Unchecked;
pub type Task = updates::UpdateEntry;
// ===== Other types to clarify the code of the compat module
// everything related to the tasks
pub type Status = updates::UpdateStatus;
pub type Kind = updates::Update;
// everything related to the settings
pub type Setting<T> = settings::Setting<T>;
// everything related to the errors
pub type Code = errors::Code;
#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Metadata {
db_version: String,
index_db_size: usize,
update_db_size: usize,
#[serde(with = "time::serde::rfc3339")]
dump_date: OffsetDateTime,
}
pub struct V3Reader {
dump: TempDir,
metadata: Metadata,
tasks: BufReader<File>,
index_uuid: Vec<IndexUuid>,
}
impl V3Reader {
pub fn open(dump: TempDir) -> Result<Self> {
let meta_file = fs::read(dump.path().join("metadata.json"))?;
let metadata = serde_json::from_reader(&*meta_file)?;
let index_uuid = File::open(dump.path().join("index_uuids/data.jsonl"))?;
let index_uuid = BufReader::new(index_uuid);
let index_uuid = index_uuid
.lines()
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) })
.collect::<Result<Vec<_>>>()?;
Ok(V3Reader {
metadata,
tasks: BufReader::new(File::open(dump.path().join("updates").join("data.jsonl"))?),
index_uuid,
dump,
})
}
pub fn index_uuid(&self) -> Vec<IndexUuid> {
self.index_uuid.clone()
}
pub fn to_v4(self) -> CompatV3ToV4 {
CompatV3ToV4::new(self)
}
pub fn version(&self) -> Version {
Version::V3
}
pub fn date(&self) -> Option<OffsetDateTime> {
Some(self.metadata.dump_date)
}
pub fn indexes(&self) -> Result<impl Iterator<Item = Result<V3IndexReader>> + '_> {
Ok(self.index_uuid.iter().map(|index| -> Result<_> {
V3IndexReader::new(
&self.dump.path().join("indexes").join(index.uuid.to_string()),
index,
BufReader::new(
File::open(self.dump.path().join("updates").join("data.jsonl")).unwrap(),
),
)
}))
}
pub fn tasks(
&mut self,
) -> Box<dyn Iterator<Item = Result<(Task, Option<Box<super::UpdateFile>>)>> + '_> {
Box::new((&mut self.tasks).lines().map(|line| -> Result<_> {
let task: Task = serde_json::from_str(&line?)?;
if !task.is_finished() {
if let Some(uuid) = task.get_content_uuid() {
let update_file_path = self
.dump
.path()
.join("updates")
.join("updates_files")
.join(uuid.to_string());
Ok((
task,
Some(
Box::new(UpdateFile::new(&update_file_path)?) as Box<super::UpdateFile>
),
))
} else {
Ok((task, None))
}
} else {
Ok((task, None))
}
}))
}
}
pub struct V3IndexReader {
metadata: IndexMetadata,
settings: Settings<Checked>,
documents: BufReader<File>,
}
impl V3IndexReader {
pub fn new(path: &Path, index_uuid: &IndexUuid, tasks: BufReader<File>) -> Result<Self> {
let meta = File::open(path.join("meta.json"))?;
let meta: DumpMeta = serde_json::from_reader(meta)?;
let mut created_at = None;
let mut updated_at = None;
for line in tasks.lines() {
let task: Task = serde_json::from_str(&line?)?;
if !(task.uuid == index_uuid.uuid && task.is_finished()) {
continue;
}
let new_created_at = match task.update.meta() {
Kind::DocumentAddition { .. } | Kind::Settings(_) => task.update.finished_at(),
_ => None,
};
let new_updated_at = task.update.finished_at();
if created_at.is_none() || created_at > new_created_at {
created_at = new_created_at;
}
if updated_at.is_none() || updated_at < new_updated_at {
updated_at = new_updated_at;
}
}
let current_time = OffsetDateTime::now_utc();
let metadata = IndexMetadata {
uid: index_uuid.uid.clone(),
primary_key: meta.primary_key,
created_at: created_at.unwrap_or(current_time),
updated_at: updated_at.unwrap_or(current_time),
};
let ret = V3IndexReader {
metadata,
settings: meta.settings.check(),
documents: BufReader::new(File::open(path.join("documents.jsonl"))?),
};
Ok(ret)
}
pub fn metadata(&self) -> &IndexMetadata {
&self.metadata
}
pub fn documents(&mut self) -> Result<impl Iterator<Item = Result<Document>> + '_> {
Ok((&mut self.documents)
.lines()
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
}
pub fn settings(&mut self) -> Result<Settings<Checked>> {
Ok(self.settings.clone())
}
}
pub struct UpdateFile {
reader: BufReader<File>,
}
impl UpdateFile {
fn new(path: &Path) -> Result<Self> {
Ok(UpdateFile { reader: BufReader::new(File::open(path)?) })
}
}
impl Iterator for UpdateFile {
type Item = Result<Document>;
fn next(&mut self) -> Option<Self::Item> {
(&mut self.reader)
.lines()
.map(|line| {
line.map_err(Error::from)
.and_then(|line| serde_json::from_str(&line).map_err(Error::from))
})
.next()
}
}
#[cfg(test)]
pub(crate) mod test {
use std::fs::File;
use std::io::BufReader;
use flate2::bufread::GzDecoder;
use meili_snap::insta;
use tempfile::TempDir;
use super::*;
#[test]
fn read_dump_v3() {
let dump = File::open("tests/assets/v3.dump").unwrap();
let dir = TempDir::new().unwrap();
let mut dump = BufReader::new(dump);
let gz = GzDecoder::new(&mut dump);
let mut archive = tar::Archive::new(gz);
archive.unpack(dir.path()).unwrap();
let mut dump = V3Reader::open(dir).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();
let (tasks, mut update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"63086d59c3f2074e4ab3fff7e8cc36c1");
assert_eq!(update_files.len(), 10);
assert!(update_files[0].is_some()); // the enqueued document addition
assert!(update_files[1..].iter().all(|u| u.is_none())); // everything already processed
let update_file = update_files.remove(0).unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(update_file), @"7b8889539b669c7b9ddba448bafa385d");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies2 = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "2022-10-07T11:38:54.74389899Z",
"updatedAt": "2022-10-07T11:38:55.963185778Z"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "2022-10-07T11:38:54.026649575Z",
"updatedAt": "2022-10-07T11:39:04.188852537Z"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 110);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d153b5a81d8b3cdcbe1dec270b574022");
// movies2
insta::assert_json_snapshot!(movies2.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
{
"uid": "movies_2",
"primaryKey": null,
"createdAt": "[now]",
"updatedAt": "[now]"
}
"###);
insta::assert_json_snapshot!(movies2.settings().unwrap());
let documents = movies2.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 0);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d751713988987e9331980363e24189ce");
// spells
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "2022-10-07T11:38:56.265951133Z",
"updatedAt": "2022-10-07T11:38:56.521004328Z"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
}

View file

@ -0,0 +1,234 @@
use std::collections::{BTreeMap, BTreeSet};
use std::marker::PhantomData;
use std::num::NonZeroUsize;
use serde::{Deserialize, Deserializer};
#[cfg(test)]
fn serialize_with_wildcard<S>(
field: &Setting<Vec<String>>,
s: S,
) -> std::result::Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
use serde::Serialize;
let wildcard = vec!["*".to_string()];
match field {
Setting::Set(value) => Some(value),
Setting::Reset => Some(&wildcard),
Setting::NotSet => None,
}
.serialize(s)
}
#[derive(Clone, Default, Debug)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct Checked;
#[derive(Clone, Default, Debug, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct Unchecked;
/// Holds all the settings for an index. `T` can either be `Checked` if they represents settings
/// whose validity is guaranteed, or `Unchecked` if they need to be validated. In the later case, a
/// call to `check` will return a `Settings<Checked>` from a `Settings<Unchecked>`.
#[derive(Debug, Clone, Default, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(deny_unknown_fields)]
#[serde(rename_all = "camelCase")]
#[serde(bound(serialize = "T: serde::Serialize", deserialize = "T: Deserialize<'static>"))]
pub struct Settings<T> {
#[serde(
default,
serialize_with = "serialize_with_wildcard",
skip_serializing_if = "Setting::is_not_set"
)]
pub displayed_attributes: Setting<Vec<String>>,
#[serde(
default,
serialize_with = "serialize_with_wildcard",
skip_serializing_if = "Setting::is_not_set"
)]
pub searchable_attributes: Setting<Vec<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub filterable_attributes: Setting<BTreeSet<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub sortable_attributes: Setting<BTreeSet<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub ranking_rules: Setting<Vec<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub stop_words: Setting<BTreeSet<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub synonyms: Setting<BTreeMap<String, Vec<String>>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub distinct_attribute: Setting<String>,
#[serde(skip)]
pub _kind: PhantomData<T>,
}
impl Settings<Checked> {
pub fn cleared() -> Settings<Checked> {
Settings {
displayed_attributes: Setting::Reset,
searchable_attributes: Setting::Reset,
filterable_attributes: Setting::Reset,
sortable_attributes: Setting::Reset,
ranking_rules: Setting::Reset,
stop_words: Setting::Reset,
synonyms: Setting::Reset,
distinct_attribute: Setting::Reset,
_kind: PhantomData,
}
}
pub fn into_unchecked(self) -> Settings<Unchecked> {
let Self {
displayed_attributes,
searchable_attributes,
filterable_attributes,
sortable_attributes,
ranking_rules,
stop_words,
synonyms,
distinct_attribute,
..
} = self;
Settings {
displayed_attributes,
searchable_attributes,
filterable_attributes,
sortable_attributes,
ranking_rules,
stop_words,
synonyms,
distinct_attribute,
_kind: PhantomData,
}
}
}
impl Settings<Unchecked> {
pub fn check(self) -> Settings<Checked> {
let displayed_attributes = match self.displayed_attributes {
Setting::Set(fields) => {
if fields.iter().any(|f| f == "*") {
Setting::Reset
} else {
Setting::Set(fields)
}
}
otherwise => otherwise,
};
let searchable_attributes = match self.searchable_attributes {
Setting::Set(fields) => {
if fields.iter().any(|f| f == "*") {
Setting::Reset
} else {
Setting::Set(fields)
}
}
otherwise => otherwise,
};
Settings {
displayed_attributes,
searchable_attributes,
filterable_attributes: self.filterable_attributes,
sortable_attributes: self.sortable_attributes,
ranking_rules: self.ranking_rules,
stop_words: self.stop_words,
synonyms: self.synonyms,
distinct_attribute: self.distinct_attribute,
_kind: PhantomData,
}
}
}
#[derive(Debug, Clone, Deserialize)]
#[allow(dead_code)] // otherwise rustc complains that the fields go unused
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(deny_unknown_fields)]
#[serde(rename_all = "camelCase")]
pub struct Facets {
pub level_group_size: Option<NonZeroUsize>,
pub min_level_size: Option<NonZeroUsize>,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Setting<T> {
Set(T),
Reset,
NotSet,
}
impl<T> Default for Setting<T> {
fn default() -> Self {
Self::NotSet
}
}
impl<T> Setting<T> {
pub fn map<U, F>(self, f: F) -> Setting<U>
where
F: FnOnce(T) -> U,
{
match self {
Setting::Set(t) => Setting::Set(f(t)),
Setting::Reset => Setting::Reset,
Setting::NotSet => Setting::NotSet,
}
}
pub fn set(self) -> Option<T> {
match self {
Self::Set(value) => Some(value),
_ => None,
}
}
pub const fn as_ref(&self) -> Setting<&T> {
match *self {
Self::Set(ref value) => Setting::Set(value),
Self::Reset => Setting::Reset,
Self::NotSet => Setting::NotSet,
}
}
pub const fn is_not_set(&self) -> bool {
matches!(self, Self::NotSet)
}
}
#[cfg(test)]
impl<T: serde::Serialize> serde::Serialize for Setting<T> {
fn serialize<S>(&self, serializer: S) -> std::result::Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
match self {
Self::Set(value) => Some(value),
// Usually not_set isn't serialized by setting skip_serializing_if field attribute
Self::NotSet | Self::Reset => None,
}
.serialize(serializer)
}
}
impl<'de, T: Deserialize<'de>> Deserialize<'de> for Setting<T> {
fn deserialize<D>(deserializer: D) -> std::result::Result<Self, D::Error>
where
D: Deserializer<'de>,
{
Deserialize::deserialize(deserializer).map(|x| match x {
Some(x) => Self::Set(x),
None => Self::Reset, // Reset is forced by sending null value
})
}
}

View file

@ -0,0 +1,25 @@
---
source: dump/src/reader/v3/mod.rs
expression: movies2.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,25 @@
---
source: dump/src/reader/v3/mod.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,39 @@
---
source: dump/src/reader/v3/mod.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View file

@ -0,0 +1,31 @@
---
source: dump/src/reader/v3/mod.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"release_date"
],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View file

@ -0,0 +1,247 @@
use std::fmt::Display;
use serde::Deserialize;
use time::OffsetDateTime;
use uuid::Uuid;
use super::{Code, Settings, Unchecked};
#[derive(Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct UpdateEntry {
pub uuid: Uuid,
pub update: UpdateStatus,
}
impl UpdateEntry {
pub fn is_finished(&self) -> bool {
match self.update {
UpdateStatus::Processed(_) | UpdateStatus::Aborted(_) | UpdateStatus::Failed(_) => true,
UpdateStatus::Processing(_) | UpdateStatus::Enqueued(_) => false,
}
}
pub fn get_content_uuid(&self) -> Option<&Uuid> {
match self.update.meta() {
Update::DocumentAddition { content_uuid, .. } => Some(content_uuid),
Update::DeleteDocuments(_) | Update::Settings(_) | Update::ClearDocuments => None,
}
}
}
#[derive(Debug, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(tag = "status", rename_all = "camelCase")]
pub enum UpdateStatus {
Processing(Processing),
Enqueued(Enqueued),
Processed(Processed),
Aborted(Aborted),
Failed(Failed),
}
impl UpdateStatus {
pub fn id(&self) -> u64 {
match self {
UpdateStatus::Processing(u) => u.id(),
UpdateStatus::Enqueued(u) => u.id(),
UpdateStatus::Processed(u) => u.id(),
UpdateStatus::Aborted(u) => u.id(),
UpdateStatus::Failed(u) => u.id(),
}
}
pub fn meta(&self) -> &Update {
match self {
UpdateStatus::Processing(u) => u.meta(),
UpdateStatus::Enqueued(u) => u.meta(),
UpdateStatus::Processed(u) => u.meta(),
UpdateStatus::Aborted(u) => u.meta(),
UpdateStatus::Failed(u) => u.meta(),
}
}
pub fn is_finished(&self) -> bool {
match self {
UpdateStatus::Processing(_) | UpdateStatus::Enqueued(_) => false,
UpdateStatus::Aborted(_) | UpdateStatus::Failed(_) | UpdateStatus::Processed(_) => true,
}
}
pub fn processed(&self) -> Option<&Processed> {
match self {
UpdateStatus::Processed(p) => Some(p),
_ => None,
}
}
pub fn enqueued_at(&self) -> Option<OffsetDateTime> {
match self {
UpdateStatus::Processing(u) => Some(u.from.enqueued_at),
UpdateStatus::Enqueued(u) => Some(u.enqueued_at),
UpdateStatus::Processed(u) => Some(u.from.from.enqueued_at),
UpdateStatus::Aborted(u) => Some(u.from.enqueued_at),
UpdateStatus::Failed(u) => Some(u.from.from.enqueued_at),
}
}
pub fn finished_at(&self) -> Option<OffsetDateTime> {
match self {
UpdateStatus::Processing(_) => None,
UpdateStatus::Enqueued(_) => None,
UpdateStatus::Processed(u) => Some(u.processed_at),
UpdateStatus::Aborted(_) => None,
UpdateStatus::Failed(u) => Some(u.failed_at),
}
}
}
#[derive(Debug, Deserialize, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(rename_all = "camelCase")]
pub struct Enqueued {
pub update_id: u64,
pub meta: Update,
#[serde(with = "time::serde::rfc3339")]
pub enqueued_at: OffsetDateTime,
}
impl Enqueued {
pub fn meta(&self) -> &Update {
&self.meta
}
pub fn id(&self) -> u64 {
self.update_id
}
}
#[derive(Debug, Deserialize, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(rename_all = "camelCase")]
pub struct Processed {
pub success: UpdateResult,
#[serde(with = "time::serde::rfc3339")]
pub processed_at: OffsetDateTime,
#[serde(flatten)]
pub from: Processing,
}
impl Processed {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &Update {
self.from.meta()
}
}
#[derive(Debug, Deserialize, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(rename_all = "camelCase")]
pub struct Processing {
#[serde(flatten)]
pub from: Enqueued,
#[serde(with = "time::serde::rfc3339")]
pub started_processing_at: OffsetDateTime,
}
impl Processing {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &Update {
self.from.meta()
}
}
#[derive(Debug, Deserialize, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(rename_all = "camelCase")]
pub struct Aborted {
#[serde(flatten)]
pub from: Enqueued,
#[serde(with = "time::serde::rfc3339")]
pub aborted_at: OffsetDateTime,
}
impl Aborted {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &Update {
self.from.meta()
}
}
#[derive(Debug, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(rename_all = "camelCase")]
pub struct Failed {
#[serde(flatten)]
pub from: Processing,
pub msg: String,
pub code: Code,
#[serde(with = "time::serde::rfc3339")]
pub failed_at: OffsetDateTime,
}
impl Display for Failed {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
self.msg.fmt(f)
}
}
impl Failed {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &Update {
self.from.meta()
}
}
#[allow(clippy::large_enum_variant)]
#[derive(Debug, Clone, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
pub enum Update {
DeleteDocuments(Vec<String>),
DocumentAddition {
primary_key: Option<String>,
method: IndexDocumentsMethod,
content_uuid: Uuid,
},
Settings(Settings<Unchecked>),
ClearDocuments,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[non_exhaustive]
pub enum IndexDocumentsMethod {
/// Replace the previous document with the new one,
/// removing all the already known attributes.
ReplaceDocuments,
/// Merge the previous version of the document with the new version,
/// replacing old attributes values with the new ones and add the new attributes.
UpdateDocuments,
}
#[derive(Debug, Clone, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
pub enum UpdateResult {
DocumentsAddition(DocumentAdditionResult),
DocumentDeletion { deleted: u64 },
Other,
}
#[derive(Debug, Deserialize, Clone)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct DocumentAdditionResult {
pub nb_documents: usize,
}

View file

@ -0,0 +1,310 @@
use std::fmt;
use http::StatusCode;
use serde::{Deserialize, Serialize};
#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct ResponseError {
#[serde(skip)]
pub code: StatusCode,
pub message: String,
#[serde(rename = "code")]
pub error_code: String,
#[serde(rename = "type")]
pub error_type: String,
#[serde(rename = "link")]
pub error_link: String,
}
impl ResponseError {
pub fn from_msg(message: String, code: Code) -> Self {
Self {
code: code.http(),
message,
error_code: code.err_code().error_name.to_string(),
error_type: code.type_(),
error_link: code.url(),
}
}
}
impl fmt::Display for ResponseError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
self.message.fmt(f)
}
}
impl std::error::Error for ResponseError {}
impl<T> From<T> for ResponseError
where
T: ErrorCode,
{
fn from(other: T) -> Self {
Self {
code: other.http_status(),
message: other.to_string(),
error_code: other.error_name(),
error_type: other.error_type(),
error_link: other.error_url(),
}
}
}
pub trait ErrorCode: std::error::Error {
fn error_code(&self) -> Code;
/// returns the HTTP status code ascociated with the error
fn http_status(&self) -> StatusCode {
self.error_code().http()
}
/// returns the doc url ascociated with the error
fn error_url(&self) -> String {
self.error_code().url()
}
/// returns error name, used as error code
fn error_name(&self) -> String {
self.error_code().name()
}
/// return the error type
fn error_type(&self) -> String {
self.error_code().type_()
}
}
#[allow(clippy::enum_variant_names)]
enum ErrorType {
InternalError,
InvalidRequestError,
AuthenticationError,
}
impl fmt::Display for ErrorType {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
use ErrorType::*;
match self {
InternalError => write!(f, "internal"),
InvalidRequestError => write!(f, "invalid_request"),
AuthenticationError => write!(f, "auth"),
}
}
}
#[allow(clippy::enum_variant_names)]
#[derive(Serialize, Deserialize, Debug, Clone, Copy)]
pub enum Code {
// index related error
CreateIndex,
IndexAlreadyExists,
IndexNotFound,
InvalidIndexUid,
InvalidMinWordLengthForTypo,
// invalid state error
InvalidState,
MissingPrimaryKey,
PrimaryKeyAlreadyPresent,
MaxFieldsLimitExceeded,
MissingDocumentId,
InvalidDocumentId,
Filter,
Sort,
BadParameter,
BadRequest,
DatabaseSizeLimitReached,
DocumentNotFound,
Internal,
InvalidGeoField,
InvalidRankingRule,
InvalidStore,
InvalidToken,
MissingAuthorizationHeader,
NoSpaceLeftOnDevice,
DumpNotFound,
TaskNotFound,
PayloadTooLarge,
RetrieveDocument,
SearchDocuments,
UnsupportedMediaType,
DumpAlreadyInProgress,
DumpProcessFailed,
InvalidContentType,
MissingContentType,
MalformedPayload,
MissingPayload,
ApiKeyNotFound,
MissingParameter,
InvalidApiKeyActions,
InvalidApiKeyIndexes,
InvalidApiKeyExpiresAt,
InvalidApiKeyDescription,
UnretrievableErrorCode,
MalformedDump,
}
impl Code {
/// ascociate a `Code` variant to the actual ErrCode
fn err_code(&self) -> ErrCode {
use Code::*;
match self {
// index related errors
// create index is thrown on internal error while creating an index.
CreateIndex => {
ErrCode::internal("index_creation_failed", StatusCode::INTERNAL_SERVER_ERROR)
}
IndexAlreadyExists => ErrCode::invalid("index_already_exists", StatusCode::CONFLICT),
// thrown when requesting an unexisting index
IndexNotFound => ErrCode::invalid("index_not_found", StatusCode::NOT_FOUND),
InvalidIndexUid => ErrCode::invalid("invalid_index_uid", StatusCode::BAD_REQUEST),
// invalid state error
InvalidState => ErrCode::internal("invalid_state", StatusCode::INTERNAL_SERVER_ERROR),
// thrown when no primary key has been set
MissingPrimaryKey => {
ErrCode::invalid("primary_key_inference_failed", StatusCode::BAD_REQUEST)
}
// error thrown when trying to set an already existing primary key
PrimaryKeyAlreadyPresent => {
ErrCode::invalid("index_primary_key_already_exists", StatusCode::BAD_REQUEST)
}
// invalid ranking rule
InvalidRankingRule => ErrCode::invalid("invalid_ranking_rule", StatusCode::BAD_REQUEST),
// invalid database
InvalidStore => {
ErrCode::internal("invalid_store_file", StatusCode::INTERNAL_SERVER_ERROR)
}
// invalid document
MaxFieldsLimitExceeded => {
ErrCode::invalid("max_fields_limit_exceeded", StatusCode::BAD_REQUEST)
}
MissingDocumentId => ErrCode::invalid("missing_document_id", StatusCode::BAD_REQUEST),
InvalidDocumentId => ErrCode::invalid("invalid_document_id", StatusCode::BAD_REQUEST),
// error related to filters
Filter => ErrCode::invalid("invalid_filter", StatusCode::BAD_REQUEST),
// error related to sorts
Sort => ErrCode::invalid("invalid_sort", StatusCode::BAD_REQUEST),
BadParameter => ErrCode::invalid("bad_parameter", StatusCode::BAD_REQUEST),
BadRequest => ErrCode::invalid("bad_request", StatusCode::BAD_REQUEST),
DatabaseSizeLimitReached => {
ErrCode::internal("database_size_limit_reached", StatusCode::INTERNAL_SERVER_ERROR)
}
DocumentNotFound => ErrCode::invalid("document_not_found", StatusCode::NOT_FOUND),
Internal => ErrCode::internal("internal", StatusCode::INTERNAL_SERVER_ERROR),
InvalidGeoField => ErrCode::invalid("invalid_geo_field", StatusCode::BAD_REQUEST),
InvalidToken => ErrCode::authentication("invalid_api_key", StatusCode::FORBIDDEN),
MissingAuthorizationHeader => {
ErrCode::authentication("missing_authorization_header", StatusCode::UNAUTHORIZED)
}
TaskNotFound => ErrCode::invalid("task_not_found", StatusCode::NOT_FOUND),
DumpNotFound => ErrCode::invalid("dump_not_found", StatusCode::NOT_FOUND),
NoSpaceLeftOnDevice => {
ErrCode::internal("no_space_left_on_device", StatusCode::INTERNAL_SERVER_ERROR)
}
PayloadTooLarge => ErrCode::invalid("payload_too_large", StatusCode::PAYLOAD_TOO_LARGE),
RetrieveDocument => {
ErrCode::internal("unretrievable_document", StatusCode::BAD_REQUEST)
}
SearchDocuments => ErrCode::internal("search_error", StatusCode::BAD_REQUEST),
UnsupportedMediaType => {
ErrCode::invalid("unsupported_media_type", StatusCode::UNSUPPORTED_MEDIA_TYPE)
}
// error related to dump
DumpAlreadyInProgress => {
ErrCode::invalid("dump_already_processing", StatusCode::CONFLICT)
}
DumpProcessFailed => {
ErrCode::internal("dump_process_failed", StatusCode::INTERNAL_SERVER_ERROR)
}
MissingContentType => {
ErrCode::invalid("missing_content_type", StatusCode::UNSUPPORTED_MEDIA_TYPE)
}
MalformedPayload => ErrCode::invalid("malformed_payload", StatusCode::BAD_REQUEST),
InvalidContentType => {
ErrCode::invalid("invalid_content_type", StatusCode::UNSUPPORTED_MEDIA_TYPE)
}
MissingPayload => ErrCode::invalid("missing_payload", StatusCode::BAD_REQUEST),
// error related to keys
ApiKeyNotFound => ErrCode::invalid("api_key_not_found", StatusCode::NOT_FOUND),
MissingParameter => ErrCode::invalid("missing_parameter", StatusCode::BAD_REQUEST),
InvalidApiKeyActions => {
ErrCode::invalid("invalid_api_key_actions", StatusCode::BAD_REQUEST)
}
InvalidApiKeyIndexes => {
ErrCode::invalid("invalid_api_key_indexes", StatusCode::BAD_REQUEST)
}
InvalidApiKeyExpiresAt => {
ErrCode::invalid("invalid_api_key_expires_at", StatusCode::BAD_REQUEST)
}
InvalidApiKeyDescription => {
ErrCode::invalid("invalid_api_key_description", StatusCode::BAD_REQUEST)
}
InvalidMinWordLengthForTypo => {
ErrCode::invalid("invalid_min_word_length_for_typo", StatusCode::BAD_REQUEST)
}
UnretrievableErrorCode => {
ErrCode::invalid("unretrievable_error_code", StatusCode::BAD_REQUEST)
}
MalformedDump => ErrCode::invalid("malformed_dump", StatusCode::BAD_REQUEST),
}
}
/// return the HTTP status code ascociated with the `Code`
fn http(&self) -> StatusCode {
self.err_code().status_code
}
/// return error name, used as error code
fn name(&self) -> String {
self.err_code().error_name.to_string()
}
/// return the error type
fn type_(&self) -> String {
self.err_code().error_type.to_string()
}
/// return the doc url ascociated with the error
fn url(&self) -> String {
format!("https://docs.meilisearch.com/errors#{}", self.name())
}
}
/// Internal structure providing a convenient way to create error codes
struct ErrCode {
status_code: StatusCode,
error_type: ErrorType,
error_name: &'static str,
}
impl ErrCode {
fn authentication(error_name: &'static str, status_code: StatusCode) -> ErrCode {
ErrCode { status_code, error_name, error_type: ErrorType::AuthenticationError }
}
fn internal(error_name: &'static str, status_code: StatusCode) -> ErrCode {
ErrCode { status_code, error_name, error_type: ErrorType::InternalError }
}
fn invalid(error_name: &'static str, status_code: StatusCode) -> ErrCode {
ErrCode { status_code, error_name, error_type: ErrorType::InvalidRequestError }
}
}

View file

@ -0,0 +1,77 @@
use serde::Deserialize;
use time::OffsetDateTime;
pub const KEY_ID_LENGTH: usize = 8;
pub type KeyId = [u8; KEY_ID_LENGTH];
#[derive(Debug, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct Key {
#[serde(skip_serializing_if = "Option::is_none")]
pub description: Option<String>,
pub id: KeyId,
pub actions: Vec<Action>,
pub indexes: Vec<String>,
#[serde(with = "time::serde::rfc3339::option")]
pub expires_at: Option<OffsetDateTime>,
#[serde(with = "time::serde::rfc3339")]
pub created_at: OffsetDateTime,
#[serde(with = "time::serde::rfc3339")]
pub updated_at: OffsetDateTime,
}
#[derive(Copy, Clone, Deserialize, Debug, Eq, PartialEq)]
#[cfg_attr(test, derive(serde::Serialize))]
#[repr(u8)]
pub enum Action {
#[serde(rename = "*")]
All = 0,
#[serde(rename = "search")]
Search = actions::SEARCH,
#[serde(rename = "documents.add")]
DocumentsAdd = actions::DOCUMENTS_ADD,
#[serde(rename = "documents.get")]
DocumentsGet = actions::DOCUMENTS_GET,
#[serde(rename = "documents.delete")]
DocumentsDelete = actions::DOCUMENTS_DELETE,
#[serde(rename = "indexes.create")]
IndexesAdd = actions::INDEXES_CREATE,
#[serde(rename = "indexes.get")]
IndexesGet = actions::INDEXES_GET,
#[serde(rename = "indexes.update")]
IndexesUpdate = actions::INDEXES_UPDATE,
#[serde(rename = "indexes.delete")]
IndexesDelete = actions::INDEXES_DELETE,
#[serde(rename = "tasks.get")]
TasksGet = actions::TASKS_GET,
#[serde(rename = "settings.get")]
SettingsGet = actions::SETTINGS_GET,
#[serde(rename = "settings.update")]
SettingsUpdate = actions::SETTINGS_UPDATE,
#[serde(rename = "stats.get")]
StatsGet = actions::STATS_GET,
#[serde(rename = "dumps.create")]
DumpsCreate = actions::DUMPS_CREATE,
#[serde(rename = "dumps.get")]
DumpsGet = actions::DUMPS_GET,
#[serde(rename = "version")]
Version = actions::VERSION,
}
pub mod actions {
pub const SEARCH: u8 = 1;
pub const DOCUMENTS_ADD: u8 = 2;
pub const DOCUMENTS_GET: u8 = 3;
pub const DOCUMENTS_DELETE: u8 = 4;
pub const INDEXES_CREATE: u8 = 5;
pub const INDEXES_GET: u8 = 6;
pub const INDEXES_UPDATE: u8 = 7;
pub const INDEXES_DELETE: u8 = 8;
pub const TASKS_GET: u8 = 9;
pub const SETTINGS_GET: u8 = 10;
pub const SETTINGS_UPDATE: u8 = 11;
pub const STATS_GET: u8 = 12;
pub const DUMPS_CREATE: u8 = 13;
pub const DUMPS_GET: u8 = 14;
pub const VERSION: u8 = 15;
}

View file

@ -0,0 +1,140 @@
use std::fmt::{self, Display, Formatter};
use std::marker::PhantomData;
use std::str::FromStr;
use serde::de::Visitor;
use serde::{Deserialize, Deserializer};
use uuid::Uuid;
use super::settings::{Settings, Unchecked};
#[derive(Deserialize, Debug)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct IndexUuid {
pub uid: String,
pub index_meta: IndexMeta,
}
#[derive(Deserialize, Debug)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct IndexMeta {
pub uuid: Uuid,
pub creation_task_id: usize,
}
// There is one in each indexes under `meta.json`.
#[derive(Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct DumpMeta {
pub settings: Settings<Unchecked>,
pub primary_key: Option<String>,
}
#[derive(Deserialize, Debug, Clone, PartialEq, Eq)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct IndexUid(pub String);
impl TryFrom<String> for IndexUid {
type Error = IndexUidFormatError;
fn try_from(uid: String) -> Result<Self, Self::Error> {
if !uid.chars().all(|x| x.is_ascii_alphanumeric() || x == '-' || x == '_')
|| uid.is_empty()
|| uid.len() > 400
{
Err(IndexUidFormatError { invalid_uid: uid })
} else {
Ok(IndexUid(uid))
}
}
}
impl FromStr for IndexUid {
type Err = IndexUidFormatError;
fn from_str(uid: &str) -> Result<IndexUid, IndexUidFormatError> {
uid.to_string().try_into()
}
}
impl From<IndexUid> for String {
fn from(uid: IndexUid) -> Self {
uid.into_inner()
}
}
#[derive(Debug)]
pub struct IndexUidFormatError {
pub invalid_uid: String,
}
impl Display for IndexUidFormatError {
fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
write!(
f,
"invalid index uid `{}`, the uid must be an integer \
or a string containing only alphanumeric characters \
a-z A-Z 0-9, hyphens - and underscores _, \
and can not be more than 400 bytes.",
self.invalid_uid,
)
}
}
impl std::error::Error for IndexUidFormatError {}
/// A type that tries to match either a star (*) or
/// any other thing that implements `FromStr`.
#[derive(Debug)]
#[cfg_attr(test, derive(serde::Serialize))]
pub enum StarOr<T> {
Star,
Other(T),
}
impl<'de, T, E> Deserialize<'de> for StarOr<T>
where
T: FromStr<Err = E>,
E: Display,
{
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
/// Serde can't differentiate between `StarOr::Star` and `StarOr::Other` without a tag.
/// Simply using `#[serde(untagged)]` + `#[serde(rename="*")]` will lead to attempting to
/// deserialize everything as a `StarOr::Other`, including "*".
/// [`#[serde(other)]`](https://serde.rs/variant-attrs.html#other) might have helped but is
/// not supported on untagged enums.
struct StarOrVisitor<T>(PhantomData<T>);
impl<'de, T, FE> Visitor<'de> for StarOrVisitor<T>
where
T: FromStr<Err = FE>,
FE: Display,
{
type Value = StarOr<T>;
fn expecting(&self, formatter: &mut Formatter) -> std::fmt::Result {
formatter.write_str("a string")
}
fn visit_str<SE>(self, v: &str) -> Result<Self::Value, SE>
where
SE: serde::de::Error,
{
match v {
"*" => Ok(StarOr::Star),
v => {
let other = FromStr::from_str(v).map_err(|e: T::Err| {
SE::custom(format!("Invalid `other` value: {}", e))
})?;
Ok(StarOr::Other(other))
}
}
}
}
deserializer.deserialize_str(StarOrVisitor(PhantomData))
}
}

View file

@ -0,0 +1,338 @@
use std::fs::{self, File};
use std::io::{BufRead, BufReader, ErrorKind};
use std::path::Path;
use serde::{Deserialize, Serialize};
use tempfile::TempDir;
use time::OffsetDateTime;
use uuid::Uuid;
pub mod errors;
pub mod keys;
pub mod meta;
pub mod settings;
pub mod tasks;
use self::meta::{DumpMeta, IndexMeta, IndexUuid};
use super::compat::v4_to_v5::CompatV4ToV5;
use crate::{Error, IndexMetadata, Result, Version};
pub type Document = serde_json::Map<String, serde_json::Value>;
pub type Settings<T> = settings::Settings<T>;
pub type Checked = settings::Checked;
pub type Unchecked = settings::Unchecked;
pub type Task = tasks::Task;
pub type Key = keys::Key;
// everything related to the settings
pub type Setting<T> = settings::Setting<T>;
// everything related to the api keys
pub type Action = keys::Action;
// everything related to the errors
pub type ResponseError = errors::ResponseError;
pub type Code = errors::Code;
#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Metadata {
db_version: String,
index_db_size: usize,
update_db_size: usize,
#[serde(with = "time::serde::rfc3339")]
dump_date: OffsetDateTime,
}
pub struct V4Reader {
dump: TempDir,
metadata: Metadata,
tasks: BufReader<File>,
keys: BufReader<File>,
index_uuid: Vec<IndexUuid>,
}
impl V4Reader {
pub fn open(dump: TempDir) -> Result<Self> {
let meta_file = fs::read(dump.path().join("metadata.json"))?;
let metadata = serde_json::from_reader(&*meta_file)?;
let index_uuid = File::open(dump.path().join("index_uuids/data.jsonl"))?;
let index_uuid = BufReader::new(index_uuid);
let index_uuid = index_uuid
.lines()
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) })
.collect::<Result<Vec<_>>>()?;
Ok(V4Reader {
metadata,
tasks: BufReader::new(
File::open(dump.path().join("updates").join("data.jsonl")).unwrap(),
),
keys: BufReader::new(File::open(dump.path().join("keys"))?),
index_uuid,
dump,
})
}
pub fn to_v5(self) -> CompatV4ToV5 {
CompatV4ToV5::new(self)
}
pub fn version(&self) -> Version {
Version::V4
}
pub fn date(&self) -> Option<OffsetDateTime> {
Some(self.metadata.dump_date)
}
pub fn instance_uid(&self) -> Result<Option<Uuid>> {
match fs::read_to_string(self.dump.path().join("instance-uid")) {
Ok(uuid) => Ok(Some(Uuid::parse_str(&uuid)?)),
Err(e) if e.kind() == ErrorKind::NotFound => Ok(None),
Err(e) => Err(e.into()),
}
}
pub fn indexes(&self) -> Result<impl Iterator<Item = Result<V4IndexReader>> + '_> {
Ok(self.index_uuid.iter().map(|index| -> Result<_> {
V4IndexReader::new(
index.uid.clone(),
&self.dump.path().join("indexes").join(index.index_meta.uuid.to_string()),
&index.index_meta,
BufReader::new(
File::open(self.dump.path().join("updates").join("data.jsonl")).unwrap(),
),
)
}))
}
pub fn tasks(
&mut self,
) -> Box<dyn Iterator<Item = Result<(Task, Option<Box<super::UpdateFile>>)>> + '_> {
Box::new((&mut self.tasks).lines().map(|line| -> Result<_> {
let task: Task = serde_json::from_str(&line?)?;
if !task.is_finished() {
if let Some(uuid) = task.get_content_uuid() {
let update_file_path = self
.dump
.path()
.join("updates")
.join("updates_files")
.join(uuid.to_string());
Ok((
task,
Some(
Box::new(UpdateFile::new(&update_file_path)?) as Box<super::UpdateFile>
),
))
} else {
Ok((task, None))
}
} else {
Ok((task, None))
}
}))
}
pub fn keys(&mut self) -> Box<dyn Iterator<Item = Result<Key>> + '_> {
Box::new(
(&mut self.keys).lines().map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }),
)
}
}
pub struct V4IndexReader {
metadata: IndexMetadata,
settings: Settings<Checked>,
documents: BufReader<File>,
}
impl V4IndexReader {
pub fn new(
name: String,
path: &Path,
index_metadata: &IndexMeta,
tasks: BufReader<File>,
) -> Result<Self> {
let meta = File::open(path.join("meta.json"))?;
let meta: DumpMeta = serde_json::from_reader(meta)?;
let mut created_at = None;
let mut updated_at = None;
for line in tasks.lines() {
let task: Task = serde_json::from_str(&line?)?;
if task.index_uid.to_string() == name {
// The first task to match our index_uid that succeeded (ie. processed_at returns Some)
// is our `last_updated_at`.
if updated_at.is_none() {
updated_at = task.processed_at()
}
// Once we reach the `creation_task_id` we can stop iterating on the task queue and
// this task represents our `created_at`.
if task.id as usize == index_metadata.creation_task_id {
created_at = task.created_at();
break;
}
}
}
let current_time = OffsetDateTime::now_utc();
let metadata = IndexMetadata {
uid: name,
primary_key: meta.primary_key,
created_at: created_at.unwrap_or(current_time),
updated_at: updated_at.unwrap_or(current_time),
};
let ret = V4IndexReader {
metadata,
settings: meta.settings.check(),
documents: BufReader::new(File::open(path.join("documents.jsonl"))?),
};
Ok(ret)
}
pub fn metadata(&self) -> &IndexMetadata {
&self.metadata
}
pub fn documents(&mut self) -> Result<impl Iterator<Item = Result<Document>> + '_> {
Ok((&mut self.documents)
.lines()
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
}
pub fn settings(&mut self) -> Result<Settings<Checked>> {
Ok(self.settings.clone())
}
}
pub struct UpdateFile {
reader: BufReader<File>,
}
impl UpdateFile {
fn new(path: &Path) -> Result<Self> {
Ok(UpdateFile { reader: BufReader::new(File::open(path)?) })
}
}
impl Iterator for UpdateFile {
type Item = Result<Document>;
fn next(&mut self) -> Option<Self::Item> {
(&mut self.reader)
.lines()
.map(|line| {
line.map_err(Error::from)
.and_then(|line| serde_json::from_str(&line).map_err(Error::from))
})
.next()
}
}
#[cfg(test)]
pub(crate) mod test {
use std::fs::File;
use std::io::BufReader;
use flate2::bufread::GzDecoder;
use meili_snap::insta;
use tempfile::TempDir;
use super::*;
#[test]
fn read_dump_v4() {
let dump = File::open("tests/assets/v4.dump").unwrap();
let dir = TempDir::new().unwrap();
let mut dump = BufReader::new(dump);
let gz = GzDecoder::new(&mut dump);
let mut archive = tar::Archive::new(gz);
archive.unpack(dir.path()).unwrap();
let mut dump = V4Reader::open(dir).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();
let (tasks, mut update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"f4efacbea0c1a4400873f4b2ee33f975");
assert_eq!(update_files.len(), 10);
assert!(update_files[0].is_some()); // the enqueued document addition
assert!(update_files[1..].iter().all(|u| u.is_none())); // everything already processed
let update_file = update_files.remove(0).unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(update_file), @"7b8889539b669c7b9ddba448bafa385d");
// keys
let keys = dump.keys().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot_hash!(meili_snap::json_string!(keys, { "[].uid" => "[uuid]" }), @"9240300dca8f962cdf58359ef4c76f09");
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the index are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut products = indexes.pop().unwrap();
let mut movies = indexes.pop().unwrap();
let mut spells = indexes.pop().unwrap();
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "2022-10-06T12:53:39.360187055Z",
"updatedAt": "2022-10-06T12:53:40.603035979Z"
}
"###);
insta::assert_json_snapshot!(products.settings().unwrap());
let documents = products.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"b01c8371aea4c7171af0d4d846a2bdca");
// movies
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "2022-10-06T12:53:38.710611568Z",
"updatedAt": "2022-10-06T12:53:49.785862546Z"
}
"###);
insta::assert_json_snapshot!(movies.settings().unwrap());
let documents = movies.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 110);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"786022a66ecb992c8a2a60fee070a5ab");
// spells
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "2022-10-06T12:53:40.831649057Z",
"updatedAt": "2022-10-06T12:53:41.116036186Z"
}
"###);
insta::assert_json_snapshot!(spells.settings().unwrap());
let documents = spells.documents().unwrap().collect::<Result<Vec<_>>>().unwrap();
assert_eq!(documents.len(), 10);
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"235016433dd04262c7f2da01d1e808ce");
}
}

View file

@ -0,0 +1,262 @@
use std::collections::{BTreeMap, BTreeSet};
use std::marker::PhantomData;
use std::num::NonZeroUsize;
use serde::{Deserialize, Deserializer};
#[cfg(test)]
fn serialize_with_wildcard<S>(
field: &Setting<Vec<String>>,
s: S,
) -> std::result::Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
use serde::Serialize;
let wildcard = vec!["*".to_string()];
match field {
Setting::Set(value) => Some(value),
Setting::Reset => Some(&wildcard),
Setting::NotSet => None,
}
.serialize(s)
}
#[derive(Clone, Default, Debug, PartialEq, Eq)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct Checked;
#[derive(Clone, Default, Debug, Deserialize, PartialEq, Eq)]
#[cfg_attr(test, derive(serde::Serialize))]
pub struct Unchecked;
#[derive(Debug, Clone, Default, Deserialize, PartialEq, Eq)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(deny_unknown_fields)]
#[serde(rename_all = "camelCase")]
pub struct MinWordSizeTyposSetting {
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub one_typo: Setting<u8>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub two_typos: Setting<u8>,
}
#[derive(Debug, Clone, Default, Deserialize, PartialEq, Eq)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(deny_unknown_fields)]
#[serde(rename_all = "camelCase")]
pub struct TypoSettings {
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub enabled: Setting<bool>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub min_word_size_for_typos: Setting<MinWordSizeTyposSetting>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub disable_on_words: Setting<BTreeSet<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub disable_on_attributes: Setting<BTreeSet<String>>,
}
/// Holds all the settings for an index. `T` can either be `Checked` if they represents settings
/// whose validity is guaranteed, or `Unchecked` if they need to be validated. In the later case, a
/// call to `check` will return a `Settings<Checked>` from a `Settings<Unchecked>`.
#[derive(Debug, Clone, Default, Deserialize, PartialEq, Eq)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(deny_unknown_fields)]
#[serde(rename_all = "camelCase")]
#[serde(bound(serialize = "T: serde::Serialize", deserialize = "T: Deserialize<'static>"))]
pub struct Settings<T> {
#[serde(
default,
serialize_with = "serialize_with_wildcard",
skip_serializing_if = "Setting::is_not_set"
)]
pub displayed_attributes: Setting<Vec<String>>,
#[serde(
default,
serialize_with = "serialize_with_wildcard",
skip_serializing_if = "Setting::is_not_set"
)]
pub searchable_attributes: Setting<Vec<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub filterable_attributes: Setting<BTreeSet<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub sortable_attributes: Setting<BTreeSet<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub ranking_rules: Setting<Vec<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub stop_words: Setting<BTreeSet<String>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub synonyms: Setting<BTreeMap<String, Vec<String>>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub distinct_attribute: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
pub typo_tolerance: Setting<TypoSettings>,
#[serde(skip)]
pub _kind: PhantomData<T>,
}
impl Settings<Checked> {
pub fn cleared() -> Settings<Checked> {
Settings {
displayed_attributes: Setting::Reset,
searchable_attributes: Setting::Reset,
filterable_attributes: Setting::Reset,
sortable_attributes: Setting::Reset,
ranking_rules: Setting::Reset,
stop_words: Setting::Reset,
synonyms: Setting::Reset,
distinct_attribute: Setting::Reset,
typo_tolerance: Setting::Reset,
_kind: PhantomData,
}
}
pub fn into_unchecked(self) -> Settings<Unchecked> {
let Self {
displayed_attributes,
searchable_attributes,
filterable_attributes,
sortable_attributes,
ranking_rules,
stop_words,
synonyms,
distinct_attribute,
typo_tolerance,
..
} = self;
Settings {
displayed_attributes,
searchable_attributes,
filterable_attributes,
sortable_attributes,
ranking_rules,
stop_words,
synonyms,
distinct_attribute,
typo_tolerance,
_kind: PhantomData,
}
}
}
impl Settings<Unchecked> {
pub fn check(self) -> Settings<Checked> {
let displayed_attributes = match self.displayed_attributes {
Setting::Set(fields) => {
if fields.iter().any(|f| f == "*") {
Setting::Reset
} else {
Setting::Set(fields)
}
}
otherwise => otherwise,
};
let searchable_attributes = match self.searchable_attributes {
Setting::Set(fields) => {
if fields.iter().any(|f| f == "*") {
Setting::Reset
} else {
Setting::Set(fields)
}
}
otherwise => otherwise,
};
Settings {
displayed_attributes,
searchable_attributes,
filterable_attributes: self.filterable_attributes,
sortable_attributes: self.sortable_attributes,
ranking_rules: self.ranking_rules,
stop_words: self.stop_words,
synonyms: self.synonyms,
distinct_attribute: self.distinct_attribute,
typo_tolerance: self.typo_tolerance,
_kind: PhantomData,
}
}
}
#[allow(dead_code)] // otherwise rustc complains that the fields go unused
#[derive(Debug, Clone, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(deny_unknown_fields)]
#[serde(rename_all = "camelCase")]
pub struct Facets {
pub level_group_size: Option<NonZeroUsize>,
pub min_level_size: Option<NonZeroUsize>,
}
#[derive(Debug, Clone, PartialEq, Eq, Copy)]
pub enum Setting<T> {
Set(T),
Reset,
NotSet,
}
impl<T> Default for Setting<T> {
fn default() -> Self {
Self::NotSet
}
}
impl<T> Setting<T> {
pub fn set(self) -> Option<T> {
match self {
Self::Set(value) => Some(value),
_ => None,
}
}
pub const fn as_ref(&self) -> Setting<&T> {
match *self {
Self::Set(ref value) => Setting::Set(value),
Self::Reset => Setting::Reset,
Self::NotSet => Setting::NotSet,
}
}
pub const fn is_not_set(&self) -> bool {
matches!(self, Self::NotSet)
}
/// If `Self` is `Reset`, then map self to `Set` with the provided `val`.
pub fn or_reset(self, val: T) -> Self {
match self {
Self::Reset => Self::Set(val),
otherwise => otherwise,
}
}
}
#[cfg(test)]
impl<T: serde::Serialize> serde::Serialize for Setting<T> {
fn serialize<S>(&self, serializer: S) -> std::result::Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
match self {
Self::Set(value) => Some(value),
// Usually not_set isn't serialized by setting skip_serializing_if field attribute
Self::NotSet | Self::Reset => None,
}
.serialize(serializer)
}
}
impl<'de, T: Deserialize<'de>> Deserialize<'de> for Setting<T> {
fn deserialize<D>(deserializer: D) -> std::result::Result<Self, D::Error>
where
D: Deserializer<'de>,
{
Deserialize::deserialize(deserializer).map(|x| match x {
Some(x) => Self::Set(x),
None => Self::Reset, // Reset is forced by sending null value
})
}
}

Some files were not shown because too many files have changed in this diff Show more