MeiliSearch

mirror of https://github.com/meilisearch/MeiliSearch synced 2025-07-04 04:17:10 +02:00

bors[bot] 293a246af8 Merge #601	2022-08-16 11:57:09 +00:00

601: Introduce snapshot tests r=Kerollmops a=loiclec

# Pull Request

## What does this PR do?
Introduce snapshot tests into milli, by using the `insta` crate. This implements the idea described by #597

See: [insta.rs](https://insta.rs)

## Design
There is now a new file, `snapshot_tests.rs`, which is compiled only under `#[cfg(test)]`. It exposes the `db_snap!` macro, which is used to snapshot the content of a database.

When running `cargo test`, `insta` will check that the value of the current snapshot is the same as the previous one (on the file system). If they are the same, the test passes. If they are different, the test fails and you are asked to review the new snapshot to approve or reject it.

We don't want to save very large snapshots to the file system, because it will pollute the git repository and increase its size too much. Instead, we only save their `md5` hashes under the name `<snapshot_name>.hash.snap`. There is a new environment variable called `MILLI_TEST_FULL_SNAPS` which can be set to `true` in order to also save the full content of the snapshot under the name `<snapshot_name>.full.snap`. However, snapshots with the extension `.full.snap` are never saved to the git repository.

## Example
```rust
// In e.g. facets.rs
#[test]
fn my_test() {
    // create an index
    let index = TempIndex::new();

    index.add_documents(...);
    index.update_settings(|settings| ...);

    // then snapshot the content of one of its databases
    // the snapshot will be saved at the current folder under facets.rs/my_test/facet_id_string_docids.snap
    db_snap!(index, facet_id_string_docids);

    index.add_documents(...);

    // we can also name the snapshot to ensure there is no conflict
    // this snapshot will be saved at facets.rs/my_test/updated/facet_id_string_docids.snap
    db_snap!(index, facet_id_string_docids, "updated");

    // and we can also use "inline" snapshots, which insert their content in the given string literal
    db_snap!(index, field_distributions, @"");
    // once the snapshot is approved, it will automatically get transformed to, e.g.:
    // db_snap!(index, field_distributions, @"
    //     my_facet 21
    //     other_field 3
    // ");

    // now let's add many documents
    index.add_documents(...);

    // because the snapshot is too big, its hash is saved instead
    // if the MILLI_TEST_FULL_SNAPS env variable is set to true, then the full snapshot will also be saved
    // at facets.rs/my_test/large/facet_id_string_docids.full.snap
    db_snap!(index, facet_id_string_docids, "large", @"5348bbc46b5384455b6a900666d2a502");
}
```

Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>

.github	deny warnings in CI	2022-04-28 15:35:12 +02:00
benchmarks	Update version for next release (v0.32.0)	2022-07-21 13:20:02 +04:00
cli	Update version for next release (v0.32.0)	2022-07-21 13:20:02 +04:00
filter-parser	Update filter-parser/fuzz/.gitignore	2022-07-21 16:12:01 +02:00
flatten-serde-json	Merge branch 'filter/field-exist'	2022-07-21 14:51:41 +02:00
helpers	Update version for next release (v0.32.0)	2022-07-21 13:20:02 +04:00
http-ui	Update version for next release (v0.32.0)	2022-07-21 13:20:02 +04:00
infos	Merge branch 'filter/field-exist'	2022-07-21 14:51:41 +02:00
json-depth-checker	Update version for next release (v0.32.0)	2022-07-21 13:20:02 +04:00
milli	Add type annotations to remove compiler error	2022-08-16 09:19:30 +02:00
script	format the whole project	2021-06-16 18:33:33 +02:00
.gitignore	Add snapshot tests for Facets::execute	2022-08-10 15:53:46 +02:00
.rustfmt.toml	format the whole project	2021-06-16 18:33:33 +02:00
bors.toml	Update bors toml	2022-04-26 17:36:04 +02:00
Cargo.toml	create the json-depth-checker crate	2022-04-14 11:14:08 +02:00
CONTRIBUTING.md	Remove the wip section part of the contributing file	2022-05-04 14:44:51 +02:00
LICENSE	Update LICENSE	2022-02-15 15:52:50 +01:00
README.md	Update README.md	2022-04-25 18:14:43 +02:00

README.md

a concurrent indexer combined with fast and relevant search algorithms

Introduction

This repository contains the core engine used in Meilisearch.

It provides a library that manages exactly one index; Meilisearch handles multiple indexes on top of it. Milli does not store pending updates itself: queuing them is the job of the layer above, which is why milli processes only one update at a time.
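
To make this division of labour concrete, here is a minimal sketch of a layer that owns several single-index engines and queues updates for them. Every name in it (`SingleIndex`, `MultiIndex`, `Update`) is hypothetical and chosen for illustration; it is not the API of milli or Meilisearch.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical update kinds, for illustration only.
enum Update {
    AddDocument(String),
    Clear,
}

/// Hypothetical stand-in for a single-index engine.
#[derive(Default)]
struct SingleIndex {
    documents: Vec<String>,
}

impl SingleIndex {
    /// Taking `&mut self` means two updates can never run at once on one index.
    fn apply(&mut self, update: Update) {
        match update {
            Update::AddDocument(doc) => self.documents.push(doc),
            Update::Clear => self.documents.clear(),
        }
    }
}

/// The layer above: owns several indexes and the queue of pending updates.
#[derive(Default)]
struct MultiIndex {
    indexes: HashMap<String, SingleIndex>,
    pending: VecDeque<(String, Update)>,
}

impl MultiIndex {
    /// Store the update; nothing is applied yet.
    fn enqueue(&mut self, index_name: &str, update: Update) {
        self.pending.push_back((index_name.to_string(), update));
    }

    /// Drain the queue in order, handing each index one update at a time.
    /// Returns the number of updates processed.
    fn process_all(&mut self) -> usize {
        let mut processed = 0;
        while let Some((name, update)) = self.pending.pop_front() {
            self.indexes.entry(name).or_default().apply(update);
            processed += 1;
        }
        processed
    }
}

fn main() {
    let mut engine = MultiIndex::default();
    engine.enqueue("books", Update::AddDocument("Wonderland".into()));
    engine.enqueue("movies", Update::AddDocument("Alien".into()));
    engine.enqueue("movies", Update::Clear);
    engine.enqueue("books", Update::AddDocument("Le Petit Prince".into()));

    let processed = engine.process_all();
    println!("processed {processed} updates");
    println!("books holds {} documents", engine.indexes["books"].documents.len());
}
```

The queue lives entirely outside the index type, which mirrors the split described above: the index only knows how to apply an update it is handed.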

This repository contains crates to quickly debug the engine:

There are benchmarks located in the benchmarks crate.
The cli crate is a simple command-line interface that helps run flamegraph on top of the engine.
The filter-parser crate contains the parser for the Meilisearch filter syntax.
The flatten-serde-json crate contains the library that flattens serde-json Value objects like Elasticsearch does.
The helpers crate is only used to do operations on the database.
The http-ui crate is a simple HTTP dashboard for testing the features in realistic conditions.
The infos crate is used to dump the internal data-structure and ensure correctness.
The json-depth-checker crate is used to indicate whether a JSON document must be flattened.
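
As a rough illustration of what "flattening" and "depth checking" mean here, the sketch below turns nested objects into dot-separated keys and reports whether a document contains any nested object. It is a simplified model under stated assumptions: the `Json` enum, `needs_flattening`, and `flatten` are hypothetical names, arrays are copied unchanged, and the real crates operate on serde_json values and may differ in their exact rules.

```rust
use std::collections::BTreeMap;

/// Minimal JSON value type so the sketch needs no external crate.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Number(f64),
    Text(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Depth check (simplified): a document needs flattening if any
/// top-level field holds an object.
fn needs_flattening(doc: &BTreeMap<String, Json>) -> bool {
    doc.values().any(|value| matches!(value, Json::Object(_)))
}

/// Flatten nested objects into dot-separated keys, e.g.
/// {"author": {"name": "x"}} becomes {"author.name": "x"}.
fn flatten(doc: &BTreeMap<String, Json>) -> BTreeMap<String, Json> {
    let mut out = BTreeMap::new();
    for (key, value) in doc {
        insert_flat(key, value, &mut out);
    }
    out
}

/// Recursive helper: objects are expanded, every other value is copied as-is.
fn insert_flat(prefix: &str, value: &Json, out: &mut BTreeMap<String, Json>) {
    match value {
        Json::Object(fields) => {
            for (key, inner) in fields {
                insert_flat(&format!("{prefix}.{key}"), inner, out);
            }
        }
        other => {
            out.insert(prefix.to_string(), other.clone());
        }
    }
}

fn main() {
    let mut author = BTreeMap::new();
    author.insert("name".to_string(), Json::Text("Lewis Carroll".to_string()));
    author.insert("born".to_string(), Json::Number(1832.0));

    let mut doc = BTreeMap::new();
    doc.insert("id".to_string(), Json::Number(1.0));
    doc.insert("author".to_string(), Json::Object(author));
    doc.insert(
        "tags".to_string(),
        Json::Array(vec![Json::Text("fantasy".to_string()), Json::Null]),
    );

    println!("needs flattening: {}", needs_flattening(&doc));
    for (key, value) in flatten(&doc) {
        println!("{key}: {value:?}");
    }
}
```

Running the check first lets a caller skip the flattening pass for documents that are already flat, which is presumably why the two concerns live in separate crates.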

How to use it?

Milli is a search library and must be embedded in a program. You can build and open its documentation with cargo doc --open.

Here is an example usage of the library where we insert documents into the engine and search for one of them right after.

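// Assumes EnvOpenOptions, Index, Search, IndexDocuments, IndexDocumentsConfig,
// IndexerConfig and the documents! macro are already in scope.
let path = tempfile::tempdir().unwrap();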
let mut options = EnvOpenOptions::new();
options.map_size(10 * 1024 * 1024); // 10 MB
let index = Index::new(options, &path).unwrap();

let mut wtxn = index.write_txn().unwrap();
let content = documents!([
    {
        "id": 2,
        "title": "Pride and Prejudice",
        "author": "Jane Austen",
        "genre": "romance",
        "price$": "3.5$",
    },
    {
        "id": 456,
        "title": "Le Petit Prince",
        "author": "Antoine de Saint-Exupéry",
        "genre": "adventure",
        "price$": "10.0$",
    },
    {
        "id": 1,
        "title": "Wonderland",
        "author": "Lewis Carroll",
        "genre": "fantasy",
        "price$": "25.99$",
    },
    {
        "id": 4,
        "title": "Harry Potter and the Half-Blood Prince",
        "author": "J. K. Rowling",
        "genre": "fantasy",
    },
]);

let config = IndexerConfig::default();
let indexing_config = IndexDocumentsConfig::default();
let mut builder =
    IndexDocuments::new(&mut wtxn, &index, &config, indexing_config.clone(), |_| ())
        .unwrap();
builder.add_documents(content).unwrap();
builder.execute().unwrap();
wtxn.commit().unwrap();

// You can search in the index now!
let rtxn = index.read_txn().unwrap();
let mut search = Search::new(&rtxn, &index);
// "horry" is a deliberate misspelling: typo tolerance still matches "Harry".
search.query("horry");
search.limit(10);

let result = search.execute().unwrap();
assert_eq!(result.documents_ids.len(), 1);
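
The query "horry" in the example above returns one document because it is a single character substitution away from "harry". The sketch below computes that edit distance with the textbook dynamic-programming algorithm. It only illustrates the notion of a typo; it is not how milli implements typo tolerance, and `edit_distance` is a hypothetical helper, not part of the library.

```rust
/// Classic dynamic-programming Levenshtein distance over Unicode scalar values.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // `prev[j]` is the distance between the processed prefix of `a`
    // and the first `j` characters of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for (i, ca) in a.iter().enumerate() {
        // Distance from the first `i + 1` characters of `a` to the empty string.
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }

    prev[b.len()]
}

fn main() {
    let query = "horry";
    for word in ["harry", "potter", "prince"] {
        println!("{query} -> {word}: {}", edit_distance(query, word));
    }
}
```

A search engine that tolerates typos accepts index words within a small distance of the query word, so "harry" qualifies for the query "horry" while unrelated words do not.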

Contributing

We're glad you're thinking about contributing to this repository! Feel free to pick an issue and to ask any questions you have. Some points might not be clear, and we are available to help you!

Also, we recommend following CONTRIBUTING.md when creating your PR.