Go to file
bors[bot] 293a246af8
Merge #601
601: Introduce snapshot tests r=Kerollmops a=loiclec

# Pull Request
## What does this PR do?
Introduce snapshot tests into milli, by using the `insta` crate. This implements the idea described by #597 

See: [insta.rs](https://insta.rs)

## Design
There is now a new file, `snapshot_tests.rs`, which is compiled only under `#[cfg(test)]`. It exposes the `db_snap!` macro, which is used to snapshot the content of a database.

When running `cargo test`, `insta` will check that the value of the current snapshot is the same as the previous one (on the file system). If they are the same, the test passes. If they are different, the test fails and you are asked to review the new snapshot to approve or reject it.

We don't want to save very large snapshots to the file system, because it will pollute the git repository and increase its size too much. Instead, we only save their `md5` hashes under the name `<snapshot_name>.hash.snap`. There is a new environment variable called `MILLI_TEST_FULL_SNAPS` which can be set to `true` in order to *also* save the full content of the snapshot under the name `<snapshot_name>.full.snap`. However, snapshots with the extension `.full.snap` are never saved to the git repository.

## Example
```rust
// In e.g. facets.rs
#[test]
fn my_test() {
    // create an index
    let index = TempIndex::new():
    index.add_documents(...);
    index.update_settings(|settings| ...);
    
    // then snapshot the content of one of its databases
    // the snapshot will be saved at the current folder under facets.rs/my_test/facet_id_string_docids.snap
    db_snap!(index, facet_id_string_docids);

    index.add_documents(...);   

    // we can also name the snapshot to ensure there is no conflict
    // this snapshot will be saved at facets.rs/my_test/updated/facet_id_string_docids.snap
    db_snap!(index, facet_id_string, docids, "updated");
    
    // and we can also use "inline" snapshots, which insert their content in the given string literal
    db_snap!(index, field_distributions, `@"");`
    // once the snapshot is approved, it will automatically get transformed to, e.g.:
    // db_snap!(index, field_distributions, `@"`
    // my_facet        21
    // other_field     3
    // ");
    
    // now let's add **many** documents
    index.add_documents(...);
    
    // because the snapshot is too big, its hash is saved instead
    // if the MILLI_TEST_FULL_SNAPS env variable is set to true, then the full snapshot will also be saved
    // at facets.rs/my_test/large/facet_id_string_docids.full.snap
    db_snap!(index, facet_id_string_docids, "large", `@"5348bbc46b5384455b6a900666d2a502");`
}
```

Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>
2022-08-16 11:57:09 +00:00
.github deny warnings in CI 2022-04-28 15:35:12 +02:00
benchmarks Update version for next release (v0.32.0) 2022-07-21 13:20:02 +04:00
cli Update version for next release (v0.32.0) 2022-07-21 13:20:02 +04:00
filter-parser Update filter-parser/fuzz/.gitignore 2022-07-21 16:12:01 +02:00
flatten-serde-json Merge branch 'filter/field-exist' 2022-07-21 14:51:41 +02:00
helpers Update version for next release (v0.32.0) 2022-07-21 13:20:02 +04:00
http-ui Update version for next release (v0.32.0) 2022-07-21 13:20:02 +04:00
infos Merge branch 'filter/field-exist' 2022-07-21 14:51:41 +02:00
json-depth-checker Update version for next release (v0.32.0) 2022-07-21 13:20:02 +04:00
milli Add type annotations to remove compiler error 2022-08-16 09:19:30 +02:00
script format the whole project 2021-06-16 18:33:33 +02:00
.gitignore Add snapshot tests for Facets::execute 2022-08-10 15:53:46 +02:00
.rustfmt.toml format the whole project 2021-06-16 18:33:33 +02:00
bors.toml Update bors toml 2022-04-26 17:36:04 +02:00
Cargo.toml create the json-depth-checker crate 2022-04-14 11:14:08 +02:00
CONTRIBUTING.md Remove the wip section part of the contributing file 2022-05-04 14:44:51 +02:00
LICENSE Update LICENSE 2022-02-15 15:52:50 +01:00
README.md Update README.md 2022-04-25 18:14:43 +02:00

the milli logo

a concurrent indexer combined with fast and relevant search algorithms

Introduction

This repository contains the core engine used in Meilisearch.

It contains a library that can manage one and only one index. Meilisearch manages the multi-index itself. Milli is unable to store updates in a store: it is the job of something else above and this is why it is only able to process one update at a time.

This repository contains crates to quickly debug the engine:

  • There are benchmarks located in the benchmarks crate.
  • The cli crate is a simple command-line interface that helps run flamegraph on top of it.
  • The filter-parser crate contains the parser for the Meilisearch filter syntax.
  • The flatten-serde-json crate contains the library that flattens serde-json Value objects like Elasticsearch does.
  • The helpers crate is only used to do operations on the database.
  • The http-ui crate is a simple HTTP dashboard to test the features like for real!
  • The infos crate is used to dump the internal data-structure and ensure correctness.
  • The json-depth-checker crate is used to indicate if a JSON must be flattened.

How to use it?

Milli is a library that does search things, it must be embedded in a program. You can compute the documentation of it by using cargo doc --open.

Here is an example usage of the library where we insert documents into the engine and search for one of them right after.

let path = tempfile::tempdir().unwrap();
let mut options = EnvOpenOptions::new();
options.map_size(10 * 1024 * 1024); // 10 MB
let index = Index::new(options, &path).unwrap();

let mut wtxn = index.write_txn().unwrap();
let content = documents!([
    {
        "id": 2,
        "title": "Prideand Prejudice",
        "au{hor": "Jane Austin",
        "genre": "romance",
        "price$": "3.5$",
    },
    {
        "id": 456,
        "title": "Le Petit Prince",
        "au{hor": "Antoine de Saint-Exupéry",
        "genre": "adventure",
        "price$": "10.0$",
    },
    {
        "id": 1,
        "title": "Wonderland",
        "au{hor": "Lewis Carroll",
        "genre": "fantasy",
        "price$": "25.99$",
    },
    {
        "id": 4,
        "title": "Harry Potter ing fantasy\0lood Prince",
        "au{hor": "J. K. Rowling",
        "genre": "fantasy\0",
    },
]);

let config = IndexerConfig::default();
let indexing_config = IndexDocumentsConfig::default();
let mut builder =
    IndexDocuments::new(&mut wtxn, &index, &config, indexing_config.clone(), |_| ())
        .unwrap();
builder.add_documents(content).unwrap();
builder.execute().unwrap();
wtxn.commit().unwrap();


// You can search in the index now!
let mut rtxn = index.read_txn().unwrap();
let mut search = Search::new(&rtxn, &index);
search.query("horry");
search.limit(10);

let result = search.execute().unwrap();
assert_eq!(result.documents_ids.len(), 1);

Contributing

We're glad you're thinking about contributing to this repository! Feel free to pick an issue, and to ask any question you need. Some points might not be clear and we are available to help you!

Also, we recommend following the CONTRIBUTING.md to create your PR.