bors[bot] 2000f7958d
Merge #604
604: Speed up debug builds r=Kerollmops a=loiclec

Note: this draft PR is based on https://github.com/meilisearch/milli/pull/601, for no particular reason.

## What does this PR do?
Make a series of changes with the goal of speeding up debug builds:

1. Add an `all_languages` feature which compiles charabia with its `default` features activated.
The `all_languages` feature is activated by default, but running:
```
cargo build --no-default-features
```
on `milli` is now much faster.

2. Reduce the debug optimisation level from 3 to 0, except for a few critical dependencies.

3. Compile the build dependencies faster as well. Previously, all build dependencies were compiled with `opt-level = 3`. Now, only the critical build dependencies are compiled with optimisations.

4. Reduce the amount of code generated by the `documents!` macro.

5. Make the "progress update" closure provided to indexing functions a trait object instead of a generic parameter. This avoids monomorphising the indexing code multiple times needlessly.
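
To illustrate point 5, here is a minimal, self-contained sketch of the difference between a generic closure parameter and a trait object. The names `Progress`, `index_generic` and `index_dyn` are hypothetical and are not milli's actual types or functions; the sketch only shows the general technique.

```rust
// Hypothetical stand-in for the value passed to the "progress update" closure.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Progress {
    current: usize,
    total: usize,
}

// Generic version: the compiler emits one copy of this function for every
// distinct closure type `F` it is called with (monomorphisation).
fn index_generic<F: Fn(Progress)>(docs: &[&str], progress: F) -> usize {
    for (i, _doc) in docs.iter().enumerate() {
        progress(Progress { current: i + 1, total: docs.len() });
    }
    docs.len()
}

// Trait-object version: compiled once, the closure is called through
// dynamic dispatch whatever its concrete type is.
fn index_dyn(docs: &[&str], progress: &dyn Fn(Progress)) -> usize {
    for (i, _doc) in docs.iter().enumerate() {
        progress(Progress { current: i + 1, total: docs.len() });
    }
    docs.len()
}
```

A caller passes the closure by reference, for example `index_dyn(&docs, &|p: Progress| println!("{p:?}"))`. The cost is one indirect call per progress update, which is usually negligible next to the indexing work itself.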

## Results
Initial build times on my computer before and after these changes:
|        | cargo check | cargo check --no-default-features | cargo test | cargo test --lib | cargo test --no-default-features | cargo test --lib --no-default-features |
|--------|-------------|-----------------------------------|------------|------------------|----------------------------------|----------------------------------------|
| before | 1m05s       | 1m05s                             | 2m06s      | 1m47s            | 2m06s                            | 1m47s                                  |
| after  | 28.9s       | 13.1s                             | 40s        | 38s              | 23s                              | 21s                                    |



Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>
2022-10-12 08:54:48 +00:00
.github Upgrade ubuntu-18.04 to 20.04 2022-09-08 14:58:06 +02:00
assets chore: move logo to (new) assets folder 2022-10-04 12:20:24 +02:00
benchmarks Make milli's default features optional in other executable targets 2022-10-12 09:22:05 +02:00
cli Make milli's default features optional in other executable targets 2022-10-12 09:22:05 +02:00
filter-parser Update version for the next release (v0.33.4) in Cargo.toml files 2022-09-13 13:55:50 +00:00
flatten-serde-json Update version for the next release (v0.33.4) in Cargo.toml files 2022-09-13 13:55:50 +00:00
json-depth-checker Update version for the next release (v0.33.4) in Cargo.toml files 2022-09-13 13:55:50 +00:00
milli Simplify documents! macro to reduce compile times 2022-10-12 09:22:05 +02:00
script format the whole project 2021-06-16 18:33:33 +02:00
.gitignore chore: move logo to (new) assets folder 2022-10-04 12:20:24 +02:00
.rustfmt.toml format the whole project 2021-06-16 18:33:33 +02:00
bors.toml Upgrade ubuntu-18.04 to 20.04 2022-09-08 14:58:06 +02:00
Cargo.toml Optimize a few performance sensitive dependencies on debug builds 2022-10-12 09:22:05 +02:00
CONTRIBUTING.md Update CONTRIBUTING.md 2022-10-05 19:19:03 +02:00
LICENSE Update LICENSE 2022-02-15 15:52:50 +01:00
README.md chore: move logo to (new) assets folder 2022-10-04 12:20:24 +02:00


a concurrent indexer combined with fast and relevant search algorithms

Introduction

This repository contains the core engine used in Meilisearch.

It contains a library that manages one and only one index; Meilisearch handles multiple indexes itself. Milli does not store pending updates: that is the job of a layer above it, which is why milli can only process one update at a time.

This repository contains crates to quickly debug the engine:

  • There are benchmarks located in the benchmarks crate.
  • The cli crate is a simple command-line interface that helps run flamegraph on top of it.
  • The filter-parser crate contains the parser for the Meilisearch filter syntax.
  • The flatten-serde-json crate contains the library that flattens serde-json Value objects like Elasticsearch does.
  • The json-depth-checker crate is used to indicate whether a JSON document must be flattened.
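
As a rough illustration of what flattening means for the last two crates, here is a simplified sketch. It does not use the real API of flatten-serde-json or json-depth-checker: the `Value` enum is a hypothetical stand-in for `serde_json::Value`, the `flatten` and `needs_flattening` functions are invented for this example, and only nested objects are handled (arrays are left out).

```rust
use std::collections::BTreeMap;

// Hypothetical, minimal JSON-like value used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(i64),
    Text(String),
    Object(BTreeMap<String, Value>),
}

// Copies every leaf value into `out`, joining nested keys with a dot.
// Empty nested objects produce no entry in this simplified version.
fn flatten_into(prefix: &str, value: &Value, out: &mut BTreeMap<String, Value>) {
    match value {
        Value::Object(map) => {
            for (key, child) in map {
                let path = format!("{prefix}.{key}");
                flatten_into(&path, child, out);
            }
        }
        leaf => {
            out.insert(prefix.to_string(), leaf.clone());
        }
    }
}

// Turns `{"author": {"name": "x"}}` into `{"author.name": "x"}`.
fn flatten(document: &BTreeMap<String, Value>) -> BTreeMap<String, Value> {
    let mut out = BTreeMap::new();
    for (key, value) in document {
        flatten_into(key, value, &mut out);
    }
    out
}

// A document only needs flattening when a field holds a nested object.
fn needs_flattening(document: &BTreeMap<String, Value>) -> bool {
    document.values().any(|value| matches!(value, Value::Object(_)))
}
```

Checking `needs_flattening` first lets a caller skip the extra allocation for documents that are already flat, which is the kind of shortcut a depth check makes possible.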

How to use it?

Milli is a search library and must be embedded in a program. You can build and open its documentation by running cargo doc --open.

Here is an example of using the library: we insert documents into the engine and search for one of them right after.

let path = tempfile::tempdir().unwrap();
let mut options = EnvOpenOptions::new();
options.map_size(10 * 1024 * 1024); // 10 MB
let index = Index::new(options, &path).unwrap();

let mut wtxn = index.write_txn().unwrap();
let content = documents!([
    {
        "id": 2,
        "title": "Pride and Prejudice",
        "author": "Jane Austen",
        "genre": "romance",
        "price$": "3.5$",
    },
    {
        "id": 456,
        "title": "Le Petit Prince",
        "author": "Antoine de Saint-Exupéry",
        "genre": "adventure",
        "price$": "10.0$",
    },
    {
        "id": 1,
        "title": "Wonderland",
        "author": "Lewis Carroll",
        "genre": "fantasy",
        "price$": "25.99$",
    },
    {
        "id": 4,
        "title": "Harry Potter and the Half-Blood Prince",
        "author": "J. K. Rowling",
        "genre": "fantasy",
    },
]);

let config = IndexerConfig::default();
let indexing_config = IndexDocumentsConfig::default();
let mut builder =
    IndexDocuments::new(&mut wtxn, &index, &config, indexing_config.clone(), |_| ())
        .unwrap();
builder.add_documents(content).unwrap();
builder.execute().unwrap();
wtxn.commit().unwrap();


// You can search in the index now!
let rtxn = index.read_txn().unwrap();
let mut search = Search::new(&rtxn, &index);
search.query("horry");
search.limit(10);

let result = search.execute().unwrap();
assert_eq!(result.documents_ids.len(), 1);
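
The query "horry" above returns one document even though no title contains that exact word, because the engine tolerates typos. As a rough intuition only, and not milli's actual implementation, the sketch below computes the classic Levenshtein edit distance; the `levenshtein` function is a hypothetical helper written for this example.

```rust
// Classic dynamic-programming Levenshtein distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // `prev[j]` is the distance between the prefix of `a` processed so far
    // and the first `j` characters of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}
```

With this helper, `levenshtein("horry", "harry")` is one substitution away, which is consistent with the search matching the Harry Potter document once words are compared in lowercase.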

Contributing

We're glad you're thinking about contributing to this repository! Feel free to pick an issue and ask any questions you have. Some points might not be clear, and we are available to help you!

We also recommend following CONTRIBUTING.md when creating your PR.