Go to file
meili-bors[bot] c9b3f80947
Merge #3780
3780: Be able to sort facet values by alpha or count r=dureuill a=Kerollmops

This PR introduces a new `sortFacetValuesBy` settings parameter to expose the facet distribution in either count or lexicographic/alpha order.

## Mini Spec of the `sortFacetValuesBy` Settings Parameter

This parameter can be set in the settings to change how the engine returns the facet values. There are two possible values to this parameter.

Please note that the current behavior changed a bit, and keys are returned in lexicographic order instead of undefined order. The previous order wasn't defined as we were using a `HashMap`, which returns entries in hash order (undefined), and we are now using an `IndexMap`, which returns them in insertion order (the order we actually want).

Also, note that there are performance issues when the dataset is enormous. Here are the timings of the engine running on my Macbook Pro M1 (16Go of RAM). [The dataset is 40 million songs file](https://www.notion.so/meilisearch/Songs-from-MusicBrainz-686e31b2bd3845898c7746f502a6e117), and the database size is about 50GiB. Even if you think 800ms is not that high, don't forget that the API is public, and anybody can ask for multiple facets in a single query.

| Search Kind | Get Facets | Max Values per Facet | Time for Alpha | Time for Count | Count but with #3788 |
|------------:|------------|----------------------|:--------------:|----------------|----------------------|
| Placeholder | genres     | default (100)        | 7ms            | 187ms          | 122ms                |
| Placeholder | genres     | 20                   | 6ms            | 124ms          | 75ms                 |
| Placeholder | album      | default (100)        | 9ms            | 808ms          | 677ms                |
| Placeholder | album      | 20                   | 8ms            | 579ms          | 446ms                |
| Placeholder | artist     | default (100)        | 9ms            | 462ms          | 344ms                |
| Placeholder | artist     | 20                   | 9ms            | 341ms          | 246ms                |

### Order Values in Alphanumeric Order

This is the default one. Values will be returned by lexicographic order, ascending from A to Z.

```bash
# First, update the settings
curl 'localhost:7700/indexes/movies/settings/facetting' \
  -H "Content-Type: application/json"  \
  -d '{ "sortFacetValuesBy": { "*": "alpha" } }'

# Then, ask for the facet distribution
curl 'localhost:7700/indexes/movies/search?facets=genres'
```

```json5
{
    "hits": [
        /* list of results */
    ],
    "query": "",
    "processingTimeMs": 0,
    "limit": 20,
    "offset": 0,
    "estimatedTotalHits": 1000,
    "facetDistribution": {
        "genres": {
            "Action": 3215,
            "Adventure": 1972,
            "Animation": 1577,
            "Comedy": 5883,
            "Crime": 1808,
            // ...
        }
    },
    "facetStats": {}
}
```

### Order Values in Count Order

Facet values are sorted by decreasing count. The count is the number of records containing this facet value in the query results.

```bash
# First, update the settings
curl 'localhost:7700/indexes/movies/settings/facetting' \
  -H "Content-Type: application/json"  \
  -d '{ "sortFacetValuesBy": { "*": "count" } }'

# Then, ask for the facet distribution
curl 'localhost:7700/indexes/movies/search?facets=genres'
```

```json5
{
    "hits": [
        /* list of results */
    ],
    "query": "",
    "processingTimeMs": 0,
    "limit": 20,
    "offset": 0,
    "estimatedTotalHits": 1000,
    "facetDistribution": {
        "genres": {
            "Drama": 7337,
            "Comedy": 5883,
            "Action": 3215,
            "Thriller": 3189,
            "Romance": 2507,
            // ...
        }
    },
    "facetStats": {}
}
```

## Todo List
 - [x] Add tests
 - [x] Send analytics when a user change the `sortFacetValuesBy`
 - [x] Create a prototype and announce it in https://github.com/meilisearch/product/discussions/519.

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-06-29 12:43:25 +00:00
.github Merge #3799 2023-06-20 13:35:33 +00:00
assets Update dashboard generated from grafana 2023-06-28 11:22:16 +02:00
benchmarks Allow to disable specialized tokenizations (again) 2023-05-04 15:45:40 +02:00
dump Add a non default value to the faceting settings of the dump tests 2023-06-29 14:33:33 +02:00
file-store Upgrade the compatible versions of the dependencies 2023-04-24 17:50:52 +02:00
filter-parser Merge #3571 2023-04-27 13:14:00 +00:00
flatten-serde-json Fix the tests to make flattening work 2023-03-15 14:12:34 +01:00
fuzzers Stop the fuzzer after an hour 2023-06-12 15:30:51 +02:00
index-scheduler Update the Vector Store product feature link 2023-06-27 12:32:42 +02:00
json-depth-checker Use the workspace inheritance feature of rust 1.64 2023-02-15 13:51:07 +01:00
meili-snap Upgrade the compatible versions of the dependencies 2023-04-24 17:50:52 +02:00
meilisearch Cargo fmt 2023-06-29 14:37:18 +02:00
meilisearch-auth Merge #3811 2023-06-06 13:10:24 +00:00
meilisearch-types Fix typos 2023-06-29 14:33:32 +02:00
milli Document that the LevelEntry fields order is important 2023-06-29 14:33:32 +02:00
permissive-json-pointer Use the workspace inheritance feature of rust 1.64 2023-02-15 13:51:07 +01:00
.dockerignore Revert "Improve docker cache" 2023-05-25 11:48:26 +02:00
.gitignore edit gitignore to ignore .idea and .vscode folders 2023-02-10 11:42:19 +04:00
.rustfmt.toml Introduce a rustfmt file 2022-10-27 11:35:05 +02:00
bors.toml Remove macos-latest and windows-latest usages 2022-12-20 11:10:09 +01:00
Cargo.lock Replace the BTreeMap by an IndexMap to return values in order 2023-06-29 14:33:31 +02:00
Cargo.toml move the fuzzer to its own crate 2023-05-29 12:27:39 +02:00
CODE_OF_CONDUCT.md Create CODE_OF_CONDUCT.md 2020-04-30 20:16:02 +02:00
config.toml Merge branch 'main' into tmp-release-v1.2.0 2023-06-05 18:36:28 +02:00
CONTRIBUTING.md Update links of the docs 2023-05-03 19:14:57 +02:00
Cross.toml Cross build with action-rs 2021-10-10 02:21:30 +08:00
Dockerfile Revert "Improve docker cache" 2023-05-25 11:48:26 +02:00
download-latest.sh Update links of the docs 2023-05-03 19:14:57 +02:00
LICENSE Update LICENSE 2022-02-15 15:54:45 +01:00
README.md Merge #3855 2023-06-27 12:29:36 +00:00
SECURITY.md docs(security): Fix Supported 2022-05-31 14:21:34 -05:00

Website | Roadmap | Meilisearch Cloud | Blog | Documentation | FAQ | Discord

Dependency status License Bors enabled

A lightning-fast search engine that fits effortlessly into your apps, websites, and workflow 🔍

Meilisearch helps you shape a delightful search experience in a snap, offering features that work out-of-the-box to speed up your workflow.

A bright colored application for finding movies screening near the user A dark colored application for finding movies screening near the user

🔥 Try it! 🔥

Features

  • Search-as-you-type: find search results in less than 50 milliseconds
  • Typo tolerance: get relevant matches even when queries contain typos and misspellings
  • Filtering and faceted search: enhance your user's search experience with custom filters and build a faceted search interface in a few lines of code
  • Sorting: sort results based on price, date, or pretty much anything else your users need
  • Synonym support: configure synonyms to include more relevant content in your search results
  • Geosearch: filter and sort documents based on geographic data
  • Extensive language support: search datasets in any language, with optimized support for Chinese, Japanese, Hebrew, and languages using the Latin alphabet
  • Security management: control which users can access what data with API keys that allow fine-grained permissions handling
  • Multi-Tenancy: personalize search results for any number of application tenants
  • Highly Customizable: customize Meilisearch to your specific needs or use our out-of-the-box and hassle-free presets
  • RESTful API: integrate Meilisearch in your technical stack with our plugins and SDKs
  • Easy to install, deploy, and maintain

📖 Documentation

You can consult Meilisearch's documentation at https://www.meilisearch.com/docs.

🚀 Getting started

For basic instructions on how to set up Meilisearch, add documents to an index, and search for documents, take a look at our Quick Start guide.

You may also want to check out Meilisearch 101 for an introduction to some of Meilisearch's most popular features.

Supercharge your Meilisearch experience

Say goodbye to server deployment and manual updates with Meilisearch Cloud. No credit card required.

🧰 SDKs & integration tools

Install one of our SDKs in your project for seamless integration between Meilisearch and your favorite language or framework!

Take a look at the complete Meilisearch integration list.

Logos belonging to different languages and frameworks supported by Meilisearch, including React, Ruby on Rails, Go, Rust, and PHP

⚙️ Advanced usage

Experienced users will want to keep our API Reference close at hand.

We also offer a wide range of dedicated guides to all Meilisearch features, such as filtering, sorting, geosearch, API keys, and tenant tokens.

Finally, for more in-depth information, refer to our articles explaining fundamental Meilisearch concepts such as documents and indexes.

📊 Telemetry

Meilisearch collects anonymized data from users to help us improve our product. You can deactivate this whenever you want.

To request deletion of collected data, please write to us at privacy@meilisearch.com. Don't forget to include your Instance UID in the message, as this helps us quickly find and delete your data.

If you want to know more about the kind of data we collect and what we use it for, check the telemetry section of our documentation.

📫 Get in touch!

Meilisearch is a search engine created by Meili, a software development company based in France and with team members all over the world. Want to know more about us? Check out our blog!

🗞 Subscribe to our newsletter if you don't want to miss any updates! We promise we won't clutter your mailbox: we only send one edition every two months.

💌 Want to make a suggestion or give feedback? Here are some of the channels where you can reach us:

Thank you for your support!

👩‍💻 Contributing

Meilisearch is, and will always be, open-source! If you want to contribute to the project, please take a look at our contribution guidelines.

📦 Versioning

Meilisearch releases and their associated binaries are available in this GitHub page.

The binaries are versioned following SemVer conventions. To know more, read our versioning policy.

Differently from the binaries, crates in this repository are not currently available on crates.io and do not follow SemVer conventions.