3687: Allow to disable specialized tokenizations (again) r=Kerollmops a=jirutka

In PR #2773, I added the `chinese`, `hebrew`, `japanese`, and `thai` feature flags to allow Meilisearch to be built without the huge specialized tokenizations that took up 90% of the Meilisearch binary size. Unfortunately, due to some recent changes, this no longer works. The problem lies in the excessive use of the `default` feature flag, which infects the dependency graph.
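To make the failure mode concrete, here is a minimal sketch (the downstream manifest is hypothetical): because of Cargo's feature unification, a single dependent that doesn't pass `default-features = false` re-enables `milli`'s `default` feature for the entire build.

```toml
# Hypothetical manifest of a crate that depends on milli.
[dependencies]
# Implicitly enables milli's `default` feature; Cargo's feature unification
# then turns it on for every other crate in the graph as well.
milli = { path = "../milli" }

# Every single dependent would need this instead to keep the flag off:
# milli = { path = "../milli", default-features = false }
```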

Instead of adding `default-features = false` here and there, it's easier and more future-proof not to declare `default` in `milli` and `meilisearch-types` at all. I've renamed it to `all-tokenizations`, which also makes it a bit clearer what it's about.
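A sketch of what the resulting feature setup might look like in `milli`'s Cargo.toml (the charabia version and the exact forwarding are assumptions, not copied from this diff): since no `default` feature is declared, there is nothing for dependents to enable by accident.

```toml
[dependencies]
# The tokenizer crate is pulled in without its default all-language support.
charabia = { version = "0.7", default-features = false }

[features]
# Opt back in to everything, e.g. for local development and CI.
all-tokenizations = ["charabia/default"]
# Or pick only the specialized tokenizations a build actually needs.
chinese = ["charabia/chinese"]
hebrew = ["charabia/hebrew"]
japanese = ["charabia/japanese"]
thai = ["charabia/thai"]
```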


Co-authored-by: Jakub Jirutka <jakub@jirutka.cz>
commit 9f7981df28
meili-bors[bot] 2023-05-04 14:48:01 +00:00, committed by GitHub
7 changed files with 12 additions and 10 deletions


```diff
@@ -201,12 +201,14 @@ pub fn build_dfa(word: &str, typos: u8, is_prefix: bool) -> DFA {
 #[cfg(test)]
 mod test {
+    #[allow(unused_imports)]
     use super::*;
-    use crate::index::tests::TempIndex;
-    #[cfg(feature = "default")]
+    #[cfg(feature = "japanese")]
     #[test]
     fn test_kanji_language_detection() {
+        use crate::index::tests::TempIndex;
         let index = TempIndex::new();
         index
```
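Note that the `TempIndex` import moves from the module level into the gated test itself: with the test compiled out in builds lacking the `japanese` feature, a module-level `use` would otherwise trigger an unused-import warning, which is also why `use super::*;` gains `#[allow(unused_imports)]`.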


```diff
@@ -4,7 +4,7 @@ pub mod distinct;
 pub mod exactness;
 pub mod geo_sort;
 pub mod integration;
-#[cfg(feature = "default")]
+#[cfg(feature = "all-tokenizations")]
 pub mod language;
 pub mod ngram_split_words;
 pub mod proximity;
```


```diff
@@ -1581,7 +1581,7 @@ mod tests {
         assert_eq!(count, 4);
     }
-    #[cfg(feature = "default")]
+    #[cfg(feature = "chinese")]
     #[test]
     fn test_meilisearch_1714() {
         let index = TempIndex::new();
```
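With no `default` feature declared anywhere in the chain, a slim build should come down to something like `cargo build --release --no-default-features`, optionally adding `--features japanese,thai` for the tokenizations a deployment actually needs; the feature-gated tests above simply compile out when their flag is absent.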