Commit Graph

9990 Commits

Author SHA1 Message Date
Tamo
a081da0d90 add support for the json format in the stream route 2024-02-14 15:34:39 +01:00
ManyTheFish
78e04520fc Update charabia version 2024-02-14 15:16:16 +01:00
meili-bors[bot]
72c1674a31
Merge #4350
4350: Make several indexing optimizations r=Kerollmops a=ManyTheFish

# Summary

Implement several enhancements to reduce the indexing time.

# Steps

- Compute the indexing chunk size dynamically based on the available threads and the data size
- Remove the merging step before the writing step and merge at the writing time
- Remove append function
- Make Facet search indexing incremental

# Running Indexing process

## `main`
Each type of data is written after a merging phase:
![Capture d’écran 2024-01-23 à 10 18 08](https://github.com/meilisearch/meilisearch/assets/6482087/6203c3ce-407c-46b4-8b83-04282da1bb16)

> Highlighted parts are the writings

## `remove-merging-phase-from-indexing`
When the extraction of a chunk is finished, the data is written:
![Capture d’écran 2024-01-23 à 10 18 18](https://github.com/meilisearch/meilisearch/assets/6482087/ab1307b4-d0a9-42ac-abbb-fdeb27ddf0d4)

> Highlighted parts are the writings

## Related

This PR removes the appending writes on several indexing parts, which may fix https://github.com/meilisearch/meilisearch/issues/4300. However, all of the appending writes are not removed. There are 2 remaining calls that could trigger this bug:
- When [putting embedders in the settings](b6fc181993/milli/src/update/settings.rs (L996))
- when [bulk indexing the facets](b6fc181993/milli/src/update/facet/bulk.rs (L150))


Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2024-02-14 14:12:48 +00:00
ManyTheFish
03bb6372af Change is_batchable_with by mergeable_with 2024-02-14 11:50:22 +01:00
ManyTheFish
3beda8833d Fix and add logs 2024-02-14 11:46:30 +01:00
Tamo
3b6544db6d Implement the experimental log mode cli flag 2024-02-13 18:09:15 +01:00
ManyTheFish
55e942cd45 buggy 2024-02-13 15:26:30 +01:00
ManyTheFish
48026aa75c fix PR comments 2024-02-13 15:19:01 +01:00
Many the fish
e5e811e2c9
Update milli/src/update/index_documents/extract/mod.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-13 14:22:21 +01:00
Many the fish
55de96f74e
Update milli/src/update/facet/mod.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-13 14:22:10 +01:00
meili-bors[bot]
82b43e9a7f
Merge #4400
4400: Upgrade rustls to 0.21.10 and ring to 0.17 r=curquiza a=hack3ric

# Pull Request

## What does this PR do?
- Upgrade dependencies that uses ring 0.16 so that they rely on ring 0.17 instead
- Use rustls 0.21 for actix-{http,tls}, since newer versions of rustls uses ring 0.17
- Fix some trivial breaking API changes caused by above

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Eric Long <i@hack3r.moe>
2024-02-12 13:17:40 +00:00
meili-bors[bot]
15dafde21d
Merge #4401
4401: Update version for the next release (v1.7.0) in Cargo.toml r=irevoire a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: irevoire <irevoire@users.noreply.github.com>
2024-02-12 10:17:10 +00:00
irevoire
290f6d15e7 Update version for the next release (v1.7.0) in Cargo.toml 2024-02-12 10:15:00 +00:00
ManyTheFish
39c83cb3d9 fix clippy 2024-02-12 09:12:54 +01:00
Louis Dureuil
7efb1cae11 yield in loop when the channel is not disconnected 2024-02-12 09:12:54 +01:00
Louis Dureuil
7877788510 fix logs 2024-02-12 09:12:54 +01:00
Eric Long
c02d585f5b
Upgrade rustls to 0.21.10 and ring to 0.17 2024-02-12 14:32:29 +08:00
ManyTheFish
be1b054b05
Compute chunk size based on the input data size ant the number of indexing threads 2024-02-08 17:28:37 +01:00
meili-bors[bot]
023c2d755f
Merge #4391
4391: Tracing r=dureuill a=irevoire

# Pull Request

- [ ] Hide the parameters of the process batch
- [x] Make actix-web trace every call on every route
- [x] Remove all `env_logger`/`logs` dependencies
- [x] Be able to enable or disable the memory measurement using the `/logs` route parameters

See the following product discussion: https://github.com/orgs/meilisearch/discussions/721

Supersedes https://github.com/meilisearch/meilisearch/pull/4338

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4317

## What does this PR do?

Update the format of the logs from:
```
[2024-02-06T14:54:11Z INFO  actix_server::builder] starting 10 workers
```

to

```
2024-02-06T13:58:14.710803Z  INFO actix_server::builder: 200: starting 10 workers
```

First, run meilisearch with the route enabled via the feature flag:
- `cargo run --experimental-enable-logs-route`
- Or at runtime by sending the following payload:
```
curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json'  \
--data-binary '{
    "logsRoute": true
  }'
```

Then gather data from meilisearch by calling for example:
```
curl \
	-X POST http://localhost:7700/logs \
	-H 'Content-Type: application/json' \
	--data-binary '{
	    "mode": "fmt",
            "target": "milli=trace"
    }'
```

Once your operation is over, tell meilisearch to stop the route:
```
curl \
	-X DELETE http://localhost:7700/logs
```

----

In the case you’re profiling code, you will be interested by the next command that converts the output of the route to a format that the firefox profiler can understand.

```bash
cargo run --release --bin trace-to-firefox -- 2024-01-17_17:07:55-indexing-trace.json
```

Then go to https://profiler.firefox.com and load it.
Note that we can also share the profiles using the https://share.firefox.dev website.


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-08 14:16:56 +00:00
Louis Dureuil
407ad753ed
rust fmt 2024-02-08 15:11:42 +01:00
Tamo
285aa15d2f
make the mode camelCase instead of lowercase 2024-02-08 15:04:06 +01:00
Tamo
bf43a3f60a
fix typo 2024-02-08 15:04:06 +01:00
Tamo
2c88131bb1
rename the fmt mode to human 2024-02-08 15:04:06 +01:00
Tamo
35aa9d5904
fix an error message 2024-02-08 15:04:06 +01:00
Tamo
cfb3e6b51f
update the actix-web trace 2024-02-08 15:04:06 +01:00
Tamo
1502382316
use debug instead of debug_span 2024-02-08 15:04:06 +01:00
Louis Dureuil
ef994d84d0
Change error messages and fix tests 2024-02-08 15:04:06 +01:00
Louis Dureuil
1b74010e9e
Remove "with_line_numbers" 2024-02-08 15:04:06 +01:00
Tamo
08af0e690c
Structures a bunch of logs 2024-02-08 15:04:06 +01:00
Louis Dureuil
d71b77f18b
Add panic hook to log panics 2024-02-08 15:04:06 +01:00
Louis Dureuil
c443ed7e3f
delete inner .gitignore 2024-02-08 15:04:06 +01:00
Louis Dureuil
db722d201a
Write entries into database downgraded to trace level 2024-02-08 15:04:05 +01:00
Louis Dureuil
91eb67e981
logs route: make memory profiling toggling usable 2024-02-08 15:04:05 +01:00
Louis Dureuil
902d700a24
Tracing trace: toggle the profiling of memory at runtime 2024-02-08 15:04:05 +01:00
Tamo
f70a615ed9
update the github discussion links 2024-02-08 15:04:05 +01:00
Tamo
7ff722b72e
get rids of the log dependencies everywhere 2024-02-08 15:04:05 +01:00
Tamo
bcf7909bba
add a profile_memory parameter disabled by default 2024-02-08 15:04:05 +01:00
Tamo
ceb211c515
move the /logs route to the /logs/stream route 2024-02-08 15:04:05 +01:00
Clément Renault
f3c34d5b8c
Simplify MemoryStats fetching 2024-02-08 15:04:05 +01:00
Tamo
4de2db6786
add back the actix-web logs 2024-02-08 15:04:05 +01:00
Louis Dureuil
661baa716b
logs route profile mode: don't barf bytes if the buffer is not empty 2024-02-08 15:04:05 +01:00
Clément Renault
02dcaf07db
Replace the procfs by libproc 2024-02-08 15:04:05 +01:00
Louis Dureuil
d78ada07b5
spanstats: change field names 2024-02-08 15:04:05 +01:00
Louis Dureuil
bc097d90cb
tracing-trace: Spanstats deserializable + public fields 2024-02-08 15:04:05 +01:00
Clément Renault
b393823f36
Replace stats_alloc with procfs 2024-02-08 15:04:05 +01:00
Tamo
e773dfa9ba
get rids of log in milli and add logs for the bucket sort 2024-02-08 15:04:05 +01:00
Tamo
f158e96fe7
fix the auth 2024-02-08 15:04:05 +01:00
Tamo
e23ec4886d
fix the tests and add tests on the experimental features 2024-02-08 15:04:03 +01:00
Tamo
7793ba67a4
hide the route logs behind a feature flag 2024-02-08 15:03:33 +01:00
Tamo
80774148fd
handle and tests errors 2024-02-08 15:03:33 +01:00