2057: fix(dump): Uncompress the dump IN the data.ms  r=irevoire a=irevoire

When loading a dump with docker, we had two problems.
After creating a temporary directory, uncompressing and re-indexing the dump:
1. We tried to `move` the new `data.ms` onto the one currently present.
   The problem is that the `data.ms` is often a mount point, because
   that's how people usually run it with docker. We can't overwrite
   a mount point, and thus we were throwing an error.
2. The tempdir was created in `/tmp`, which is usually quite small AND may not
   be on the same partition as the `data.ms`. This means that when we tried to move
   the dump over the `data.ms`, it also failed because a rename can't move data
   between two partitions.
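
To make the second problem concrete, here is a minimal sketch of how one could check whether two paths sit on the same device before attempting a rename. It is Unix-only and uses only the standard library; `same_filesystem` is a hypothetical helper written for this note, not something that exists in the codebase.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Hypothetical helper: returns true when both paths report the same
/// device id, i.e. when a plain rename between them can stay on one
/// filesystem. Unix-only because it relies on `MetadataExt::dev`.
fn same_filesystem(a: &Path, b: &Path) -> io::Result<bool> {
    Ok(fs::metadata(a)?.dev() == fs::metadata(b)?.dev())
}

fn main() -> io::Result<()> {
    // Assumed layout for the demo: a scratch directory with a nested `data.ms`.
    let base = std::env::temp_dir().join(format!("same_fs_demo_{}", std::process::id()));
    let nested = base.join("data.ms");
    fs::create_dir_all(&nested)?;

    // A directory and a child created inside it normally share a device,
    // unless the child is itself a mount point.
    println!("same filesystem: {}", same_filesystem(&base, &nested)?);

    fs::remove_dir_all(&base)?;
    Ok(())
}
```

With `/tmp` and a mounted `data.ms` as the two arguments, such a check would typically return `false` in the docker setup described above.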
------------------
1 was fixed by deleting the *content* of the `data.ms` and moving the *content*
of the tempdir *inside* the `data.ms`. If someone creates volumes inside
the `data.ms`, that's their problem, not ours.
2 was fixed by creating the tempdir *inside* the `data.ms`. If a user mounted
their `data.ms` on a large partition, there is no reason they should be unable to
load a big dump because their `/tmp` is too small. This solves the issue; now the
dump is extracted and indexed on the same partition where the `data.ms` will live.

fix #1833

Co-authored-by: Tamo <tamo@meilisearch.com>
bors[bot] 2022-01-10 17:57:16 +00:00 committed by GitHub
commit 1818026a84
1 changed files with 24 additions and 4 deletions

@@ -185,7 +185,7 @@ pub fn load_dump(
     let mut meta_file = File::open(&meta_path)?;
     let meta: MetadataVersion = serde_json::from_reader(&mut meta_file)?;
-    let tmp_dst = tempfile::tempdir()?;
+    let tmp_dst = tempfile::tempdir_in(dst_path.as_ref())?;
     info!(
         "Loading dump {}, dump database version: {}, dump version: {}",
@@ -225,14 +225,34 @@ pub fn load_dump(
             indexer_opts,
         )?,
     }
-    // Persist and atomically rename the db
     let persisted_dump = tmp_dst.into_path();
+    // Delete everything in the `data.ms` except the tempdir.
     if dst_path.as_ref().exists() {
         warn!("Overwriting database at {}", dst_path.as_ref().display());
-        std::fs::remove_dir_all(&dst_path)?;
+        for file in dst_path.as_ref().read_dir().unwrap() {
+            let file = file.unwrap().path();
+            if file.file_name() == persisted_dump.file_name() {
+                continue;
+            }
+            if file.is_file() {
+                std::fs::remove_file(&file)?;
+            } else {
+                std::fs::remove_dir_all(&file)?;
+            }
+        }
     }
-    std::fs::rename(&persisted_dump, &dst_path)?;
+    // Move the whole content of the tempdir into the `data.ms`.
+    for file in persisted_dump.read_dir().unwrap() {
+        let file = file.unwrap().path();
+        std::fs::rename(&file, &dst_path.as_ref().join(file.file_name().unwrap()))?;
+    }
+    // Delete the empty tempdir.
+    std::fs::remove_dir_all(&persisted_dump)?;
     Ok(())
 }
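
The clear-then-move sequence in the change above can be tried in isolation. The following is a minimal sketch using only the standard library, so it creates the staging directory by hand instead of going through `tempfile::tempdir_in`. The function name `swap_in_staging` and the `.staging` directory name are assumptions made for this example and do not appear in the commit; unlike the committed code, the sketch propagates errors with `?` and collects the staged entries before renaming them.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: replaces the content of `dst` with the content of
/// `staging`, where `staging` is a directory located directly inside `dst`.
/// `dst` itself is never removed or renamed, so it may be a mount point.
fn swap_in_staging(dst: &Path, staging: &Path) -> io::Result<()> {
    // Delete everything in `dst` except the staging directory.
    for entry in fs::read_dir(dst)? {
        let entry = entry?;
        if Some(entry.file_name().as_os_str()) == staging.file_name() {
            continue;
        }
        if entry.file_type()?.is_dir() {
            fs::remove_dir_all(entry.path())?;
        } else {
            fs::remove_file(entry.path())?;
        }
    }

    // Collect the staged entries first so the directory is not modified
    // while it is being iterated, then move each one up into `dst`.
    let staged: Vec<fs::DirEntry> = fs::read_dir(staging)?.collect::<io::Result<_>>()?;
    for entry in staged {
        fs::rename(entry.path(), dst.join(entry.file_name()))?;
    }

    // Remove the now-empty staging directory.
    fs::remove_dir(staging)
}

fn main() -> io::Result<()> {
    // Assumed demo layout: an old database file next to a staged new one.
    let dst = std::env::temp_dir().join(format!("data_ms_demo_{}", std::process::id()));
    let staging = dst.join(".staging");
    fs::create_dir_all(&staging)?;
    fs::write(dst.join("old.mdb"), b"stale")?;
    fs::write(staging.join("new.mdb"), b"fresh")?;

    swap_in_staging(&dst, &staging)?;

    println!("old.mdb exists: {}", dst.join("old.mdb").exists());
    println!("new.mdb exists: {}", dst.join("new.mdb").exists());

    fs::remove_dir_all(&dst)
}
```

Because every rename stays inside `dst`, source and target are on the same partition by construction, which is the property the commit message relies on.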