[Borgbackup] Fw: pve-zsync + borgbackup "changed size" discrepancy
Thomas Waldmann
tw at waldmann-edv.de
Mon Mar 4 11:06:05 EST 2019
> shouldn't we need some "backup efficiency advisor" which should be able to give a better hint on "how to get the most out of borg" ?
feel free to code one. :)
just be aware that not everybody has the goal of "take as little repo
space as possible".
there are also:
- "do not make my machine run out of memory"
- "be as fast as possible".
- "be able to process a huge amount of data"
> what about a tool which analyses the borg repo and giving a hint for more space efficient usage ?
i guess if you want to determine good chunker params you would need to
backup the source data using different chunker params and compare the
results.
which is a rather expensive operation if you want to try a lot of
different combinations (and maybe your input data is also not small).
also, the chunker is seeded with a random value, so the results might
not be 100% reproducible.
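
a rough sketch of how such a comparison could be scripted, in python. it only builds and prints the borg command lines (one separate test repo per chunker params candidate) and leaves the actual run commented out. the source path, the repo base path and the archive name "run1" are placeholders made up for illustration; "19,23,21,4095" is assumed to be the default of borg 1.1, "10,23,16,4095" is the value quoted in this thread.

```python
import subprocess

def plan_runs(source_dir, repo_base, param_sets):
    """Build one init/create/info command triple per chunker-params candidate.

    Each candidate gets its own repository so that archives created with
    different chunker params do not influence each other's numbers.
    """
    plans = []
    for params in param_sets:
        repo = f"{repo_base}-{params.replace(',', '_')}"
        plans.append([
            ["borg", "init", "--encryption=none", repo],
            ["borg", "create", "--stats", "--chunker-params", params,
             f"{repo}::run1", source_dir],
            ["borg", "info", repo],
        ])
    return plans

def execute(plans):
    """Actually run the planned commands (needs borg installed)."""
    for plan in plans:
        for cmd in plan:
            subprocess.run(cmd, check=True)

# hypothetical paths, adjust before use
candidates = ["19,23,21,4095", "10,23,16,4095"]
plans = plan_runs("/path/to/source", "/tmp/borg-test", candidates)
for plan in plans:
    for cmd in plan:
        print(" ".join(cmd))
# execute(plans)  # uncomment to really run the (expensive) backups
```

to see the effect of small changes, one would create a second archive in each test repo after the source data changed and compare the deduplicated size that --stats reports for it.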
> at least i would have found it useful to have such information at hands (in the manpage for example) what tuning knob we need to look at to make backups of large-files-with-tiny-changes much more space efficient...
if you can improve the docs, do a pull request.
> mind that even with "--chunker-params 10,23,16,4095" borg backup diff grows up to 7.98MB where xdelta3 reports a diff of only 158218 . so this is still not optimal, but a value i can live with....
compared to other tools, borg's goal is not finding the minimal diff
between two files, but rather:
- speed
- logically stable chunk cutting points (good if data gets inserted /
removed)
- produce a manageable amount of chunks (with default chunker params)
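
to illustrate what "stable cutting points" buys you, here is a toy content-defined chunker in python. this is NOT borg's chunker (borg uses a rolling buzhash), just a minimal sketch assuming a CRC32 over a small sliding window as the cut criterion. all names and sizes are made up for illustration. it inserts one byte into random data and compares how many chunks can be reused with content-defined cutting versus fixed-size cutting.

```python
import random
import zlib

def cdc_chunks(data, window=16, mask_bits=8, min_size=32):
    """Toy content-defined chunker: cut at position i when the CRC32 of the
    preceding `window` bytes has its low `mask_bits` bits all zero."""
    mask = (1 << mask_bits) - 1
    chunks, start = [], 0
    for i in range(window, len(data) + 1):
        if i - start < min_size:
            continue  # enforce a minimum chunk size
        if zlib.crc32(data[i - window:i]) & mask == 0:
            chunks.append(data[start:i])
            start = i
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks

def fixed_chunks(data, size=256):
    """Cut at fixed offsets, ignoring content."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def shared(old, new):
    """Fraction of new chunks that already exist among the old chunks."""
    old_set = set(old)
    return sum(c in old_set for c in new) / len(new)

rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(20000))
modified = original[:5000] + b"X" + original[5000:]  # insert a single byte

cdc_ratio = shared(cdc_chunks(original), cdc_chunks(modified))
fixed_ratio = shared(fixed_chunks(original), fixed_chunks(modified))
print(f"content-defined: {cdc_ratio:.2f} of chunks reused")
print(f"fixed-size:      {fixed_ratio:.2f} of chunks reused")
```

with fixed-size cutting, everything after the insertion point shifts by one byte and no later chunk matches any more. with content-defined cutting, the cut points move along with the data, so only the chunks around the insertion change.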
borg tries to cut chunks of roughly some given target size (default 2MB).
so if your file is 100MB, that is ~ 50 chunks.
you can totally spoil the dedup by changing 1 byte in each of the 50 chunks.
if you produce 500 chunks and have the same 50 changes, your dedup still
works in 90% of chunks. but you'll need 10x as much memory to manage the
chunks.
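
the arithmetic above, as a small python sketch. it is a simplification: it assumes fixed-size chunks, decimal megabytes, and in-place byte changes (no insertions), with one change landing in every 2MB region. the function name is made up for illustration.

```python
def dedup_estimate(file_size, target_chunk_size, changed_offsets):
    """Return (total_chunks, changed_chunks, unchanged_fraction), assuming
    fixed-size chunks and in-place byte changes at the given offsets."""
    total = -(-file_size // target_chunk_size)  # ceiling division
    changed = len({off // target_chunk_size for off in changed_offsets})
    return total, changed, (total - changed) / total

MB = 1000 * 1000
file_size = 100 * MB
# worst case for 2MB chunks: one changed byte in every 2MB region
changes = [i * 2 * MB + 1 for i in range(50)]

coarse = dedup_estimate(file_size, 2 * MB, changes)
fine = dedup_estimate(file_size, 200 * 1000, changes)
print(coarse)  # (50, 50, 0.0)
print(fine)    # (500, 50, 0.9)
```

the second result has 10x as many chunks to keep track of, which is where the extra memory goes.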
a specialised tool (i don't know xdelta3, but i assume this might be
one) can produce a very small diff by comparing the 2 files - but that
is not how borg works.
--
GPG ID: 9F88FB52FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393