[Borgbackup] Deduplication of tar files - doesn't seem to be giving good performance
Marcus Schopen
lists at localguru.de
Thu Apr 21 05:50:29 EDT 2016
Hi,
On Thursday, 21.04.2016 at 05:45 +0000, William Gogan wrote:
> I'm trying borgbackup out, and so far it's performing really well in
> almost all tests.
>
>
> The one item where I'm seeing odd performance is for tar files. It
> appears not to be deduplicating except within the current archive.
>
>
> Background: Our VM tool kicks out a .tar file per container. It
> compresses (lzo) the .tar. For discussion purposes, let's pretend it's
> called vm.tar.lzo
>
>
> So, I call `lzop vm.tar.lzo -d --to-stdout | borg create --verbose
> --stats --progress --chunker-params 19,23,21,4095 --compression
> lz4 /dir/borg/::2016-04-21-01-38 -` - I assumed lzo would wreck borg's
> dedupe, so I pipe in the decompressed version.
>
>
> Even if I generate a .tar file, then immediately generate a second one
> (within <30s of the first), and then feed them both to borgbackup, it
> shows about 80% of the blocks as non-duplicates despite 99% of the
> files not having changed on the disk (and so should not have changed
> in the .tar)
>
>
> I looked at the FAQ, and it does make specific mention of doing well
> at VM backups, so I'm wondering if I'm doing something wrong.
>
>
> What can I do to get better dedupe performance? I considered adding
> tar to the mix and untarring the file before piping it to borg, but
> that seems suboptimal.
>
>
> If anyone has any suggestions, I'd welcome them!
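[The expectation in the quoted message can be sanity-checked: if the files on disk have not changed, a regenerated tar of them should reproduce the same bytes. A toy illustration with Python's tarfile module (file names and contents here are made up; real tar tools also embed live mtimes/owners, which would perturb only the 512-byte header blocks, not the file data):]

```python
# Illustrative only: shows a tar archive is byte-reproducible when the
# member data and metadata are held fixed, so an unchanged .tar stream
# should deduplicate well.
import io
import hashlib
import tarfile

def make_tar(files):
    """Build an uncompressed tar in memory from (name, data) pairs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name, data in files:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = 0          # fixed mtime: header bytes identical
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()

files = [("etc/hosts", b"127.0.0.1 localhost\n"),
         ("var/log/app.log", b"x" * 4096)]
first = make_tar(files)
second = make_tar(files)
print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())
# prints True: identical inputs yield an identical archive
```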
I have a similar deduplication problem with partclone images I'd like to
back up. Any ideas for another dumper (instead of raw dd)?
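[For both the tar and the partclone/dd cases, borg's dedup relies on
content-defined chunking re-synchronising after a change, so a small edit
should only invalidate a chunk or two. A toy demonstration of that property
(this is NOT borg's actual buzhash chunker; the rolling hash, mask, and
window here are made up for illustration):]

```python
# Toy content-defined chunker: cut wherever a rolling hash of the last
# ~32 bytes matches a bit mask, with a minimum chunk size. Two streams
# that share most of their bytes then share most of their chunk hashes.
import hashlib
import random

def chunks(data, mask=0x3FF, min_size=16):
    out, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) ^ b) & 0xFFFFFFFF   # shift-xor rolling hash
        if i - start >= min_size and (h & mask) == mask:
            out.append(data[start:i + 1])  # boundary depends on content only
            start, h = i + 1, 0
    if start < len(data):
        out.append(data[start:])
    return out

random.seed(0)
base = bytes(random.getrandbits(8) for _ in range(100_000))
edited = base[:5000] + b"CHANGED" + base[5000:]   # 7-byte insertion

a = {hashlib.sha256(c).digest() for c in chunks(base)}
b = {hashlib.sha256(c).digest() for c in chunks(edited)}
shared = len(a & b) / len(a)
print(f"{shared:.0%} of chunk hashes unchanged after a 7-byte insertion")
```

[With content-defined boundaries the chunks after the insertion realign,
so nearly all hashes survive; that is why ~80% unique chunks for two
almost-identical tar streams is surprising.]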
Ciao
Marcus
More information about the Borgbackup mailing list