[Borgbackup] No de-duplication on big VM raw-backups/images

Roland @web.de devzero at web.de
Thu Feb 20 07:31:54 EST 2020


On 20.02.20 at 07:33, Heiko Helmle wrote:
> You're right.
>
> Borg's default chunker params and proxmox are not working together at all.
>
> I did some benchmarking when setting up our Proxmox and we settled on 16,23,16,4095.
>
> Results were similar - gaining from almost no dedup to almost full dedup.
>
> The same values work pretty well for mongodump's output too!
>
> Would it make sense to gather experience values with the chunker params somewhere (wiki?), so people could look up some good starting params for their specific workload?
Yes, please. I'd also like some kind of "real-world data params advisor".

also see https://mail.python.org/pipermail/borgbackup/2019q1/001280.html
and https://pve.proxmox.com/pipermail/pve-user/2019-March/170454.html
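
For anyone looking these values up later: as far as I understand the Borg 1.x documentation, the four numbers are CHUNK_MIN_EXP, CHUNK_MAX_EXP, HASH_MASK_BITS and HASH_WINDOW_SIZE, and the first three are exponents of two. Below is a minimal Python sketch that decodes them. The ChunkerParams class and the human() helper are made up for illustration and are not part of Borg; the 19,23,21,4095 default is taken from the Borg 1.x documentation and should be checked against the version in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkerParams:
    """Hypothetical helper: decodes a Borg --chunker-params string."""
    min_exp: int      # CHUNK_MIN_EXP: minimum chunk size is 2**min_exp bytes
    max_exp: int      # CHUNK_MAX_EXP: maximum chunk size is 2**max_exp bytes
    mask_bits: int    # HASH_MASK_BITS: target chunk size is about 2**mask_bits bytes
    window_size: int  # HASH_WINDOW_SIZE: rolling hash window in bytes

    @classmethod
    def parse(cls, text):
        parts = [int(p) for p in text.split(",")]
        if len(parts) != 4:
            raise ValueError("expected four comma-separated integers")
        return cls(*parts)

    @property
    def min_size(self):
        return 1 << self.min_exp

    @property
    def max_size(self):
        return 1 << self.max_exp

    @property
    def target_size(self):
        return 1 << self.mask_bits

    def as_option(self):
        # Builds the command-line option string; it does not run borg.
        return (f"--chunker-params={self.min_exp},{self.max_exp},"
                f"{self.mask_bits},{self.window_size}")


def human(n):
    """Format a power-of-two byte count with binary units."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if n < 1024:
            return f"{n} {unit}"
        n //= 1024
    return f"{n} TiB"


if __name__ == "__main__":
    for label, text in [("Borg 1.x default (assumed)", "19,23,21,4095"),
                        ("Heiko", "16,23,16,4095"),
                        ("Mateusz", "10,22,16,4095")]:
        p = ChunkerParams.parse(text)
        print(f"{label}: min {human(p.min_size)}, "
              f"target {human(p.target_size)}, max {human(p.max_size)}")
```

Both parameter sets from this thread target chunks of about 64 KiB, versus about 2 MiB for the assumed default, so they produce many more, smaller chunks and a correspondingly larger chunk index.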
>
> The current default values seem to be optimized for vmdk and raw VM volume data by the looks of it.
are they?
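
One possible way to see why chunk granularity matters here is a toy experiment. This is only a sketch under loud assumptions: it uses fixed-size chunks instead of Borg's content-defined (rolling hash) chunker, random bytes instead of a real .vma file, and it assumes the differences between two dumps are small and scattered. The function names are hypothetical.

```python
import hashlib
import random


def chunk_ids(data, chunk_size):
    """Split data into fixed-size chunks and return one SHA-256 id per chunk."""
    return [hashlib.sha256(data[i:i + chunk_size]).digest()
            for i in range(0, len(data), chunk_size)]


def dedup_ratio(old, new, chunk_size):
    """Fraction of chunks in `new` that already exist in `old`."""
    known = set(chunk_ids(old, chunk_size))
    ids = chunk_ids(new, chunk_size)
    return sum(1 for c in ids if c in known) / len(ids)


def make_images(size, change_every, seed=0):
    """Return (old, new): random data, and a copy with one byte flipped
    every `change_every` bytes (a stand-in for small scattered changes)."""
    rng = random.Random(seed)
    old = rng.getrandbits(size * 8).to_bytes(size, "little")
    new = bytearray(old)
    for pos in range(0, size, change_every):
        new[pos] ^= 0xFF  # XOR with 0xFF always changes the byte
    return old, bytes(new)


if __name__ == "__main__":
    # 8 MiB toy "image" with one changed byte per MiB.
    old, new = make_images(8 << 20, 1 << 20)
    print(dedup_ratio(old, new, 1 << 21))  # 2 MiB chunks  -> 0.0
    print(dedup_ratio(old, new, 1 << 16))  # 64 KiB chunks -> 0.9375
```

With 2 MiB chunks every chunk contains at least one changed byte, so nothing is reused; with 64 KiB chunks only the 8 chunks that contain a change are new. Whether real vzdump output behaves like this depends on how the .vma format lays out its data, which this sketch does not model.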

regards
roland


>
> Best Regards
>   Heiko
>
> -----Original Message-----
> From: Borgbackup <borgbackup-bounces+heiko.helmle=horiba.com at python.org> On Behalf Of Mateusz Kijowski
> Sent: Wednesday, 19 February 2020 13:24
> To: Stefan Bauer <sb at plzk.de>
> Cc: borgbackup at python.org
> Subject: Re: [Borgbackup] No de-duplication on big VM raw-backups/images
>
> Hi, I have the same use-case,
>
> see this thread https://mail.python.org/pipermail/borgbackup/2017q4/000940.html
>
> TL;DR: it's most likely the chunking parameters that prevent efficient dedupe. The chunker-params I currently use
> are: 10,22,16,4095
>
> Regards,
>
> Mateusz
>
> On Wed, 19 Feb 2020 at 11:07, Stefan Bauer <sb at plzk.de> wrote:
>> Hi,
>>
>>
>> The use case is sending Proxmox backups to a remote site for disaster recovery.
>>
>>
>> Given is a single file in a directory:
>>
>>
>> -rw-r--r-- 1 root root 5.5G Feb 19 10:36 vzdump-qemu-202-2020_02_19-10_02_38.vma
>>
>>
>> It's uncompressed. The first borg backup with --compression zlib,6 sends ~2.7 GB to the remote site. That is good.
>>
>>
>> The next day, we have two files in the directory, with almost no changes between them.
>>
>>
>> -rw-r--r-- 1 root root 5.5G Feb 19 10:36 vzdump-qemu-202-2020_02_19-10_02_38.vma
>> -rw-r--r-- 1 root root 5.5G Feb 20 10:36 vzdump-qemu-202-2020_02_20-10_02_38.vma
>>
>>
>> Now borg _again_ sends the complete ~2.7 GB of the new file to the remote site. Why?
>>
>>
>> I would expect that, _at least_, not everything would be transferred again.
>>
>>
>> Am I doing something wrong?
>>
>>
>> Thank you.
>>
>>
>> Stefan
>>
>> _______________________________________________
>> Borgbackup mailing list
>> Borgbackup at python.org
>> https://mail.python.org/mailman/listinfo/borgbackup

