[Borgbackup] Why does borg delete/prune write a bunch of new data?

John Goerzen jgoerzen at complete.org
Tue Dec 20 18:09:48 EST 2016


I've done some digging into this, and it seems the reason is
compact_segments() in repository.py.

If I'm understanding correctly, it does two things: it deletes
segments that are completely unused, and for segments that hold a mix
of used and unused objects, it writes new segments containing only the
used objects.

The end result is some space savings, at the cost of a lot of I/O.  I
wonder how hard it would be to support deleting unused segments without
bothering to rewrite segments that are partially used?
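
To make the trade-off concrete, here is a toy sketch in Python.  It is
not Borg's code and does not mirror compact_segments(); Segment,
compact() and the rewrite_partial flag are made-up names, and a
"segment" here is just a dict of object id to size.  It only models
the two policies described above: rewriting partially used segments
versus dropping only the fully unused ones.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy stand-in for one segment file: object id -> size in bytes."""
    objects: dict = field(default_factory=dict)


def compact(segments, live_ids, rewrite_partial=True):
    """Hypothetical model of compaction, not Borg's implementation.

    Returns (kept_segments, bytes_written, bytes_freed).
    """
    kept = []
    bytes_written = 0
    bytes_freed = 0
    for seg in segments:
        live = {oid: size for oid, size in seg.objects.items()
                if oid in live_ids}
        dead_bytes = sum(seg.objects.values()) - sum(live.values())
        if not live:
            # Completely unused: just drop it, nothing is written.
            bytes_freed += dead_bytes
        elif dead_bytes == 0 or not rewrite_partial:
            # Fully used, or partially used but we chose not to rewrite:
            # the segment stays as it is (dead space is not reclaimed).
            kept.append(seg)
        else:
            # Partially used: copy the live objects into a new segment.
            kept.append(Segment(live))
            bytes_written += sum(live.values())
            bytes_freed += dead_bytes
    return kept, bytes_written, bytes_freed


segments = [
    Segment({"a": 4, "b": 6}),   # partially used
    Segment({"c": 5}),           # completely unused
    Segment({"d": 3, "e": 7}),   # fully used
]
live_ids = {"a", "d", "e"}

_, written, freed = compact(segments, live_ids, rewrite_partial=True)
print(written, freed)   # 4 11

_, written, freed = compact(segments, live_ids, rewrite_partial=False)
print(written, freed)   # 0 5
```

In this model the delete-only policy writes nothing new (so nothing
extra to rsync), but it leaves the dead object "b" sitting in its
segment, so less space is freed.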

thanks,

John

On 12/20/2016 08:28 AM, John Goerzen wrote:
> Hi folks,
>
> So I'm doing some testing of Borg.  My ultimate aim is to rsync the
> backups to a dumb (WebDAV or S3-type) host.
>
> I made a run of borg over a real subset of my data, about 80GB worth. 
> I then cleaned up and deleted a good chunk of data throughout that
> area, and made another archive with borg create.
>
> So far so good.  Now I ran borg delete to remove the archive with all
> the extra data.  Sure enough, about 2GB freed up on the disk after.
>
> However, watching the process with strace and examining the
> filesystem, I observed it wrote a considerable number of new segments
> to the data directory.  A little analysis with ls and du shows it
> wrote right around 2GB of new segments.  (It also, of course, unlinked
> a considerable number of segments.)
>
> Having to rsync 2GB of new data every time I delete data is going to
> be rather sub-optimal on my poor DSL.  Any ideas why it's doing this? 
> FWIW the index file is only a few tens of MBs.
>
> I'm using encryption and lzma compression.  I also doubled
> max_segment_size from 5MB to 10MB (a lot of experience with obnam
> suggested this would improve performance in the rsync scenario).
>
> Thanks,
>
> John
>
> _______________________________________________
> Borgbackup mailing list
> Borgbackup at python.org
> https://mail.python.org/mailman/listinfo/borgbackup


