[Borgbackup] faster / better deletion, for a bounty?

Thomas Waldmann tw at waldmann-edv.de
Wed Dec 21 08:31:06 EST 2016


Hi Mario,

> But now the disk is ~98% full

Don't let it fill up completely: borg needs free space, even for delete.

> (1) My archive is now 3.4 TB (reported with 'du'), but borg list says
>     the deduplicated archive size is 1.82 TB. Why are the two numbers
>     off by 50%? Below the full output of my borg list.

Did you activate append-only mode for the repo?

While append-only is set, borg prune/delete will not be able to really
remove data.
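
A quick way to check this on the repo machine, as a minimal sketch. It
assumes the repository's "config" file is INI-style with a [repository]
section and an append_only key; verify that against your own repo. The
helper name and the sample text are made up for illustration.

```python
import configparser

# Hypothetical sample of a repo "config" file; only the keys needed here.
SAMPLE_CONFIG = """\
[repository]
version = 1
append_only = 1
"""

def is_append_only(config_text):
    """Return True if the config text sets append_only to a non-zero value."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Assumption: a missing section or key means append-only is off.
    return parser.getint("repository", "append_only", fallback=0) != 0

print(is_append_only(SAMPLE_CONFIG))
```

To use it on a real repo, pass in the contents of the config file read
from the repository directory.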

> (2) In the last months, my backup size went up quite a lot, even though
>     I did not change anything in borg. So I'd like to reverse engineer
>     which archives (or which files) contribute to the sudden increase in
>     size. I tried "borg list" on all archives, but only 7 have ~3 GB of
>     deduplicated space, and all others have less than 1 GB of dedup space!
>     I assumed 533 archives of ~1 GB dedup size = 533 GB total,

No, that sum only covers the space used by exactly one archive each.

As soon as the same chunks are used by more than 1 archive, they do not
show up as "unique chunks" any more.
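
A toy model of why the per-archive numbers do not add up to the repo
size. This is only an illustrative sketch, not borg's actual code: the
archive names, chunk ids and sizes are invented, and each archive is
modelled simply as a set of chunk ids.

```python
from collections import Counter

# Invented example data: three archives sharing two large chunks.
ARCHIVES = {
    "mon": {"a", "b"},
    "tue": {"a", "b", "c"},
    "wed": {"a", "b", "d"},
}
CHUNK_SIZE = {"a": 100, "b": 100, "c": 1, "d": 1}

def unique_sizes(archives, chunk_size):
    """Per archive: total size of chunks referenced by that archive only."""
    refs = Counter(c for chunks in archives.values() for c in set(chunks))
    return {
        name: sum(chunk_size[c] for c in set(chunks) if refs[c] == 1)
        for name, chunks in archives.items()
    }

def repo_size(archives, chunk_size):
    """Size of all distinct chunks, each stored once."""
    all_chunks = set().union(*archives.values())
    return sum(chunk_size[c] for c in all_chunks)

per_archive = unique_sizes(ARCHIVES, CHUNK_SIZE)
print(per_archive, sum(per_archive.values()), repo_size(ARCHIVES, CHUNK_SIZE))
```

In this model the shared chunks "a" and "b" make up almost all of the
repo, yet they count towards no archive's unique size.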

>     How would I find the archives that free most space when deleted?

For a single archive deletion, that is the unique chunks space
("deduplicated size") of that archive.

For deleting multiple archives, there is no easy way to see this beforehand.
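
Why the multi-archive case is harder can be seen in a small sketch. It
is a toy model with invented names and sizes, not something borg
offers: a chunk is freed only when every archive referencing it is in
the set being deleted.

```python
# Invented example data: archives "A" and "B" share one big chunk "s".
ARCHIVE_CHUNKS = {
    "A": {"x", "s"},
    "B": {"y", "s"},
    "C": {"z"},
}
SIZES = {"x": 1, "y": 1, "s": 10, "z": 5}

def freed_by_deleting(to_delete, archives, sizes):
    """Total size of chunks that no surviving archive still references."""
    to_delete = set(to_delete)
    kept = set()
    for name, chunks in archives.items():
        if name not in to_delete:
            kept |= set(chunks)
    doomed = set()
    for name in to_delete:
        doomed |= set(archives[name])
    return sum(sizes[c] for c in doomed - kept)

print(freed_by_deleting(["A"], ARCHIVE_CHUNKS, SIZES))
print(freed_by_deleting(["B"], ARCHIVE_CHUNKS, SIZES))
print(freed_by_deleting(["A", "B"], ARCHIVE_CHUNKS, SIZES))
```

Deleting "A" and "B" together frees more than the two single deletions
added up, because the shared chunk "s" only goes away once both are gone.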

> (3) borg delete was incredibly slow for me. I killed it after two hours,
>     and it had read 500GB of the archive by then (reported with iotop).
>     I understood from IRC discussion that both prune and delete would
>     require reading the full 3.4 TB once per run, to sanitize some index?

No, they usually do not need to read all your data.

In the worst case they might, though.

>     Are there tricks or workarounds, for example when
>     deleting only from localhost?

If you use borg with encryption (default), you'd need to use the
encryption key on the repo machine. Whether you want to do that depends
on how much you trust that machine.

>     I'd like to offer a bounty of ~€20-€25 for a better solution, or a
>     generally much faster delete and/or much faster prune.

Some improvements will come with borg 1.1 (which is currently still in
beta, so be very careful).

> PS: My preferred deletion pattern would keep an increasing number of
>     archives over time, like monthly backups from the past 10 years,
>     weekly from the past year, and  daily from past month.

That's how borg prune works.
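
A sketch of what such a policy could look like as a prune call. The
counts (30 daily, 52 weekly, 120 monthly) are only a rough reading of
"past month / past year / past 10 years", the repo path is a
placeholder, and the helper function is hypothetical. It only builds
and prints the command; nothing is executed.

```python
def build_prune_command(repo, daily, weekly, monthly, dry_run=True):
    """Build a borg prune argument list; does not run anything."""
    cmd = ["borg", "prune", "--list"]
    if dry_run:
        # With --dry-run, prune only reports what it would do.
        cmd.append("--dry-run")
    cmd += [
        f"--keep-daily={daily}",
        f"--keep-weekly={weekly}",
        f"--keep-monthly={monthly}",
        repo,
    ]
    return cmd

# Placeholder repo path; adjust the counts to your own retention needs.
command = build_prune_command("/path/to/repo", daily=30, weekly=52, monthly=120)
print(" ".join(command))
```

Keeping the dry run on for a first pass lets you review the list of
archives that would be kept or pruned before anything is removed.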

>     I can build
>     this list of deletions with bash easily! But borg delete or prune
>     are currently *way* to slow to be used this way :-(

borg prune (when it deletes more than 1 archive per run) is faster than
borg delete. It uses delete internally, but doing multiple deletes at
once is a bit more efficient.

Cheers,

Thomas

-- 

GPG ID: 9F88FB52FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393