[Borgbackup] extended prune (was: faster / better deletion, for a bounty?)

Mario Emmenlauer mario at emmenlauer.de
Fri Dec 23 18:21:00 EST 2016


Hi,

wow, that was a quick reply! :-) Thanks! :-)

On 23.12.2016 23:59, John Goerzen wrote:
> Just a couple quick comments:
> 
> 1) Would those prune patches better be a 'borg delete' patch allowing
> specification of multiple archives to zap?

I'm fully open to that. My logic was that prune and delete are
separated by the fact that prune performs multiple deletions in
one go, and so my patch would fit prune more than delete. But
I'm really new to borg, and any advice is happily accepted!
What do others think? Is my patch worthwhile at all?


> 2) You might try a hostname-monthly- or hostname-weekly- pattern in your
> archive naming to let you achieve what you want with prune.

Yes, that's true as long as I keep a similar pattern. But I'm very
fond of the new option because for my "huge" prune I was able to pick
a crude mix of hand-picked archives together with different patterns
for different times of different hosts. With some bash-fu that was
less than ten minutes of work, and I could pass them all to prune in
one go (hopefully making the best use of disk I/O). In fact it would
be trivial to pick any other wild selection like the X largest archives
or whatever, by combining borg's statistics with the new --remove-list
option.
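
A minimal sketch of that kind of selection, assuming the archive sizes
have already been collected into a mapping (for example from borg's
per-archive statistics). The archive names, the sizes and both helper
functions are hypothetical, and how the resulting names are handed to
--remove-list depends on the patch, so that step is left out here:

```python
from typing import Dict, List


def largest_archives(sizes: Dict[str, int], count: int) -> List[str]:
    """Return the names of the `count` largest archives, biggest first."""
    # Sort by size descending; break ties by name so the result is stable.
    ranked = sorted(sizes.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:count]]


def build_remove_list(*selections: List[str]) -> List[str]:
    """Merge several selections into one de-duplicated list, keeping order."""
    seen = set()
    merged = []
    for selection in selections:
        for name in selection:
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return merged


# Hypothetical archive names and sizes in bytes (illustrative data only).
example_sizes = {
    "hostA-2016-01-03": 120_000_000_000,
    "hostA-2016-01-10": 4_000_000_000,
    "hostB-2016-01-03": 310_000_000_000,
    "hostB-2016-01-10": 2_500_000_000,
}
hand_picked = ["hostA-2016-01-10"]
remove_list = build_remove_list(hand_picked, largest_archives(example_sizes, 2))
print("\n".join(remove_list))
```

Merging the selections first means every archive name appears only
once, no matter how many patterns happened to match it.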

But really this is just me, and admittedly I did not invest too
much time to try to understand prune's current behaviour :-)

Oh, and on a related note: Thomas mentioned on IRC the idea that borg
could use garbage collection instead of immediate deletions. I very
much like this idea because deletions could be instant, and disk
space could be freed with "borg gc" whenever suitable (e.g. after a
long repo re-organization with deletions, renames, new backups etc).
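
To illustrate the idea, here is a toy model of deferred deletion. It
is not borg's repository code: the ToyRepo class, its methods and the
chunk ids are all made up for this sketch. Deleting an archive only
drops its entry, and a separate gc pass later frees the chunks that no
remaining archive references:

```python
from typing import Dict, List, Set


class ToyRepo:
    """Toy model: archives reference chunks; deleting is cheap, gc reclaims."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}        # chunk id -> data
        self.archives: Dict[str, List[str]] = {}  # archive name -> chunk ids

    def create(self, name: str, chunks: Dict[str, bytes]) -> None:
        # Chunks already stored are shared (deduplicated) rather than re-added.
        for chunk_id, data in chunks.items():
            self.chunks.setdefault(chunk_id, data)
        self.archives[name] = list(chunks)

    def delete(self, name: str) -> None:
        # Instant: only the archive entry is dropped, no chunk is touched.
        del self.archives[name]

    def gc(self) -> int:
        """Free chunks no remaining archive references; return bytes freed."""
        live: Set[str] = set()
        for chunk_ids in self.archives.values():
            live.update(chunk_ids)
        freed = 0
        for chunk_id in list(self.chunks):
            if chunk_id not in live:
                freed += len(self.chunks.pop(chunk_id))
        return freed


repo = ToyRepo()
repo.create("week-01", {"a": b"xxxx", "b": b"yy"})
repo.create("week-02", {"b": b"yy", "c": b"zzz"})
repo.delete("week-01")   # returns immediately, disk usage unchanged
freed_bytes = repo.gc()  # only chunk "a" is unreferenced now
```

The point of the split is that many deletions can be queued up cheaply
and the expensive scan over the chunks happens once, at a convenient
time.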

Cheers,

    Mario



> John
> 
> On 12/23/2016 04:53 PM, Mario Emmenlauer wrote:
>> Dear All,
>>
>> it seems I am pretty lucky this year to have an early Christmas
>> present. First, I found the large archives in my repo by pure
>> chance and could free 50% of disk space with only a few deletions.
>> Now the actual disk usage is at 1.8TB again, which matches borg's
>> report of deduplicated size.
>>
>> Furthermore, it seems that those huge deletions were the only
>> "slow" ones, because later I could prune another ~200 of ~500
>> archives in just a little over 10 minutes, with borg 1.0.9.
>>
>> Finally, I seemed unable to get prune to do exactly what I hoped for.
>> It might be me, but I did not find exactly the right combination of
>> options. I take backups once per week, and if they are older than
>> one year, I'd like to keep only every other week.
>> In any case I was also curious to enable prune to handle a manual
>> selection of archives, so I tried, and got it working pretty easily.
>> I extended archive.py and helper.py with two new prune options
>> --keep-list and --remove-list, where the former takes a list of
>> archives to keep (all others are pruned) and the latter takes a
>> list of archives to prune (all others are kept). My patch against
>> borg 1.0.9 is available here
>>    https://github.com/emmenlau/borg/tree/emmenlau_better_prune
>> and I'm happy to make a PR if anyone is interested (sorry for the
>> bold name, it's really just a very minor extension to prune).
>>
>>
>> Finally, thanks a lot again for the very nice borg! Your code was
>> very easy to read, and I found very helpful compile instructions
>> in the readme! This allowed me to get productive within a few
>> minutes! Nice work! It would be awesome to add the pyinstaller
>> instructions to the readme, but they were sufficiently easy to
>> find in a GitHub issue report.
>>
>> Thanks, and happy holidays,
>>
>>     Mario Emmenlauer
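
The rule from the quoted message (weekly backups, of which only every
other one is kept once they are older than a year) can be sketched as
follows. This is not the code from the patch: the function names and
archive names are hypothetical, and "every other week" is approximated
by dropping every second old archive, which assumes a weekly schedule
without gaps. The second helper shows how a keep list and a remove
list relate to each other as described above:

```python
from datetime import date, timedelta
from typing import Dict, List


def thin_old_weeklies(archives: Dict[str, date], today: date) -> List[str]:
    """Return names to remove: every second archive older than one year."""
    cutoff = today - timedelta(days=365)
    old = sorted((d, name) for name, d in archives.items() if d < cutoff)
    # Keep the oldest, drop the next, keep the one after that, and so on.
    return [name for index, (_, name) in enumerate(old) if index % 2 == 1]


def keep_to_remove(all_names: List[str], keep: List[str]) -> List[str]:
    """A keep list is the complement of a remove list over all archives."""
    kept = set(keep)
    return [name for name in all_names if name not in kept]


# Hypothetical archives (illustrative data only).
archives = {
    "host-2015-11-01": date(2015, 11, 1),
    "host-2015-11-08": date(2015, 11, 8),
    "host-2015-11-15": date(2015, 11, 15),
    "host-2015-11-22": date(2015, 11, 22),
    "host-2016-06-05": date(2016, 6, 5),
}
to_remove = thin_old_weeklies(archives, today=date(2016, 12, 23))
print(to_remove)
```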



Best regards,

    Mario Emmenlauer


--
BioDataAnalysis GmbH, Mario Emmenlauer      Tel. Buero: +49-89-74677203
Balanstr. 43                   mailto: memmenlauer * biodataanalysis.de
D-81669 München                          http://www.biodataanalysis.de/

