[Borgbackup] Recreate segments after changing max_segment_size

Sebastian Felis sebastian at silef.de
Thu Jan 14 12:35:09 EST 2021


Hi again,

I stumbled across issue #3631, "borg recreate optimisations" [1], which explains the slow speed of recreating a whole repo. It says that the current algorithm reads archive by archive, byte by byte, so deduplicated files are read several times from the repo.


Just to clarify my use case of concatenating segments and to speed things up: according to the data structures doc [2], the segments are just a successive list of log entries prefixed with a 'BORG_SEG' magic header. So it should be possible to concatenate several segments by removing the magic header from every segment except the first.
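
A minimal sketch of checking that premise before touching anything: it tests whether a file starts with the 8-byte 'BORG_SEG' magic. The function name has_borg_magic is made up for illustration and is not part of borg.

```shell
# Hypothetical helper (not part of borg): succeed only if the file
# passed as $1 starts with the 8-byte 'BORG_SEG' magic header.
has_borg_magic() {
    # Read the first 8 bytes; drop NUL bytes so the shell can hold the
    # result in a string. Any NUL shortens the string, so it cannot match.
    [ "$(head -c 8 -- "$1" 2>/dev/null | tr -d '\000')" = "BORG_SEG" ]
}
```

For example, `has_borg_magic data/0/1 && echo ok` would print ok only for a file that begins with the magic.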

According to the index doc, the index, hints and integrity files can be deleted and are rebuilt on the next run [3].

It should also be possible to rename the segments as long as the sequence of log entries is preserved.

E.g. segments 0, 1, 2 and 3 would be concatenated into segment 0:

tail -c +9 data/0/1 >> data/0/0
tail -c +9 data/0/2 >> data/0/0
tail -c +9 data/0/3 >> data/0/0
rm data/0/[123]
rm hints.* index.* integrity.*
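
A more cautious sketch of the same steps, assuming it is run on a copy of the repo and that the source segments are passed in ascending order. The function name concat_segments is made up for illustration; it only wraps the tail and rm calls above and refuses to modify anything if one of the files does not start with the 'BORG_SEG' magic. Whether borg accepts the resulting repo is exactly the open question below.

```shell
# Hypothetical helper (not part of borg): concat_segments TARGET SOURCE...
# Appends each SOURCE without its 8-byte magic header to TARGET, in the
# order given, and removes each SOURCE after it has been appended.
concat_segments() {
    target=$1
    shift
    # First pass: verify every file before changing anything.
    for seg in "$target" "$@"; do
        if [ "$(head -c 8 -- "$seg" 2>/dev/null | tr -d '\000')" != "BORG_SEG" ]; then
            echo "not a segment file: $seg" >&2
            return 1
        fi
    done
    # Second pass: append everything from byte 9 onwards, then delete.
    for seg in "$@"; do
        tail -c +9 -- "$seg" >> "$target" || return 1
        rm -- "$seg"
    done
}
```

Usage for the example above would be `concat_segments data/0/0 data/0/1 data/0/2 data/0/3`, followed by removing the hints, index and integrity files as before.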

Question:

Is there any risk in doing this?

Sebastian

[1] https://github.com/borgbackup/borg/issues/3631
[2] https://borgbackup.readthedocs.io/en/stable/internals/data-structures.html#segments
[3] https://borgbackup.readthedocs.io/en/stable/internals/data-structures.html#index-hints-and-integrity

On Wednesday, January 13, 2021 09:32 CET, Sebastian Felis via Borgbackup <borgbackup at python.org> wrote:
Hi,

first of all: Thank you so much for this awesome backup tool. It's
really a joy using it.


I have several borg repos with small segment sizes of 5 MB. I would like
to recreate the segments to a larger size for better repo backup with
rsync/rclone. I am aware of the recreate command and its documentation
and I am using borg 1.1.14 on Debian buster.

A conversion was successful for a smaller repo with "recreate
--recompress always" after changing the max_segment_size. For my TB-sized
media repo it seems to be a bit slow.

Without the option "--recompress always" the smaller segments are not
repacked to larger segments.


My questions:

1) What is the best way just to repack all chunks to a new segment size
for a repo?

2) Does the recreate command on a repo honor deduplication and repack
the unique chunks in segments only once? Even with the "--recompress
always" option?

BR

Sebastian