[Borgbackup] BorgBackup and GlusterFS storage?

Thomas Waldmann tw at waldmann-edv.de
Thu Mar 15 09:57:58 EDT 2018


> Are there any known issues/concerns with storing Borg repositories on a GlusterFS volume (Erasure Coded)? 

I haven't used or tested GlusterFS yet. But the general rule for the
filesystem backing a borg repo is that borg expects sane and consistent
POSIX-like behaviour.

The docs have some information about what we expect from the FS.

> Any limits/recommendations for size of repos and data?

The biggest segment file number can be ~2^32.
The maximum configurable segment file size is ~4GB; the default is
500MB (borg 1.1). So, multiplied, we are at ~2^64 bytes.
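
To make the arithmetic concrete, here is a minimal Python sketch of that
multiplication (the constants are just the numbers mentioned above):

# Back-of-envelope repo size limit, using the numbers above.
MAX_SEGMENT_COUNT = 2 ** 32        # biggest segment file number, ~2^32
MAX_SEGMENT_SIZE = 4 * 1024 ** 3   # ~4GB max configurable segment size

limit = MAX_SEGMENT_COUNT * MAX_SEGMENT_SIZE
print(f"theoretical repo limit: 2^{limit.bit_length() - 1} bytes")  # 2^64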

That's the only "repo limit" I am aware of right now.

But there is an archive limit:

http://borgbackup.readthedocs.io/en/stable/internals/data-structures.html#archives

Besides these, the more files you have in the backup set and the more
total chunks you have in the repo, the more memory you will need for the
files cache and chunks index.

http://borgbackup.readthedocs.io/en/stable/internals/data-structures.html#indexes-caches-memory-usage
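
As a very rough illustration of that scaling, here is a back-of-envelope
estimator in Python. The per-entry byte costs are ballpark assumptions in
the spirit of that docs page, not exact values - check the page for the
real formulas:

# Rough RAM estimate for chunks index + files cache.
# Per-entry costs are ASSUMED ballpark values, not borg's exact numbers.
CHUNK_ENTRY_BYTES = 44   # assumed: hashindex key + value per chunk
FILE_ENTRY_BYTES = 250   # assumed: average files cache entry

def estimate_ram_gib(total_chunks, total_files):
    chunks_index = total_chunks * CHUNK_ENTRY_BYTES
    files_cache = total_files * FILE_ENTRY_BYTES
    return (chunks_index + files_cache) / 1024 ** 3

# e.g. 100 million chunks and 50 million files:
print(f"~{estimate_ram_gib(100_000_000, 50_000_000):.1f} GiB")  # ~15.7 GiB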

> Some groups may have ~300TB of data to backup. Is it advisable to split it between multiple repos?

If you want to run borg check, that would take rather long for 300TB.

So yes, better make multiple smaller repos.

It's also better for concurrent access as borg will lock repos when
working with them.

It is also faster not to back up into the same repo from multiple client
machines, because then borg does not need to resync its chunks cache.

Considering the scale (and the FS), I am not sure how many people have
used borg with repos that large (on GlusterFS) before, so be careful.

> I plan to populate 1PB with Borg backups (multiple repos) and am considering my options.

Quite a lot of data.

Did you estimate how much you would save through borg's
compression/dedup?

In any case, I would be very much interested in the outcome of this, so
keep us updated.

> If I understand it correctly, BorgBackup is a good fit for GlusterFS's EC
> volumes - the segments don't change much (at all?) once created and are
> only used for RO operations

This is completely true for append_only mode (== never effectively
deleting anything).

In normal mode, borg will run compact_segments() after write / delete
operations on the repo. This reads segment files that contain unused
entries and rewrites the used entries into new segment files. In borg
1.1 this only happens once the unused/used ratio of a segment exceeds a
hardcoded threshold.
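
Purely to illustrate that decision (this is NOT borg's actual code, and
the 10% threshold is an assumption for the example):

# Illustrative sketch of the compaction decision, not borg's real code.
COMPACT_THRESHOLD = 0.10  # assumed: compact once >10% of a segment is unused

def should_compact(used_bytes, unused_bytes):
    total = used_bytes + unused_bytes
    return total > 0 and unused_bytes / total > COMPACT_THRESHOLD

print(should_compact(450_000_000, 20_000_000))   # False: only ~4% unused
print(should_compact(300_000_000, 200_000_000))  # True: 40% unused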

> I'd also like to increase max_segment_size from 512MB to a larger value
> (2GB), is it as simple as 524288000*4?

Yes.
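
As a sketch of how you could apply that, here is one way to edit the
repository's config file with Python's configparser. This assumes the
borg 1.1 repo layout (a [repository] section in REPO/config) and a
hypothetical repo path - make a copy of the config file first. Borg 1.1
also has a "borg config" command that can set this for you.

# Sketch: bump max_segment_size in a repo's config file (borg 1.1 layout).
import configparser

REPO_CONFIG = "/path/to/repo/config"  # hypothetical repo path

cfg = configparser.ConfigParser()
cfg.read(REPO_CONFIG)
cfg["repository"]["max_segment_size"] = str(524288000 * 4)  # ~2GB
with open(REPO_CONFIG, "w") as f:
    cfg.write(f)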

> My goal is to have less files on GlusterFS volume(s).

Be aware that when compacting such a large segment file, borg will read
it completely and write the new, compacted one back to storage.

The threshold makes sure that this won't happen for only tiny amounts of
unused entries, though.

> I've been using Borg for at least a year now and it seems to work
> very well for all my other projects involving backing up data or Linux
> systems...

Great. We always try to fix severe bugs ASAP.

Cheers, Thomas

--

GPG ID: 9F88FB52FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393


