[Borgbackup] Baffling behavior on low cpu system with borg backup

tmhikaru at gmail.com tmhikaru at gmail.com
Sun Jun 19 16:25:07 EDT 2016


	I accidentally replied directly to Thomas Waldmann several days ago
and did not realize I hadn't sent my reply to the mailing list.  Sorry about
that, here it is. I would appreciate feedback on trying to determine exactly
where the problem lies, or alternately, proving that none does.

Tim McGrath

On Thu, Jun 16, 2016 at 10:41:34AM +0200, Thomas Waldmann wrote:
<snip>
> Yes, this is expected.
> 
> If you want to avoid it, have a separate repo for the rpi

Not what I want, but I somewhat expected this response. If I absolutely have
to do it this way, I will.


> Don't you think it could be rather related to how much data there is (in
> total) in the repository at the time when you resync?
> 
> If there isn't much yet, it goes quickly.
> But if you have a lot of archives in there, with a lot of data, resyncing
> the chunks cache takes quite some time (even on much more powerful
> machines).

I can't guarantee it's not related to the amount of data in the repository,
but I really have a feeling something is going wrong when it tries to sync
large archives when a cache already exists.  The amount of data in my
repository, along with the number of archives, has only grown since I
started using borg to do backups on the Rpi.  If the client syncs from a
totally nonexistent cache (rm -r'd .cache), the whole sync takes quite a
while, but each archive sync finishes within minutes.  When a cache already
exists, it usually gets stuck early on, at the first of the larger archives,
during the 'Merging into master chunks index' part.  Rather than taking a
few minutes to synchronize the archive as it does when the cache is empty,
it will happily spin at 100% cpu for hours with seemingly no progress.
There is no noticeable disk I/O, memory pressure, or network traffic during
this time, which is quite a different story from syncing from an empty
cache.  I don't know what it's doing, but it's clearly not working well, if
at all.  I would very much like to know what it's getting hung up on - is
there some kind of verbosity setting I could use to find out?  Even if this
is merely borg getting bogged down by the size of the repository, I'd like
to verify that it is in fact doing that rather than guessing at its
behavior.

To be clear, every single time I have blown away the cache and retried the
backup operation, borg has synchronized and completed the backup
successfully.  It's only when the cache already exists that it winds up
stuck.  I want to know why that makes a difference, and how I could work
around it without having to delete the cache every time I run it.


>borg uses locally cached per-archive chunk-indexes (except if you do the
>hack to save space by disallowing this) to save some data-transfer from
>remote and also to only have to do this computation once per archive.  The
>code that merges these single-archive indexes into the global index is pure
>C and quite fast.

Before I go and do something monumentally stupid: if I wanted to test
whether it's getting hung up in the per-archive chunk index generation
you're talking about, would performing this hack be a good way to find out?
I don't care about speed or disk space at this point; I just want to find
out what's going on.


> Attic did not work like you think. Maybe you just read some over-simplified
> explanation of it somewhere.
> 
> There is a "borgception" ticket in our issue tracker that describes a
> similar idea, but it is not implemented yet.

Thank you for clearing that up, I apologize for my ignorance.


> Borg (and attic) do not store secret keys or process unencrypted data on the
> server (except the latter, obviously, if you do not use encryption). Thus,
> it is not able to compute the chunk index.
> 
> This is a design decision as the repo storage is assumed to be potentially
> untrusted (e.g. a 3rd party machine, a usb disk).

Pity. I don't use encryption on my backups, at least not yet - I figured I'd
run into potential problems and didn't want to deal with that can of worms
complicating things even further.  I understand now why borg cannot do
this, though.  Thank you for explaining.


> > Sshfs does have a major disadvantage though, in that borg's -x switch
> > doesn't work properly for backups done through it, so I had to add
> > specific exclusions for things like /proc, /sys, /dev/, /var/run, etc.
> > Not fun.
> 
> Not sure what you mean.

When running on the actual hardware with the -x switch, borg will not go
from, say, / into /proc.  An sshfs mount, however, seems to show up as one
giant filesystem as far as borg is concerned, so using -x does *not* prevent
it from going into the mounted /proc filesystem inside the sshfs mount!  My
workaround was to add explicit excludes for those paths under the sshfs
mount.  It's kludgy, but it works.  I don't think this is a bug in borg,
just... unexpected behavior.  This is the first time I've used sshfs, so
this is likely my own ignorance of how it works showing.
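
To illustrate what I think is happening, here is a minimal Python sketch.
It assumes a "stay on one filesystem" option works by comparing each path's
device number (st_dev) with that of the starting directory, which is the
usual approach but is not taken from borg's source.  The function name
walk_one_file_system and its excludes parameter are made up for this
example.  As far as I understand it, a FUSE mount such as sshfs reports a
single device number for the whole remote tree, so a check like this cannot
tell that the remote /proc is a separate filesystem, and only explicit
excludes keep the walk out of it.

```python
import os
import stat


def walk_one_file_system(root, excludes=()):
    """Yield paths under root that share root's device number.

    Hypothetical helper: it mimics a one-file-system walk by comparing
    st_dev, and skips any path listed in excludes along with its subtree.
    """
    root_dev = os.lstat(root).st_dev
    excluded = {os.path.normpath(e) for e in excludes}
    stack = [root]
    while stack:
        path = stack.pop()
        # Explicit excludes are the only protection when everything under
        # the mount reports the same device number.
        if os.path.normpath(path) in excluded:
            continue
        try:
            st = os.lstat(path)
        except OSError:
            continue  # vanished or unreadable, skip it
        # A different device number means we crossed a mount point.
        if st.st_dev != root_dev:
            continue
        yield path
        if stat.S_ISDIR(st.st_mode):
            try:
                names = os.listdir(path)
            except OSError:
                continue
            stack.extend(os.path.join(path, name) for name in names)
```

Called on a local root, the device check alone would skip a mounted /proc.
Called on an sshfs mount point, the same check passes for every path, so
the excludes list has to name the remote proc, sys, dev and similar
directories under the mount point.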

> The little RAM on the rpi might also get you into trouble, if you have a lot
> of data in the repo, see the formula in the docs.

Believe it or not, I thought it would be impossible to run borg on the rpi.
On my main server borg runs with a bogglesome ~2.1GB of ram allocated, which
would *never* fit on the Rpi with its ~490MB of available ram.  Yet working
on the same remote repository, I've seen it use as little as ~130MB and at
most a little more than 300MB on the rpi, and I'm not using any
space-saving switches either.  The remote repo has ~1TB of data in it, and
this doesn't seem to push borg's memory constraints too far on the rpi.  I
was very surprised and impressed, to say the least; I wasn't expecting it to
work at all.  Now I want to make it work better.


Thank you for taking the time to explain things to me, I appreciate it.
Tim McGrath