[Borgbackup] Backup process of a huge archive is sometimes fast, and sometimes very slow

Damien Gustave delovan at gmail.com
Fri May 17 06:45:24 EDT 2019


Hello Thomas, thank you for your answer!

> Guess you inserted the wrong log here, this was also a slow run.
>

I indeed copied and pasted the exact same log, sorry about that. The quicker
run is this one:

Time (start): Wed, 2019-05-15 09:09:13
Time (end):   Wed, 2019-05-15 09:21:23
Duration: 12 minutes 9.79 seconds
Number of files: 1648114
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:              722.00 GB            701.46 GB              3.90 GB
All archives:               44.88 TB             43.45 TB            724.68 GB

                       Unique chunks         Total chunks
Chunk index:                 1687853            125875061
------------------------------------------------------------------------------

In both cases the unit of the deduplicated column was GB, so there is not much
difference in size.

> Run the borg create with --list option, so you'll see the status for
> each file.
>

Great idea! I'll try that, hoping the log size will not kill my
system :).
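
For the record, the command I intend to run is roughly the following (the
repository URL and source path are just placeholders for my real ones; if I
read the docs correctly, --filter=AME on borg 1.1+ restricts the listing to
added/modified/errored files, which should keep the log size manageable):

    borg create --list --filter=AME --stats \
        ssh://backup-host/./repo::'{hostname}-{now}' /data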


> I suspect that for the slow runs, borg detects many (all?) files as
> potentially changed.
>
> That can be either a content change or a metadata change size / ctime /
> inode number (size of course also means content change in any case,
> ctime could be also just a metadata change like acls or xattrs and
> inodes could just be unstable due to the filesystem you are using -
> network filesystems often have unstable inodes).
>

Once I have the output of the --list option, it will be easier to check these
parameters. I may also try to map all the inodes in the tree to check whether
they have changed.
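
To compare the metadata between two runs, I am thinking of something like the
quick Python 3 sketch below (/srv/data is only a placeholder for the client's
tree): dump path, inode, size and ctime for every file right before two
consecutive backups, then diff the two outputs.

    #!/usr/bin/env python3
    # Record path, inode, size and ctime for every file under ROOT,
    # so two snapshots taken before consecutive backups can be diffed.
    import os
    import sys

    ROOT = sys.argv[1] if len(sys.argv) > 1 else "/srv/data"  # placeholder path

    for dirpath, dirnames, filenames in os.walk(ROOT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished between listing and stat
            # one tab-separated line per file
            print(f"{path}\t{st.st_ino}\t{st.st_size}\t{st.st_ctime_ns}")

If inodes or ctimes differ for files whose content did not change, that would
explain borg re-reading the whole tree on the slow runs.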


> Not the best setup for time measurements, though.
> Also, when you have a cloud server, there might be a lot of other
> circumstances influencing your measurement.
>

Actually I have one borg repo for each of my clients. This one is the
biggest, and only this one is causing me trouble. I mean, maybe the others
have huge time differences too, but they are too small compared to this fat
one to be noticeable in the overall process. I have other repos of 500 GB or
400 GB and I have not noticed any performance issues with them.

So the most likely explanation is a huge metadata change on this specific
client's data.

> source filesystem with the data you backup is ...?
>

It's ext4 over LVM.

Thank you again for your time.

On Fri, May 17, 2019 at 11:46 AM Thomas Waldmann <tw at waldmann-edv.de> wrote:

> > Most of the time, the process is quick enough, only take 15 minutes to
> > complete:
> >
> > Time (start): Thu, 2019-05-16 03:53:42
> > Time (end):   Thu, 2019-05-16 10:55:10
> > Duration: 7 hours 1 minutes 27.98 seconds
>
> Guess you inserted the wrong log here, this was also a slow run.
>
> > This archive:              726.59 GB            706.02 GB
> 4.60
>
> Also, the unit for the rightmost column got truncated, but would be
> important.
>
> > I obviously do not control what files/folder are changed in the tree.
>
> Run the borg create with --list option, so you'll see the status for
> each file.
>
> I suspect that for the slow runs, borg detects many (all?) files as
> potentially changed.
>
> That can be either a content change or a metadata change size / ctime /
> inode number (size of course also means content change in any case,
> ctime could be also just a metadata change like acls or xattrs and
> inodes could just be unstable due to the filesystem you are using -
> network filesystems often have unstable inodes).
>
> ls -i shows the inode number for a file.
>
> > I run two borg backups in parallel to save to two different backup
> > server. They process the files at the same time wether the result is
> > fast or slow.
>
> Not the best setup for time measurements, though.
> Also, when you have a cloud server, there might be a lot of other
> circumstances influencing your measurement.
>
> Maybe not the root cause for this huge difference, just saying.
>
> > The source server is on AWS, it's an EBS running on a m5.large server.
>
> source filesystem with the data you backup is ...?
>
>
> --
>
> GPG ID: 9F88FB52FAF7B393
> GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393
> _______________________________________________
> Borgbackup mailing list
> Borgbackup at python.org
> https://mail.python.org/mailman/listinfo/borgbackup
>

