[Borgbackup] Test : Borg vs Restic

Melkor Lord melkor.lord at gmail.com
Mon Sep 11 22:12:52 EDT 2017


Hi,

I'm evaluating the right backup solution for my needs. This boils down to
comparing Restic and Borg, as they represent the "top-shelf" solutions
currently available on Linux.

Borg : 1.1.0rc3 (from the borg-linux64 binary)
Restic : 0.7.1 (from the restic linux_amd64 binary)

The backup data consists of a live mail repository using the maildir format
and holding 139 GB (2327 dirs, 665456 files).

Keep in mind that this is the result of my own experience, for my own needs,
and is in no way thorough or exhaustive.

BORG :
======

Note: Encryption is "repokey-blake2"

* First pass

Shell# time ./borg-linux64 create --info --stats --progress \
    /path/to/BackupTests/Borg::{now:%Y%m%d-%H%M%S} /path/to/Mail/
Enter passphrase for key /path/to/BackupTests/Borg:
------------------------------------------------------------------------------
Archive name: 20170911-164308
Archive fingerprint:
ea043fb5154c60ecdcb42e3be238cfa2ad040e03349f5ae5cab6a9f9f8fd48fe
Time (start): Mon, 2017-09-11 16:43:11
Time (end):   Mon, 2017-09-11 17:58:32
Duration: 1 hours 15 minutes 20.67 seconds
Number of files: 646835
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:              137.47 GB            112.55 GB            102.30 GB
All archives:              137.47 GB            112.55 GB            102.30 GB

                       Unique chunks         Total chunks
Chunk index:                  639574               680719
------------------------------------------------------------------------------

real    75m33.037s
user    23m22.756s
sys     3m51.228s

* Second pass

Shell# time ./borg-linux64 create --info --stats --progress \
    /path/to/BackupTests/Borg::{now:%Y%m%d-%H%M%S} /path/to/Mail/
Enter passphrase for key /path/to/BackupTests/Borg:
------------------------------------------------------------------------------
Archive name: 20170911-181622
Archive fingerprint:
67e9f0e14fa092d274e99833806ca789eb88df890190ec37cedb5e4af20107a0
Time (start): Mon, 2017-09-11 18:16:25
Time (end):   Mon, 2017-09-11 18:18:27
Duration: 2 minutes 1.73 seconds
Number of files: 646861
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:              137.47 GB            112.55 GB             14.57 MB
All archives:              274.94 GB            225.10 GB            102.32 GB

                       Unique chunks         Total chunks
Chunk index:                  639728              1361461
------------------------------------------------------------------------------

real    2m13.070s
user    1m55.448s
sys     0m12.652s


RESTIC :
========

* First pass

Shell# time ./restic_0.7.1_linux_amd64 backup -r \
    /path/to/BackupTests/Restic/ /path/to/Mail/
enter password for repository:
scan [/path/to/Mail]
scanned 2327 directories, 665464 files in 0:02
[1:48:16] 100.00%  21.440 MiB/s  136.009 GiB / 136.001 GiB  667813 / 667791 items  0 errors  ETA 0:00
duration: 1:48:16, 21.44MiB/s
snapshot 9abedefd saved

real    108m23.314s
user    48m2.328s
sys     6m12.984s

* Second pass

Shell# time ./restic_0.7.1_linux_amd64 -r /path/to/BackupTests/Restic/ \
    backup /path/to/Mail/
enter password for repository:
using parent snapshot 9abedefd
scan [/path/to/Mail]
scanned 2327 directories, 665575 files in 0:04
[0:47] 100.00%  2.855 GiB/s  136.010 GiB / 136.010 GiB  667902 / 667902 items  0 errors  ETA 0:00
duration: 0:47, 2920.94MiB/s
snapshot 6c90edf6 saved

real    0m55.859s
user    2m10.312s
sys     0m9.364s

BORG vs RESTIC on Backup :
==========================

- Borg is noticeably faster on the first pass (1h15m vs 1h48m) but
significantly slower on the second pass (2m1s vs 47s)

- Borg repo size (103 GB) is smaller than Restic repo size (121 GB)
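
As a rough cross-check of the timing comparison, the durations reported by
each tool can be turned into ratios with a few lines of shell. The "ratio"
helper is only an illustration and relies on awk; the numbers are copied
from the outputs above, with fractions of a second dropped.

```shell
# ratio A B: print A / B with two decimals (illustrative helper, uses awk).
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Durations reported by the tools themselves, converted to whole seconds.
borg_first=$((1 * 3600 + 15 * 60 + 20))    # "1 hours 15 minutes 20.67 seconds"
restic_first=$((1 * 3600 + 48 * 60 + 16))  # "duration: 1:48:16"
borg_second=$((2 * 60 + 1))                # "2 minutes 1.73 seconds"
restic_second=47                           # "duration: 0:47"

echo "first pass,  Restic time / Borg time: $(ratio "$restic_first" "$borg_first")"
echo "second pass, Borg time / Restic time: $(ratio "$borg_second" "$restic_second")"
```

In other words, Restic takes roughly 1.4 times as long as Borg on the first
pass, and Borg takes roughly 2.6 times as long as Restic on the second.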

BORG vs RESTIC on mounted archives :
====================================

* Simple access to the mounted repositories :

Shell# time ls -l BorgMount/20170911-181622/
total 0
drwxr-xr-x 1 root root 0 sept. 11 19:48 path

real    0m22.383s
user    0m0.000s
sys     0m0.000s

Shell# time ls -l ResticMount/snapshots/2017-09-11T18\:15\:18+02\:00/
total 0
drwx------ 3 mail mail 0 déc.   7  2014 Mail

real    0m0.003s
user    0m0.000s
sys     0m0.000s

- Borg needs 22 seconds to internally build the directory tree, Restic is
instant.

- Interesting note : The first visible directory is exactly the specified
backup path "path" (/path/to/Mail) for Borg whereas Restic only keeps the
last path component "Mail" (/path/to/Mail).

* Extract some path from the mounted repositories :

- Shell# time cp -a BorgMount/20170911-181622/path/to/[...]/Trash \
    BorgRestore/

real    3m36.534s
user    0m0.396s
sys     0m7.944s

NOTE: CPU usage was spiking at 100% when there was no disk activity
(building internal listings is my guess) and jumping between 31~67% during
disk activity (the actual copy process)

- Shell# time cp -a \
    ResticMount/snapshots/2017-09-11T18\:15\:18+02\:00/[...]/Trash \
    ResticRestore/

real    6m23.970s
user    0m0.496s
sys     0m13.708s

NOTE: CPU usage never spikes and constantly jumps between 21~53% for the
whole process

- The "Trash" directory is 6.3 GB big with 47945 files in it.

- Borg restores the exact same data about 1.8x faster (3m36s vs 6m24s),
using about 2x more CPU.


* Fetch deep info on mounted repositories :

Shell# time du -s --si ResticMount/snapshots/2017-09-11T18\:15\:18+02\:00/
147G    ResticMount/snapshots/2017-09-11T18:15:18+02:00/

real    1m18.590s
user    0m0.800s
sys     0m4.036s

NOTE: CPU usage around 46% for the whole process

Shell# time du -s --si BorgMount/20170911-181622/
138G    BorgMount/20170911-181622/

real    5m30.143s
user    0m0.864s
sys     0m4.956s

NOTE: CPU usage at 100% for the whole process

- Borg is about 4x slower to get the same information (5m30s vs 1m19s)


BORG vs RESTIC trying to backup while having mounted archives :
===============================================================

NOTE: A typical use case would be restoring a very big file from a deeply
nested/complex directory hierarchy, where using the "extract/restore"
command would be impractical. Retrieving that file could take so long that
it overlaps with the next scheduled backup, for example.

Shell# ./borg-linux64 create --info --stats --progress \
    /path/to/BackupTests/Borg::{now:%Y%m%d-%H%M%S} /path/to/Mail/
Failed to create/acquire the lock /path/to/BackupTests/Borg/lock (timeout).

Shell# ./restic_0.7.1_linux_amd64 -r /path/to/BackupTests/Restic backup \
    /path/to/Mail/
enter password for repository:
using parent snapshot 6c90edf6
scan [/path/to/Mail]
scanned 2327 directories, 665655 files in 0:03
[0:38] 100.00%  3.518 GiB/s  136.039 GiB / 136.039 GiB  667982 / 667982 items  0 errors  ETA 0:00
duration: 0:38, 3581.52MiB/s
snapshot 64106e49 saved

NOTE: Here the Restic design has a clear advantage. Quoting the doc : "All
files in a repository are only written once and never modified afterwards.
This allows accessing and even writing to the repository with multiple
clients in parallel".


Features I like in Restic :
===========================

- Nothing is written outside the repository
- The "views" on mounted repositories (host, snapshots and tags)
- Multiple "keys" per repository (like LUKS)


Features I *DISLIKE* in Restic :
================================

- The "mount" command blocks the shell and waits for "CTRL-C" to end.
Trying to unmount while the mountpoint is busy (e.g. after
cd /path/to/mountpoint in a different shell) ends badly :
"unable to umount (maybe already umounted?): exit status 1: fusermount:
failed to unmount /path/to/BackupTests/ResticMount: Device or resource busy"
and needs manual intervention. Borg also complains, but a second invocation
of the "umount" command works once the "busy" state is lifted.
- No support for sparse files (AFAIK) which makes it not usable for VM
images and such.


Features I *DISLIKE* in Borg :
==============================

- Writes several files OUTSIDE the repository, in ~/.config/borg and
~/.cache/borg, and AFAIK there's no option to use other paths for these
files.
- The "several seconds or more" delays when mounting repositories and
scanning deeper directories.

I can live with the delays but I really wish there was an option to
relocate the ".config" and ".cache" data. I need this because it makes it
easier to copy the data offsite without forgetting anything! I know that
".cache" is disposable, but having this data available when restoring in
case of disaster recovery is a huge time saver.
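
Not tested in this comparison, so take it as a sketch only: the Borg
documentation lists the environment variables BORG_CACHE_DIR, BORG_KEYS_DIR
and BORG_SECURITY_DIR for relocating parts of this data. Whether they cover
everything written under ~/.config/borg with 1.1.0rc3 is an assumption, and
the BorgState directory below is a made-up location next to the repository.

```shell
# Hypothetical directory kept next to the repository, so that it gets
# copied offsite together with it.
state_dir=/path/to/BackupTests/BorgState

export BORG_CACHE_DIR="$state_dir/cache"        # instead of ~/.cache/borg
export BORG_KEYS_DIR="$state_dir/keys"          # instead of ~/.config/borg/keys
export BORG_SECURITY_DIR="$state_dir/security"  # instead of ~/.config/borg/security
```

Any borg command started from the same shell afterwards would then pick up
these locations.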

Features I *DISLIKE* in BOTH tools :
====================================

- Their design is geared toward "backup-and-push-to-repository", which is
nice but not desired in my environment. I need a
"repository-pulls-backup-from-agent" design. Both tools could offer an
additional "agent" command that would provide :
  * Use of the ssh transport by default to contact a host, with the benefits
of ssh keys (authorized keys, ...)
  * Spawning a Borg/Restic instance to make the backup on the remote host
(like a normal Borg call) while feeding the result back to the calling Borg,
which holds the repository
  * A way to securely transmit the repokey data to the remote instance so
the local Borg can mount/check the local repository
  Of course, it would be the administrator's responsibility to set up
everything accordingly, either using one repokey for every remote host or
scripting something a bit smarter to use a repokey per host or group of
hosts, whatever suits the needs.

  Why such a setup?

  Because, in my case at least, the backup server is of critical importance
and network-isolated from the other hosts. I really don't want the
"all-hosts-can-contact-the-backup-server" style but the
"only-backup-server-can-contact-hosts" kind of behavior. This also helps to
limit the strain on the backup server. Having all the hosts, with no
predictable backup size, hammering the backup server at the same time
(cronjob) is not desirable, especially on sites with storage on a budget :-)

  For instance, I currently use a very spartan/crude system, but one which
is rock solid and has never failed once in over two decades. A simple script
which, in sequence, connects via SSH to each host and uses the remote tar
command to perform the backup. SSH's piped stdout/stderr makes it possible
to retrieve the tarball as well as errors and act accordingly. This is not
scalable but highly effective, battle tested and disaster recovery proven!
Booting a new server with some rescue OS and restoring from a tarball works
in ALL conditions, no matter how long it takes :-) But now, I need
encryption and deduplication given the huge sizes of the data to back up,
hence my tests with Borg/Restic which both have nice features *AND* provide
a single file binary for disaster scenarios.
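
  The sequential pull described above can be sketched roughly as follows.
This is an illustration and not the actual script: the host names, the
destination directory and the helper names pull_backup and pull_all are
made up, and the SSH variable only exists so the loop can be dry-run with a
stub instead of a real connection.

```shell
# Hypothetical defaults; adjust to the site.
HOSTS="${HOSTS:-host1.example.org host2.example.org}"
DEST="${DEST:-/path/to/tarballs}"
SSH="${SSH:-ssh}"

# pull_backup HOST: run tar on the remote host and store its output locally.
pull_backup() {
    host=$1
    out="$DEST/$host-$(date +%Y%m%d).tar.gz"
    # The remote tar writes the archive to stdout, which ssh pipes back here;
    # stderr is kept in a separate file so errors can be inspected afterwards.
    if $SSH "$host" 'tar -czf - /path/to/Mail' >"$out" 2>"$out.err"; then
        echo "OK   $host"
    else
        echo "FAIL $host (see $out.err)"
        return 1
    fi
}

# pull_all: contact the hosts one after the other, never in parallel.
# Returns non-zero if at least one host failed.
pull_all() {
    status=0
    for h in $HOSTS; do
        pull_backup "$h" || status=1
    done
    return $status
}
```

Calling pull_all from a cronjob on the backup server keeps every connection
initiated from that server, and the hosts are processed strictly in
sequence.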


-- 
Unix _IS_ user friendly, it's just selective about who its friends are.