[Borgbackup] Lots of files that change rarely and predictably

Thomas Levine _ at thomaslevine.com
Fri Oct 20 06:47:31 EDT 2017


Indeed, this is the annoying thing about MH format, but everything else
about it is so nice.

> I am not sure this is doable. You'ld still have to look into the
> directory for new files. Borg's files cache lookup only needs to know
> mtime, size and inode number to decide which files did not change.

I think I was unclear. The recent mail folders and files are practically
the only ones that ever change, so I want to tell borg to assume that a
file has stayed the same if it is outside of the recent mail directory.
I do not have to look at the mtime, size, or inode number of any old file
because I already know that I did not change it.

If I ever change the old emails, I will run the normal command.

The recent mail folders presently contain about 6,765 emails, which is
about 1% of the 670,683 files in total.



I arrived at an approach of making two types of archives: a "full"
backup with all 670,683 files, and a "recent" backup with just the
6,765. I would usually run the recent backup, and I would run the full
backup only when I changed files in other directories. When I restore
backups, I first extract the newest full backup, and then I extract the
newest recent backup on top of that.
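
The restore order described above could be scripted roughly as follows.
This is only a sketch: the function names newest_archive and
restore_mail are made up for illustration, and it assumes that archive
names start with "full-" or "recent-" followed by an ISO date (as in the
commands below), so that sorting the names also sorts them by date.

```shell
#!/bin/sh
# Hypothetical helper: read archive names on stdin and print the newest
# one whose name starts with "$1-". Relies on the assumed naming scheme
# PREFIX-YYYY-MM-DD-..., where lexicographic order equals date order.
newest_archive() {
    grep "^$1-" | sort | tail -n 1
}

# Hypothetical wrapper: extract the newest full archive into the current
# directory, then extract the newest recent archive on top of it.
restore_mail() {
    repo=$1
    names=$(borg list --short "$repo") || return 1
    full=$(printf '%s\n' "$names" | newest_archive full)
    recent=$(printf '%s\n' "$names" | newest_archive recent)
    if [ -z "$full" ]; then
        echo "no full archive found in $repo" >&2
        return 1
    fi
    borg extract "${repo}::${full}" || return 1
    if [ -n "$recent" ]; then
        borg extract "${repo}::${recent}"
    fi
}
```

It would be run from an empty restore directory as, for example,
"restore_mail /repository/mh". The sketch does not check whether the
newest recent archive is actually newer than the newest full archive, so
that would need to be verified by hand.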

I compared these two backup styles in borg 1.0.11 with the following
commands, run in succession. The first one is the recent backup, and
the second is the full backup.

  $ time borg create --compression lzma,9 \
    --exclude ,\* -v -e=repokey --exclude-caches \
    /repository/mh::recent-2017-10-20-laxar.laxask \
    context folders drafts inbox archive/2017-07/ a b c current sent

  # 1m06.12s real     0m33.13s user     0m30.90s system

  $ time borg create --compression lzma,9 \
    --exclude ,\* -v -e=repokey --exclude-caches \
    /repository/mh::full-2017-10-20-laxar.laxask \

  # 3m39.66s real     2m30.03s user     1m04.77s system

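The difference is noticeable (about 1 minute for the recent backup
versus about 3 minutes 40 seconds for the full one), but it is not very
large in absolute terms. In this comparison I used an SSD as the storage
medium. I think the difference could matter only on slow storage media.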
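
For anyone repeating the comparison, the wall-clock times can be put
side by side with a few lines of shell. This is a sketch; to_seconds is
a made-up helper name, and it only handles times in the "XmY.YYs" form
printed above.

```shell
#!/bin/sh
# Hypothetical helper: convert a time such as "3m39.66s" to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.2f\n", $1 * 60 + $2 }'
}

recent=$(to_seconds 1m06.12s)   # real time of the recent backup
full=$(to_seconds 3m39.66s)     # real time of the full backup

# Print how many times longer the full backup took, and the difference.
awk -v r="$recent" -v f="$full" \
    'BEGIN { printf "ratio %.1f, difference %.2f s\n", f / r, f - r }'
```

With the numbers above, the full backup took a bit more than three
times as long as the recent one, or about two and a half minutes more.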



I am going to stick with just doing full backups, as they don't seem
much slower and they give me less to think about. If I find myself
using slow storage media for these data, I'll compare them again.
I was using MicroSDHC with an ext2 filesystem as the storage medium when
I originally inquired about this style of backup, so at the time I may
have had the slowest possible email storage stack.

