From tw at waldmann-edv.de  Tue Jun 30 05:19:10 2015
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Tue, 30 Jun 2015 11:19:10 +0200
Subject: [borgbackup] Migration path from attic
References: 
Message-ID: <55925F0E.9050503@waldmann-edv.de>

Hi Manuel,

> is there a migration path from attic? Can I reuse my attic archive and
> just access it with Borg?

Currently not, but with some coding it would be possible; see this ticket:

https://github.com/borgbackup/borg/issues/21#issuecomment-111222023

Cheers, Thomas

---

GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393
Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt.

From manuel.faux at conf.at  Tue Jun 30 03:47:53 2015
From: manuel.faux at conf.at (Manuel Faux)
Date: Tue, 30 Jun 2015 07:47:53 +0000
Subject: Migration path from attic
In-Reply-To: 
References: 
Message-ID: 

Hi,

is there a migration path from attic? Can I reuse my attic archive and
just access it with Borg?

BR, Manuel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tw at waldmann-edv.de  Thu Jun 11 18:16:36 2015
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Fri, 12 Jun 2015 00:16:36 +0200
Subject: borgbackup release 0.23.0
In-Reply-To: <557A08C4.1050005@waldmann-edv.de>
References: <557A08C4.1050005@waldmann-edv.de>
Message-ID: <557A08C4.1050005@waldmann-edv.de>

I just published the first release of borgbackup! \o/

Changes: https://github.com/borgbackup/borg/blob/master/CHANGES
Download: https://pypi.python.org/pypi/borgbackup

Please try it (carefully) and give feedback on the issue tracker, on the
mailing list or on IRC.

Homepage: https://borgbackup.github.io/

From tw at waldmann-edv.de  Fri Sep 25 16:51:30 2015
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Fri, 25 Sep 2015 22:51:30 +0200
Subject: [borgbackup] Pruning question
References: 
Message-ID: <5605B3D2.4080006@waldmann-edv.de>

On 09/24/2015 04:06 PM, Alex Gorbachev wrote:
> Good day, seeing odd behavior with pruning:
>
> borg prune -v --keep-daily=6 -p vault1 /vault/repo1
> Keeping archive: vault1-2015-09-23-0900 Wed Sep 23 23:34:55 2015
> Keeping archive: vault1-2015-09-21-1017 Tue Sep 22 00:09:54 2015
> Keeping archive: vault1-2015-09-20-1136 Mon Sep 21 04:34:15 2015
> Pruning archive: vault1-2015-09-22-1110 Wed Sep 23 01:57:36 2015
>
> Since 9/22 was two days ago why are we pruning it?

Is that archive list complete?

Did you use --prefix=vault1- ?

Pruning depends on all the archive timestamps (that match the given
prefix OR all archives if you give none).

-- 

GPG ID: FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393

From ag at iss-integration.com  Fri Sep 25 22:44:02 2015
From: ag at iss-integration.com (Alex Gorbachev)
Date: Fri, 25 Sep 2015 22:44:02 -0400
Subject: [borgbackup] Pruning question
References: <5605B3D2.4080006@waldmann-edv.de>
Message-ID: 

Hi Thomas,

On Fri, Sep 25, 2015 at 4:51 PM, Thomas Waldmann wrote:
> On 09/24/2015 04:06 PM, Alex Gorbachev wrote:
> > Good day, seeing odd behavior with pruning:
> >
> > borg prune -v --keep-daily=6 -p vault1 /vault/repo1
> > Keeping archive: vault1-2015-09-23-0900 Wed Sep 23 23:34:55 2015
> > Keeping archive: vault1-2015-09-21-1017 Tue Sep 22 00:09:54 2015
> > Keeping archive: vault1-2015-09-20-1136 Mon Sep 21 04:34:15 2015
> > Pruning archive: vault1-2015-09-22-1110 Wed Sep 23 01:57:36 2015
> >
> > Since 9/22 was two days ago why are we pruning it?
>
> Is that archive list complete?
> Yes, these are all the backups > > Did you use --prefix=vault1- ? > I think the -p vault1 is the same as --prefix, at least as per documentation > > Pruning depends on all the archive timestamps (that match the given > prefix OR all archives if you give none). > That is what I assumed - but doing pruning on September 24 and it is deleting an archive with the timestamp of September 23, this is what confused me. Thanks, Alex > > > -- > > GPG ID: FAF7B393 > GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leo at famulari.name Thu Sep 17 21:12:01 2015 From: leo at famulari.name (Leo Famulari) Date: Thu, 17 Sep 2015 21:12:01 -0400 Subject: [PATCH 0/1] Borg 0.25 packaged for Guix In-Reply-To: References: Message-ID: Hi, This is a package definition of Borg 0.25 for the Guix package manager [1]. I have not submitted this patch to Guix yet because I am wary of including a rapidly changing backup program into an operating system's official repositories. I appreciate the motivations for forking from Attic and I look forward to the improvements being developed in Borg but in the meantime I prefer my backup system to be rather boring. However, I am not sure if Borg is actually unstable or not. I would like some advice on this matter. Nevertheless, I have found Guix to be a great way to install and manage software on my systems. It offers a lot of improvements to established package managers like apt and yum. It also provides a much-needed replacement for the dismal experience of language-specific package managers. I have been watching Borg's progress from afar because I am not interested in installing, learning, and maintaining another package manager just for Python programs. This package is not yet included in the official Guix repos, but if you are using Guix and have a working development environment, you can apply the patch like so from the root of your Guix source tree: $ patch -p1 < 0001-gnu-Add-Borg-backup-program.patch Then, you can install it for testing and use. Please note that this might involve building lz4 from source, and that will take a while due to lz4's test suite. $ guix environment guix $ ./pre-inst-env guix package --install borg Now that I have this package definition, I can start testing different versions of Borg easily. I hope that some will find this patch useful, and also that some will have a nice introduction to Guix. [1] http://www.gnu.org/software/guix/ Guix is a package manager (and operating system called GuixSD) based on the Nix package manager. It uses Guile Scheme instead of the Nix DSL and is focused on free software. You can run it on top of another Linux distribution without conflicting or interacting with that distribution's package management. Leo Famulari (1): gnu: Add Borg backup program. gnu/packages/backup.scm | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) -- 2.4.3 From leo at famulari.name Thu Sep 17 21:12:02 2015 From: leo at famulari.name (Leo Famulari) Date: Thu, 17 Sep 2015 21:12:02 -0400 Subject: [PATCH 1/1] gnu: Add Borg backup program. References: Message-ID: * gnu/packages/backup.scm (borg): New variable. 
--- gnu/packages/backup.scm | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/gnu/packages/backup.scm b/gnu/packages/backup.scm index 84d27c0..5363bef 100644 --- a/gnu/packages/backup.scm +++ b/gnu/packages/backup.scm @@ -352,3 +352,40 @@ deduplication technique used makes Attic suitable for daily backups since only changes are stored.") (home-page "https://attic-backup.org/") (license license:bsd-3))) + +(define-public borg + (package + (name "borg") + (version "0.25.0") + (source (origin + (method url-fetch) + (uri (string-append + "https://pypi.python.org/packages/source/b/borgbackup/" + "borgbackup-" version ".tar.gz")) + (sha256 + (base32 + "1q5dbrya6bwbbmhyg0yxj81h1frsbjdwqml3s8n30ff4f0z5p77q")))) + (build-system python-build-system) + (arguments + `(#:phases + (modify-phases %standard-phases + (add-before + 'build 'set-openssl-prefix + (lambda* (#:key inputs #:allow-other-keys) + (setenv "BORG_OPENSSL_PREFIX" (assoc-ref inputs "openssl")) + #t))))) + (inputs + `(("acl" ,acl) + ("lz4" ,lz4) + ("openssl" ,openssl) + ("python-llfuse" ,python-llfuse) + ("python-msgpack" ,python-msgpack))) + (synopsis "Deduplicated, encrypted, authenticated and compressed backups") + (description "Borg is a deduplicating backup program. Optionally, it +supports compression and authenticated encryption. The main goal of Borg is to +provide an efficient and secure way to backup data. The data deduplication +technique used makes Borg suitable for daily backups since only changes are +stored. The authenticated encryption technique makes it suitable for backups +to not fully trusted targets.") + (home-page "https://borgbackup.github.io/borgbackup/") + (license license:bsd-3))) -- 2.4.3 From leo at famulari.name Thu Sep 17 21:32:05 2015 From: leo at famulari.name (Leo Famulari) Date: Thu, 17 Sep 2015 21:32:05 -0400 Subject: [borgbackup] [PATCH 0/1] Borg 0.25 packaged for Guix References: Message-ID: <1442539925.2251828.386861369.14C0BD2A@webmail.messagingengine.com> One more thing that I forgot to mention: this package definition does not run the tests. I need to fix that, especially because I get intermittent FUSE errors with `borg mount`. And when those errors happen, I can reproduce them with `attic mount`. On Thu, Sep 17, 2015, at 21:12, Leo Famulari wrote: > Hi, > > This is a package definition of Borg 0.25 for the Guix package manager > [1]. > > I have not submitted this patch to Guix yet because I am wary of > including a > rapidly changing backup program into an operating system's official > repositories. I appreciate the motivations for forking from Attic and I > look > forward to the improvements being developed in Borg but in the meantime I > prefer my backup system to be rather boring. However, I am not sure if > Borg is > actually unstable or not. I would like some advice on this matter. > > Nevertheless, I have found Guix to be a great way to install and manage > software on my systems. It offers a lot of improvements to established > package > managers like apt and yum. It also provides a much-needed replacement for > the > dismal experience of language-specific package managers. I have been > watching > Borg's progress from afar because I am not interested in installing, > learning, > and maintaining another package manager just for Python programs. 
> > This package is not yet included in the official Guix repos, but if you > are > using Guix and have a working development environment, you can apply the > patch > like so from the root of your Guix source tree: > $ patch -p1 < 0001-gnu-Add-Borg-backup-program.patch > > Then, you can install it for testing and use. Please note that this might > involve building lz4 from source, and that will take a while due to lz4's > test > suite. > $ guix environment guix > $ ./pre-inst-env guix package --install borg > > Now that I have this package definition, I can start testing different > versions > of Borg easily. > > I hope that some will find this patch useful, and also that some will > have a > nice introduction to Guix. > > [1] http://www.gnu.org/software/guix/ > Guix is a package manager (and operating system called GuixSD) based on > the Nix > package manager. It uses Guile Scheme instead of the Nix DSL and is > focused on > free software. You can run it on top of another Linux distribution > without > conflicting or interacting with that distribution's package management. > > Leo Famulari (1): > gnu: Add Borg backup program. > > gnu/packages/backup.scm | 37 +++++++++++++++++++++++++++++++++++++ > 1 file changed, 37 insertions(+) > > -- > 2.4.3 > From ag at iss-integration.com Thu Sep 24 10:06:25 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Thu, 24 Sep 2015 10:06:25 -0400 Subject: Pruning question In-Reply-To: References: Message-ID: Good day, seeing odd behavior with pruning: borg prune -v --keep-daily=6 -p vault1 /vault/repo1 Keeping archive: vault1-2015-09-23-0900 Wed Sep 23 23:34:55 2015 Keeping archive: vault1-2015-09-21-1017 Tue Sep 22 00:09:54 2015 Keeping archive: vault1-2015-09-20-1136 Mon Sep 21 04:34:15 2015 Pruning archive: vault1-2015-09-22-1110 Wed Sep 23 01:57:36 2015 Since 9/22 was two days ago why are we pruning it? Thank you, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From tw at waldmann-edv.de Tue Sep 1 08:24:28 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Tue, 1 Sep 2015 14:24:28 +0200 Subject: [borgbackup] Borg speed tuning on large files References: <55DC2790.3000005@waldmann-edv.de> <55DCEFF1.1050408@waldmann-edv.de> <55E1B248.1060409@waldmann-edv.de> Message-ID: <55E598FC.3030306@waldmann-edv.de> > Indeed, with lz4 the compression speed is the fastest it has been. On > the first run, data is 35% of original size (same 2TB volume) and it > took 17 hours to compress vs. the previous 33 with LZMA. Ah, that's impressive. lz4 seems to like your data (usually it doesn't compress to 35%). :D A zlib,1 comparison value would have been nice here (as that is the fastest compression zlib can do [but usually slower than lz4]). lzma is known to be rather slow (but high compression). > Computed speed is 12 MB/s vs. the previous 5 MB/s, and we are not at > all disk bound (we do 100+ MB/s network transfers from it). If I divide 2TB original data by 17h backup time, I get 32MB/s. Your data rate is based on the compressed data. 
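To make that concrete, here is a quick back-of-the-envelope check using
the round numbers from this thread (~2 TB of input, ~35% compression
ratio, 17 h wall time; all values are approximations from the posts
above):

awk 'BEGIN {
    orig_mb = 2 * 1000 * 1000     # ~2 TB of input data, in MB
    comp_mb = orig_mb * 0.35      # lz4 output at ~35% of original size
    secs    = 17 * 3600           # 17 h of wall clock time
    printf "uncompressed rate: %.0f MB/s\n", orig_mb / secs  # prints ~33
    printf "compressed rate:   %.0f MB/s\n", comp_mb / secs  # prints ~11
}'

So both figures describe the same run: roughly 32 MB/s of source data is
read and deduplicated per second, of which roughly 12 MB/s of compressed
data is actually written to the repository.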
-- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From ag at iss-integration.com Tue Sep 1 07:29:08 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Tue, 1 Sep 2015 07:29:08 -0400 Subject: [borgbackup] Borg speed tuning on large files References: <55DC2790.3000005@waldmann-edv.de> <55DCEFF1.1050408@waldmann-edv.de> <55E1B248.1060409@waldmann-edv.de> Message-ID: >> >> With 0.25.0 you could try: >> - lz4 = superfast, but low compression >> - lzma = slow/expensive, but high compression >> - none - no compression, no overhead (this is not zlib,0 any more) > > Started lz4 trials tonight, will update! Indeed, with lz4 the compression speed is the fastest it has been. On the first run, data is 35% of original size (same 2TB volume) and it took 17 hours to compress vs. the previous 33 with LZMA. Computed speed is 12 MB/s vs. the previous 5 MB/s, and we are not at all disk bound (we do 100+ MB/s network transfers from it). I will run an incremental run shortly. Thanks, Alex From tw at waldmann-edv.de Mon Sep 28 19:00:29 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Tue, 29 Sep 2015 01:00:29 +0200 Subject: borgbackup 0.26.1 released In-Reply-To: <5609C68D.9040304@waldmann-edv.de> References: <5609C68D.9040304@waldmann-edv.de> Message-ID: <5609C68D.9040304@waldmann-edv.de> Hi, just wanted to tell that there is a fresh release with a few improvements and fixes (mostly having to do with the binaries): https://github.com/borgbackup/borg/blob/0.26.1/CHANGES.rst what's new https://pypi.python.org/pypi/borgbackup/0.26.1 pip package https://github.com/borgbackup/borg/issues/214 binaries All releases are signed by me, please check the signature. Cheers, Thomas -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From ag at iss-integration.com Mon Sep 28 19:30:20 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Mon, 28 Sep 2015 19:30:20 -0400 Subject: [borgbackup] borgbackup 0.26.1 released References: <5609C68D.9040304@waldmann-edv.de> <5609C68D.9040304@waldmann-edv.de> Message-ID: Hi Thomas, I tested 0.26.1 against the large (1.1 TB) backup set and it is extremely slow to list files. 0.26.0 comes back pretty quickly. It took more than 5 minutes to list all the files in 0.26.1, while only about 10 seconds in 0.26.0 Thank you, Alex On Mon, Sep 28, 2015 at 7:00 PM, Thomas Waldmann wrote: > Hi, > > just wanted to tell that there is a fresh release with a few > improvements and fixes (mostly having to do with the binaries): > > https://github.com/borgbackup/borg/blob/0.26.1/CHANGES.rst what's new > > https://pypi.python.org/pypi/borgbackup/0.26.1 pip package > > https://github.com/borgbackup/borg/issues/214 binaries > > All releases are signed by me, please check the signature. > > Cheers, > > Thomas > -- > > GPG ID: FAF7B393 > GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ag at iss-integration.com Mon Sep 28 21:20:01 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Mon, 28 Sep 2015 21:20:01 -0400 Subject: [borgbackup] borgbackup 0.26.1 released References: <5609C68D.9040304@waldmann-edv.de> <5609CE23.9020006@waldmann-edv.de> Message-ID: On Mon, Sep 28, 2015 at 7:32 PM, Thomas Waldmann wrote: > > I tested 0.26.1 against the large (1.1 TB) backup set and it is > > extremely slow to list files. 0.26.0 comes back pretty quickly. 
It > > took more than 5 minutes to list all the files in 0.26.1, while only > > about 10 seconds in 0.26.0 > > File an issue on the issue tracker and give the stuff as described in > the readme. > Will do Thomas, thanks > > > -- > > GPG ID: FAF7B393 > GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tw at waldmann-edv.de Mon Sep 28 19:32:51 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Tue, 29 Sep 2015 01:32:51 +0200 Subject: [borgbackup] borgbackup 0.26.1 released References: <5609C68D.9040304@waldmann-edv.de> <5609C68D.9040304@waldmann-edv.de> Message-ID: <5609CE23.9020006@waldmann-edv.de> > I tested 0.26.1 against the large (1.1 TB) backup set and it is > extremely slow to list files. 0.26.0 comes back pretty quickly. It > took more than 5 minutes to list all the files in 0.26.1, while only > about 10 seconds in 0.26.0 File an issue on the issue tracker and give the stuff as described in the readme. -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From tw at waldmann-edv.de Sat Sep 19 16:41:46 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sat, 19 Sep 2015 22:41:46 +0200 Subject: borgbackup 0.26.0 released In-Reply-To: <55FDC88A.2070302@waldmann-edv.de> References: <55FDC88A.2070302@waldmann-edv.de> Message-ID: <55FDC88A.2070302@waldmann-edv.de> Hi, just wanted to tell that there is a fresh release with a lot of improvements and fixes: https://github.com/borgbackup/borg/blob/0.26.0/CHANGES.rst what's new https://pypi.python.org/pypi/borgbackup/0.26.0 pip package https://github.com/borgbackup/borg/issues/147 binary wheels (soon) All releases are signed by me, please check the signature. Cheers, Thomas -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From ag at iss-integration.com Mon Sep 21 17:44:02 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Mon, 21 Sep 2015 17:44:02 -0400 Subject: [borgbackup] borgbackup 0.26.0 released References: <55FDC88A.2070302@waldmann-edv.de> <55FFD9E1.9020703@waldmann-edv.de> Message-ID: HI Thomas, On Mon, Sep 21, 2015 at 6:20 AM, Thomas Waldmann wrote: > > Thank you for the update. I am having issues installing on Ubuntu > > 14.04, could you please advise? > > root at lab2-b1:/usr/local/ISS# git clone > > https://github.com/borgbackup/borg.git > > Cloning into 'borg'... > > remote: Counting objects: 6144, done. > > remote: Compressing objects: 100% (69/69), done. > > remote: Total 6144 (delta 34), reused 0 (delta 0), pack-reused 6075 > > Receiving objects: 100% (6144/6144), 1.70 MiB | 0 bytes/s, done. > > Resolving deltas: 100% (4184/4184), done. > > Checking connectivity... done. > > root at lab2-b1:/usr/local/ISS# cd borg > > root at lab2-b1:/usr/local/ISS/borg# pip3 install -e . > > Traceback (most recent call last): > > File "/usr/bin/pip3", line 5, in > > from pkg_resources import load_entry_point > > File "/usr/lib/python3/dist-packages/pkg_resources.py", line 31, in > > > > import platform > > File "/usr/local/ISS/borg/borg/platform.py", line 4, in > > It seems to be confused about the "platform" module. > > I guess the pkg_resources wants the stdlib "platform", but it import the > "borg.platform" module. > > Did you add borg/borg to sys.path / PYTHONPATH? 
> > > from .platform_linux import acl_get, acl_set, API_VERSION > > SystemError: Parent module '' not loaded, cannot perform relative import > > > > I am using a quite similar procedure (but with a virtualenv), works for me. > What I had to do was to rename the old borg install diectory and use a different one - then borg installed OK. Thanks, Alex > > -- > > GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 > Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tw at waldmann-edv.de Mon Sep 21 06:20:17 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Mon, 21 Sep 2015 12:20:17 +0200 Subject: [borgbackup] borgbackup 0.26.0 released References: <55FDC88A.2070302@waldmann-edv.de> <55FDC88A.2070302@waldmann-edv.de> Message-ID: <55FFD9E1.9020703@waldmann-edv.de> > Thank you for the update. I am having issues installing on Ubuntu > 14.04, could you please advise? > root at lab2-b1:/usr/local/ISS# git clone > https://github.com/borgbackup/borg.git > Cloning into 'borg'... > remote: Counting objects: 6144, done. > remote: Compressing objects: 100% (69/69), done. > remote: Total 6144 (delta 34), reused 0 (delta 0), pack-reused 6075 > Receiving objects: 100% (6144/6144), 1.70 MiB | 0 bytes/s, done. > Resolving deltas: 100% (4184/4184), done. > Checking connectivity... done. > root at lab2-b1:/usr/local/ISS# cd borg > root at lab2-b1:/usr/local/ISS/borg# pip3 install -e . > Traceback (most recent call last): > File "/usr/bin/pip3", line 5, in > from pkg_resources import load_entry_point > File "/usr/lib/python3/dist-packages/pkg_resources.py", line 31, in > > import platform > File "/usr/local/ISS/borg/borg/platform.py", line 4, in It seems to be confused about the "platform" module. I guess the pkg_resources wants the stdlib "platform", but it import the "borg.platform" module. Did you add borg/borg to sys.path / PYTHONPATH? > from .platform_linux import acl_get, acl_set, API_VERSION > SystemError: Parent module '' not loaded, cannot perform relative import > I am using a quite similar procedure (but with a virtualenv), works for me. -- GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt. From tw at waldmann-edv.de Mon Sep 21 06:32:21 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Mon, 21 Sep 2015 12:32:21 +0200 Subject: Single-File Binaries for borg 0.26.0 In-Reply-To: <55FFDCB5.1060102@waldmann-edv.de> References: <55FFDCB5.1060102@waldmann-edv.de> Message-ID: <55FFDCB5.1060102@waldmann-edv.de> https://github.com/borgbackup/borg/issues/214 If you are on Linux, please test these. -- GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt. From tw at waldmann-edv.de Sun Sep 20 13:39:13 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sun, 20 Sep 2015 19:39:13 +0200 Subject: [borgbackup] initial 0.26 cache re-sync References: <55FEA8A7.3020500@waldmann-edv.de> <55FEA8A7.3020500@waldmann-edv.de> <1442770355.3062911.388672729.01A27271@webmail.messagingengine.com> Message-ID: <55FEEF41.2010408@waldmann-edv.de> On 09/20/2015 07:32 PM, Leo Famulari wrote: > I'm really glad to see this work being done. 
Am I right to assume that > the motivation behind this is to make it easier to backup multiple > systems to the same repository, achieving deduplication of "system" > files like /etc, /usr, /lib, et cetera? Well, this is nothing new, this has worked since long. You can even move some user data files from one machine to another and still have them deduplicated against the past backups. New is that I am trying to make the cache resync faster. Once borg notices that it's local cache is out of sync with the repo (because another machine did a backup meanwhile), it needs to resynchronize the cache. --- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From tw at waldmann-edv.de Sun Sep 20 08:37:59 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sun, 20 Sep 2015 14:37:59 +0200 Subject: initial 0.26 cache re-sync In-Reply-To: <55FEA8A7.3020500@waldmann-edv.de> References: <55FEA8A7.3020500@waldmann-edv.de> Message-ID: <55FEA8A7.3020500@waldmann-edv.de> Hi, just wanted to post a note about this: When you use 0.26.0 the first time, you might see a rather long "synchronizing chunks cache". This is due to the way the synchronisation works in 0.26+ and will only take that long ONCE. Subsequent re-syncs will only fetch the new archives it discovers in the repo (not all of them). I changed the way the single-archive indexes are kept locally, they are now single files in ~/.cache/borg/REPOID/chunks.archive.d/. 0.25 used a compressed tar archive "chunks.archive" - but dealing with tar and (re-)compression took way too much time and cpu and the compression did not work as great in practice as in my experiments. If you have a slow connection to the repository and/or or a huge number of archives, you can save some time by manually extracting your pre-0.26 chunks.archive to that location. This manual procedure is OPTIONAL, if you do not do it, borg will kill the compressed tar automatically and then fetch all single-archive indexes from the (remote?) repo. Make sure you have lots of disk space free in .cache/borg: cd ~/.cache/borg/REPOID mkdir chunks.archive.d cd chunks.archive.d tar xJvf ../chunks.archive # if you have a older python, it might be also xjvf or xzvf. # at the end, check that permissions/mode are as you see # for the other files in .cache/borg: cd .. chown -R borg.borg chunks.archive.d chmod -R go-rwX chunks.archive.d # after successfully extracting the chunks.archive. remove it # 0.26 does not use it any more (and would also remove it the # first time a cache resync happens): rm chunks.archive In case something does not work, you can still kill chunks.archive.d and let borg do it (slowly). Cheers, Thomas -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From ag at iss-integration.com Sun Sep 20 22:37:18 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Sun, 20 Sep 2015 22:37:18 -0400 Subject: [borgbackup] borgbackup 0.26.0 released References: <55FDC88A.2070302@waldmann-edv.de> <55FDC88A.2070302@waldmann-edv.de> Message-ID: Hi Thomas, On Sat, Sep 19, 2015 at 4:41 PM, Thomas Waldmann wrote: > > Hi, > > just wanted to tell that there is a fresh release with a lot of > improvements and fixes: > > https://github.com/borgbackup/borg/blob/0.26.0/CHANGES.rst what's new > > https://pypi.python.org/pypi/borgbackup/0.26.0 pip package > > https://github.com/borgbackup/borg/issues/147 binary wheels (soon) > > All releases are signed by me, please check the signature. 
> > Cheers, > > Thomas > -- > > GPG ID: FAF7B393 > GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 > > Thank you for the update. I am having issues installing on Ubuntu 14.04, could you please advise? Best regards, Alex root at lab2-b1:/usr/local/ISS# git clone https://github.com/borgbackup/borg.git Cloning into 'borg'... remote: Counting objects: 6144, done. remote: Compressing objects: 100% (69/69), done. remote: Total 6144 (delta 34), reused 0 (delta 0), pack-reused 6075 Receiving objects: 100% (6144/6144), 1.70 MiB | 0 bytes/s, done. Resolving deltas: 100% (4184/4184), done. Checking connectivity... done. root at lab2-b1:/usr/local/ISS# cd borg root at lab2-b1:/usr/local/ISS/borg# pip3 install -e . Traceback (most recent call last): File "/usr/bin/pip3", line 5, in from pkg_resources import load_entry_point File "/usr/lib/python3/dist-packages/pkg_resources.py", line 31, in import platform File "/usr/local/ISS/borg/borg/platform.py", line 4, in from .platform_linux import acl_get, acl_set, API_VERSION SystemError: Parent module '' not loaded, cannot perform relative import -------------- next part -------------- An HTML attachment was scrubbed... URL: From leo at famulari.name Sun Sep 20 13:32:35 2015 From: leo at famulari.name (Leo Famulari) Date: Sun, 20 Sep 2015 13:32:35 -0400 Subject: [borgbackup] initial 0.26 cache re-sync References: <55FEA8A7.3020500@waldmann-edv.de> <55FEA8A7.3020500@waldmann-edv.de> Message-ID: <1442770355.3062911.388672729.01A27271@webmail.messagingengine.com> I'm really glad to see this work being done. Am I right to assume that the motivation behind this is to make it easier to backup multiple systems to the same repository, achieving deduplication of "system" files like /etc, /usr, /lib, et cetera? On Sun, Sep 20, 2015, at 08:37, Thomas Waldmann wrote: > Hi, just wanted to post a note about this: > > When you use 0.26.0 the first time, you might see a rather long > "synchronizing chunks cache". > > This is due to the way the synchronisation works in 0.26+ and will only > take that long ONCE. Subsequent re-syncs will only fetch the new > archives it discovers in the repo (not all of them). > > I changed the way the single-archive indexes are kept locally, they are > now single files in ~/.cache/borg/REPOID/chunks.archive.d/. > > 0.25 used a compressed tar archive "chunks.archive" - but dealing with > tar and (re-)compression took way too much time and cpu and the > compression did not work as great in practice as in my experiments. > > If you have a slow connection to the repository and/or or a huge number > of archives, you can save some time by manually extracting your pre-0.26 > chunks.archive to that location. > > This manual procedure is OPTIONAL, if you do not do it, borg will kill > the compressed tar automatically and then fetch all single-archive > indexes from the (remote?) repo. > > Make sure you have lots of disk space free in .cache/borg: > > cd ~/.cache/borg/REPOID > mkdir chunks.archive.d > cd chunks.archive.d > tar xJvf ../chunks.archive > # if you have a older python, it might be also xjvf or xzvf. > > # at the end, check that permissions/mode are as you see > # for the other files in .cache/borg: > cd .. > chown -R borg.borg chunks.archive.d > chmod -R go-rwX chunks.archive.d > > # after successfully extracting the chunks.archive. 
remove it
> # 0.26 does not use it any more (and would also remove it the
> # first time a cache resync happens):
> rm chunks.archive
>
> In case something does not work, you can still kill chunks.archive.d and
> let borg do it (slowly).
>
> Cheers,
>
> Thomas
>
> --
>
> GPG ID: FAF7B393
> GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393
>

From ag at iss-integration.com  Fri Sep 11 18:27:35 2015
From: ag at iss-integration.com (Alex Gorbachev)
Date: Fri, 11 Sep 2015 18:27:35 -0400
Subject: [borgbackup] Borg speed tuning on large files
References: <55DC2790.3000005@waldmann-edv.de>
	<55DCEFF1.1050408@waldmann-edv.de>
	<55E1B248.1060409@waldmann-edv.de>
	<55E598FC.3030306@waldmann-edv.de>
Message-ID: 

Here is the latest round of benchmarks - lz4 is definitely a lot faster:

Tool            Parameters         Data size (apparent)  Repo size   Start               End                 Hrs   Ratio  C Rat  MB/s
Borg First Run  lz4,19,23,20,4095  2256910752            782964864   8/31/2015 13:51:00  9/1/2015 7:12:00    17.4  35%    2.88   35
Borg Next Run   lz4,19,23,20,4095  2238757296            436979584   9/1/2015 7:20:00    9/1/2015 21:58:00   14.6  20%    5.12   42
Borg First Run  lz4,17,22,19,4095  2265618240            791271040   9/2/2015 7:15:00    9/2/2015 23:37:00   16.4  35%    2.86   38
Borg Next Run   lz4,17,22,19,4095  2284494216            396072576   9/3/2015 15:54:00   9/4/2015 6:58:00    15.1  17%    5.77   41
Borg Next Run   lz4,17,22,19,4095  2320084880            374725248   9/4/2015 13:20:00   9/5/2015 4:08:00    14.8  16%    6.19   43
Borg First Run  lz4,19,23,21,4095  2358155568            811703168   9/5/2015 10:21:00   9/6/2015 3:13:00    16.9  34%    2.91   38
Borg Next Run   lz4,19,23,21,4095  2345623288            387065856   9/6/2015 20:20:00   9/7/2015 11:52:00   15.5  17%    6.06   41

On Tue, Sep 1, 2015 at 8:24 AM, Thomas Waldmann wrote:

> > Indeed, with lz4 the compression speed is the fastest it has been. On
> > the first run, data is 35% of original size (same 2TB volume) and it
> > took 17 hours to compress vs. the previous 33 with LZMA.
>
> Ah, that's impressive.
>
> lz4 seems to like your data (usually it doesn't compress to 35%). :D
>
> A zlib,1 comparison value would have been nice here (as that is the
> fastest compression zlib can do [but usually slower than lz4]).
> lzma is known to be rather slow (but high compression).
>
> > Computed speed is 12 MB/s vs. the previous 5 MB/s, and we are not at
> > all disk bound (we do 100+ MB/s network transfers from it).
>
> If I divide 2TB original data by 17h backup time, I get 32MB/s.
> Your data rate is based on the compressed data.
>
> --
>
> GPG ID: FAF7B393
> GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From ag at iss-integration.com Fri Sep 11 18:57:35 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Fri, 11 Sep 2015 18:57:35 -0400 Subject: borg list hangs during backup In-Reply-To: References: Message-ID: Hello, while running borg create (takes 16+ hours in our case), borg list hangs. Is this expected behavior and can it be changed? Also this leads to another question about data integrity running borg extract while borg create is also running? This seems related to https://github.com/jborg/attic/issues/110 Thank you, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From ag at iss-integration.com Fri Sep 11 18:59:59 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Fri, 11 Sep 2015 18:59:59 -0400 Subject: borg list hangs during backup References: Message-ID: ...and a practical question from below limitation: if, for example, a recovery is needed while backup is running, can one snapshot/clone the filesystem (i.e. zfs snapshot) and somehow remove the exclusive lock? The original repo does not suffer, since it will not be modified. On Fri, Sep 11, 2015 at 6:57 PM, Alex Gorbachev wrote: > Hello, while running borg create (takes 16+ hours in our case), borg list > hangs. Is this expected behavior and can it be changed? > > Also this leads to another question about data integrity running borg > extract while borg create is also running? > > This seems related to https://github.com/jborg/attic/issues/110 > > Thank you, > Alex > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mh+borgbackup at zugschlus.de Mon Jul 27 11:16:22 2015 From: mh+borgbackup at zugschlus.de (Marc Haber) Date: Mon, 27 Jul 2015 17:16:22 +0200 Subject: deduplication for backup of largely identical systems In-Reply-To: <20150727151622.GZ7338@torres.zugschlus.de> References: <20150727151622.GZ7338@torres.zugschlus.de> Message-ID: <20150727151622.GZ7338@torres.zugschlus.de> Hi, most of my machines are running Debian stable. It is therefore likely that parts of the file system such as /usr will be largely identical over most of my system. To take advantage of borg's deduplication feature in this scale, it sounds enticing to have all backups run into the same repository (borg init ssh://backuphost//repository once and borg create ssh:/backuphost//repository::localhost-date / for the actual backups). Is that a recommended procedure? I guess that risk of repository loss is higher that way, only one backup can write to the repository at a single time, and all machines would be able to read each other's backups since it's the same repository with the same key. Are there any other implications that I need to be aware of before engaging in borg deduplicating backup in scale? Greetings Marc -- ----------------------------------------------------------------------------- Marc Haber | "I don't trust Computers. They | Mailadresse im Header Leimen, Germany | lose things." Winona Ryder | Fon: *49 6224 1600402 Nordisch by Nature | How to make an American Quilt | Fax: *49 6224 1600421 From tw at waldmann-edv.de Mon Jul 27 12:02:31 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Mon, 27 Jul 2015 18:02:31 +0200 Subject: [borgbackup] deduplication for backup of largely identical systems References: <20150727151622.GZ7338@torres.zugschlus.de> <20150727151622.GZ7338@torres.zugschlus.de> Message-ID: <55B65617.2090307@waldmann-edv.de> Hi Marc, > most of my machines are running Debian stable. 
It is therefore likely > that parts of the file system such as /usr will be largely identical > over most of my system. Yeah (plus some other parts, too). > To take advantage of borg's deduplication feature in this scale, it > sounds enticing to have all backups run into the same repository (borg > init ssh://backuphost//repository once and borg create > ssh:/backuphost//repository::localhost-date / for the actual backups). > > Is that a recommended procedure? You can backup multiple machines to same repo, but: - be careful with prune (use prefix option, use dry-run) - there is an exclusive write lock, so they will run sequential, not in parallel - the local cache on each machine will need a resync each time another machine has updated the repo to bring it in sync again with the repo state > I guess that risk of repository loss > is higher that way, only one backup can write to the repository at a > single time, and all machines would be able to read each other's > backups since it's the same repository with the same key. Exactly. > Are there any other implications that I need to be aware of before > engaging in borg deduplicating backup in scale? Only what we have above. (AFAIK) ^^ Cheers, Thomas From ag at iss-integration.com Tue Aug 25 11:43:54 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Tue, 25 Aug 2015 11:43:54 -0400 Subject: [borgbackup] Borg speed tuning on large files References: <55DC2790.3000005@waldmann-edv.de> Message-ID: On Tue, Aug 25, 2015 at 4:30 AM, Thomas Waldmann wrote: > On 08/25/2015 03:35 AM, Alex Gorbachev wrote: > > Hello, new challenge on performance - running a machine with two 4 > > core 2 GHz CPUs, 32 GB RAM and pretty fast disks. Trying to run a > > dedup backup of 14 files, 20-150 GB in size with a total of about 2TB. > > > > When borg runs I see IO rates via iostat that are far below the > > storage subsystem capabilities. top shows 97-99% load for borg and > > around 100 MB RSS. > > Do you use compression or encryption? > No encryption. Tested with compression and without - observed the behavior you described below, but the volume of data pretty much requires compression to properly function. I tried compression levels of 0 and 3. > > I am wondering a bit that you get so close to 100% as due to the > single-threaded and relatively simple way of internal working of the > current release, one usually only gets that close when using high > compression (and in that case, the high compression can be a bottleneck) > or no cpu-accelerated encryption (or both). > > If you use encryption and openssl can't use AES-NI (because the cpu does > not support it or the drivers are not loaded), that can also slow down > things. > > So, for first speed tests, I'ld recommend no or fast compression and no > encryption. With 0.24 release that is --compression 0 (default) or > --compression 1. > > Next release will add super fast lz4 compression (which I think you will > like if your I/O system is rather fast), I hope I can release it in a > few days. > Oh can't wait - this is what ZFS uses and ours is plenty fast. I also realized that we are going from a compressed ZFS to its snapshot and then to uncompressed target destination with borg (as borg already compresses), so there is overhead on ZFS decompression...but that should run on another core. > > Keep an eye on borg's cpu load and get it to a bit lower value (maybe > 50-80%), > so it is in the sweet spot in the middle of being I/O bound and being > CPU bound. 
> I turned of hyperthreading and enabled aggressive CPU power mode, running a test, which will take about 25+ hours on the 2TB > > Also, as I've already said, I am sorry that 0.24 had broken > --chunker-params parameter parsing, so do not use that right now, this > will also be fixed in next release asap. > Thanks, I compiled right away from the git tree and no problems there, thank you for the fast response > > > I am assuming the bottleneck is the CPU as borg is single threaded. > > Is there anything we could do to speed the process up though - more > > RAM caching somehow? > > I am working on a multithreaded implementation (which is not trivial and > not expected to be finished soon) which can use the CPU cores and I/O > capabilities of a system much better. > Completely understood, and I am assuming starting multiple borg processes in parallel for each file is not a good idea? > > If you want to play with it, it is in multithreading branch of the repo, > but do NOT use that for real backups, it has still failing tests and > also the crypto might be unsecure there. > > I've seen > 300% CPU load with that code on a dual-core cpu with > hyperthreading and also the wallclock runtime was better than with > single-threaded code (but not 3x better, there is also some overhead). > > -- > > GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 > Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tw at waldmann-edv.de Tue Aug 25 04:30:08 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Tue, 25 Aug 2015 10:30:08 +0200 Subject: [borgbackup] Borg speed tuning on large files References: Message-ID: <55DC2790.3000005@waldmann-edv.de> On 08/25/2015 03:35 AM, Alex Gorbachev wrote: > Hello, new challenge on performance - running a machine with two 4 > core 2 GHz CPUs, 32 GB RAM and pretty fast disks. Trying to run a > dedup backup of 14 files, 20-150 GB in size with a total of about 2TB. > > When borg runs I see IO rates via iostat that are far below the > storage subsystem capabilities. top shows 97-99% load for borg and > around 100 MB RSS. Do you use compression or encryption? I am wondering a bit that you get so close to 100% as due to the single-threaded and relatively simple way of internal working of the current release, one usually only gets that close when using high compression (and in that case, the high compression can be a bottleneck) or no cpu-accelerated encryption (or both). If you use encryption and openssl can't use AES-NI (because the cpu does not support it or the drivers are not loaded), that can also slow down things. So, for first speed tests, I'ld recommend no or fast compression and no encryption. With 0.24 release that is --compression 0 (default) or --compression 1. Next release will add super fast lz4 compression (which I think you will like if your I/O system is rather fast), I hope I can release it in a few days. Keep an eye on borg's cpu load and get it to a bit lower value (maybe 50-80%), so it is in the sweet spot in the middle of being I/O bound and being CPU bound. Also, as I've already said, I am sorry that 0.24 had broken --chunker-params parameter parsing, so do not use that right now, this will also be fixed in next release asap. > I am assuming the bottleneck is the CPU as borg is single threaded. > Is there anything we could do to speed the process up though - more > RAM caching somehow? 
I am working on a multithreaded implementation (which is not trivial and not expected to be finished soon) which can use the CPU cores and I/O capabilities of a system much better. If you want to play with it, it is in multithreading branch of the repo, but do NOT use that for real backups, it has still failing tests and also the crypto might be unsecure there. I've seen > 300% CPU load with that code on a dual-core cpu with hyperthreading and also the wallclock runtime was better than with single-threaded code (but not 3x better, there is also some overhead). -- GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt. From tw at waldmann-edv.de Tue Aug 25 18:45:05 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 26 Aug 2015 00:45:05 +0200 Subject: [borgbackup] Borg speed tuning on large files References: <55DC2790.3000005@waldmann-edv.de> Message-ID: <55DCEFF1.1050408@waldmann-edv.de> > No encryption. Tested with compression and without - observed the > behavior you described below, but the volume of data pretty much > requires compression to properly function. I tried compression levels > of 0 and 3. Maybe try 1 until lz4 is available. That's relatively fast and still compresses. > Completely understood, and I am assuming starting multiple borg > processes in parallel for each file is not a good idea? You can run up to N borg in parallel (if N is your cpu core count), but only if the target repo of each is a different one, otherwise they will block each other. -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From tw at waldmann-edv.de Sun Aug 9 16:20:43 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sun, 09 Aug 2015 22:20:43 +0200 Subject: borgbackup 0.24.0 released In-Reply-To: <55C7B61B.6000807@waldmann-edv.de> References: <55C7B61B.6000807@waldmann-edv.de> Message-ID: <55C7B61B.6000807@waldmann-edv.de> Hi, just wanted to tell that there is a fresh release with a lot of improvements and fixes: https://github.com/borgbackup/borg/blob/0.24.0/CHANGES.rst what's new https://pypi.python.org/pypi/borgbackup/0.24.0 pip package https://github.com/borgbackup/borg/issues/147 linux binary wheels All releases are signed by me (like this message), please check the signature. Cheers, Thomas -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From tw at waldmann-edv.de Sun Aug 9 17:00:03 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sun, 09 Aug 2015 23:00:03 +0200 Subject: borgbackup 0.24.0 released In-Reply-To: <55C7BF53.9040500@waldmann-edv.de> References: <55C7BF53.9040500@waldmann-edv.de> Message-ID: <55C7BF53.9040500@waldmann-edv.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Obviously the mailing list software changes the mail content in some ways so the signature is not valid any more. So, retrying to send a signed message, this time using inline pgp, not pgp/mime: Hi, just wanted to tell that there is a fresh release with a lot of improvements and fixes: https://github.com/borgbackup/borg/blob/0.24.0/CHANGES.rst what's new https://pypi.python.org/pypi/borgbackup/0.24.0 pip package https://github.com/borgbackup/borg/issues/147 linux binary wheels All releases are signed by me (like this message), please check the signature. 
Cheers, Thomas - -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBCgAGBQJVx79TAAoJECQ6z6lR944BoSEP/Rgi60gI9uKp8RP85Vg7g5uC xaVnS9JdInP7f82AxtCwl1S+KoqezGeCUNMVhz9KvAzmIDDQl+Btv9Mh06TU1ZUx 2kKosWBIgAVixAIEx1Z+20d5rnEne7DjgUJWEAiMozSYM2I52m+1iFqWrRLSWGIG dLaOYfn5Bt17Wb9cW2eNHT64CKJZXniTa66jZNs1liGl60laKRL7a1bwt1R8z3Mx 0Q2BYlo/jr+COp4J4M/wrPdCvwrVfl0oZbAEC1WpaQgPD2+f+Jbzo8GfHPDSUxfz R6je+Cxx0GHoTNRSooNUFHZyBGHIku4rFbiyrwE7ZM9qTHvthgOfYiA6RCmT7mOm jfT0FeYwPPPHxT300leRFRv8d4MDBP2OxJclxmb2cZBRnDGAJwVw23oDZk6QaRRs YSWlLqER7N1me4hBodkdv8DILB6ipJhb7Ke7B1xfP1zr9MwjGUgyLPpOp+aU4WpX roQTLpBmd/KsHotn8USRLRq7qmF8xDTm/mOv6ngYAYYB4OZqsvuTaHI+RkIBpioi mOEilOQu9R6a/Z/pr+wCxN1al2kP5ixJwm6hmCJMZ9drysM1V3cZer2C13bGga+z SwwRY65hxttLuSa2D6LQ6z0aTL34QdRaWk9OZM8F9JMC2gHK9pX5ht4EVSqt9Zf/ gmHj8X+xk1MBqpSbGpEh =RwWE -----END PGP SIGNATURE----- From ag at iss-integration.com Mon Aug 24 21:35:55 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Mon, 24 Aug 2015 21:35:55 -0400 Subject: Borg speed tuning on large files In-Reply-To: References: Message-ID: Hello, new challenge on performance - running a machine with two 4 core 2 GHz CPUs, 32 GB RAM and pretty fast disks. Trying to run a dedup backup of 14 files, 20-150 GB in size with a total of about 2TB. When borg runs I see IO rates via iostat that are far below the storage subsystem capabilities. top shows 97-99% load for borg and around 100 MB RSS. I am assuming the bottleneck is the CPU as borg is single threaded. Is there anything we could do to speed the process up though - more RAM caching somehow? Thank you, Alex From tw at waldmann-edv.de Mon Aug 24 05:45:28 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Mon, 24 Aug 2015 11:45:28 +0200 Subject: [borgbackup] Chunker params for very large files References: <55D70FFC.2020100@waldmann-edv.de> Message-ID: <55DAE7B8.5070301@waldmann-edv.de> A short note for all people who want to play with --chunker-params: currently you need to use git master branch for that, 0.24 release does not work: https://github.com/borgbackup/borg/issues/154 Cheers, Thomas -- GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt. From ag at iss-integration.com Fri Aug 28 18:36:50 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Fri, 28 Aug 2015 18:36:50 -0400 Subject: [borgbackup] Borg speed tuning on large files References: <55DC2790.3000005@waldmann-edv.de> <55DCEFF1.1050408@waldmann-edv.de> Message-ID: Hi Thomas, Here are my test results for a large area with ~2TB database backups per day: Tool Parameters Data size (apparent) Repo size Hrs Ratio C Rat C MB/s gzip c3 2308843696 560376600 22 24% 4.1 7 Attic First Run default 2251760621 531964928 48 24% 4.2 3 Attic Next Run default 2308843696 234398336 32 10% 9.9 2 Borg First Run C0,19,23,21,4095 2330579192 2354907008 26 101% 1 25 Borg Next Run C0,19,23,21,4095 2270686256 1341393408 18 59% 1.7 21 Borg First Run C3,19,23,21,4095 2270686256 568351360 33 25% 4 5 Borg Next Run C3,19,23,21,4095 2268472600 302165632 23 13% 7.5 4 Borg Next Run C1,19,23,21,4095 2247244128 422037120 24 19% 5.3 5 Here is a picture in case the text does not come through well: [image: Inline image 1] Oddly, compression setting of 1 took longer than C3. C0 shows the actual dedup capability of this data. My business goal here is to get the data in within a day, so about 12 hours or so. 
Best regards, Alex On Tue, Aug 25, 2015 at 6:45 PM, Thomas Waldmann wrote: > > No encryption. Tested with compression and without - observed the > > behavior you described below, but the volume of data pretty much > > requires compression to properly function. I tried compression levels > > of 0 and 3. > > Maybe try 1 until lz4 is available. That's relatively fast and still > compresses. > > > Completely understood, and I am assuming starting multiple borg > > processes in parallel for each file is not a good idea? > > You can run up to N borg in parallel (if N is your cpu core count), but > only if the target repo of each is a different one, otherwise they will > block each other. > > > > -- > > GPG ID: FAF7B393 > GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 16748 bytes Desc: not available URL: From tw at waldmann-edv.de Sat Aug 29 09:50:19 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sat, 29 Aug 2015 15:50:19 +0200 Subject: borgbackup 0.25.0 released In-Reply-To: <55E1B89B.4050600@waldmann-edv.de> References: <55E1B89B.4050600@waldmann-edv.de> Message-ID: <55E1B89B.4050600@waldmann-edv.de> Hi, just wanted to tell that there is a fresh release with a lot of improvements and fixes: https://github.com/borgbackup/borg/blob/0.25.0/CHANGES.rst what's new https://pypi.python.org/pypi/borgbackup/0.25.0 pip package https://github.com/borgbackup/borg/issues/147 binary wheels (soon) All releases are signed by me (like this message), please check the signature. Cheers, Thomas -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From tw at waldmann-edv.de Sat Aug 29 09:23:20 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sat, 29 Aug 2015 15:23:20 +0200 Subject: [borgbackup] Borg speed tuning on large files References: <55DC2790.3000005@waldmann-edv.de> <55DCEFF1.1050408@waldmann-edv.de> Message-ID: <55E1B248.1060409@waldmann-edv.de> > Tool Parameters Data size (apparent) Repo size Hrs Ratio C Rat C > MB/s > gzip c3 2308843696 560376600 22 24% 4.1 7 > Attic First Run default 2251760621 531964928 48 24% 4.2 3 > Attic Next Run default 2308843696 234398336 32 10% 9.9 2 > Borg First Run C0,19,23,21,4095 2330579192 2354907008 26 101% 1 25 > Borg Next Run C0,19,23,21,4095 2270686256 1341393408 18 59% 1.7 21 > Borg First Run C3,19,23,21,4095 2270686256 568351360 33 25% 4 5 > Borg Next Run C3,19,23,21,4095 2268472600 302165632 23 13% 7.5 4 > Borg Next Run C1,19,23,21,4095 2247244128 422037120 24 19% 5.3 5 Nice to see confirmation that we are quite faster than Attic. :) Hmm, should the last line read "Borg First Run ... C1"? In general, to evaluate the speed, it might be easier to only do "first runs", because there always some specific amount of data (== all input data) gets processed. In "next run", the amount of data actually needing processing might vary widely, depending on how much change there is between first and next run. BTW, note for other readers: the "Parameters" column can't be given that way to borg, it needs to be (e.g.): borg create -C1 --chunker-params 19,23,21,4095 repo::archive data Or in 0.25: borg create -C zlib,1 --chunker-params .... 
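Relatedly, a rough way to see what a given --chunker-params choice
implies for resource usage, sketched with the ~2TB data set from this
thread (the average chunk size of about 2^HASH_MASK_BITS bytes is the
statistical expectation, not a guarantee):

awk 'BEGIN {
    data_bytes = 2 * 2^40     # ~2 TB of input data
    mask_bits  = 21           # HASH_MASK_BITS from 19,23,21,4095
    avg_chunk  = 2^mask_bits  # expected average chunk size: 2 MiB
    printf "expected chunk count: ~%d\n", data_bytes / avg_chunk  # ~1 million
}'

Feed the resulting chunk count into the resource formula in the docs to
estimate the index file sizes and RAM needs.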
> Here is a picture in case the text does not come through well: Yeah, that looked better. :) BTW, what you currently have in the C MB/s column is how many compressed MB/s it actually writes to storage (and if that is a limiting factor, it would be your target storage, not borg). Maybe more interesting would be how much uncompressed data it can process per second. > Oddly, compression setting of 1 took longer than C3. Either there is a mistake in your table or your cpu is so fast that higher compression saves more time by avoiding I/O than it needs for the better compression. With 0.25.0 you could try: - lz4 = superfast, but low compression - lzma = slow/expensive, but high compression - none - no compression, no overhead (this is not zlib,0 any more) > C0 shows the actual dedup capability of this data. Doesn't seem to find significant amounts of "internal" duplication within a "first run". Historical dedup seems to work and help, though. Does that match your expectations considering the contents of your files? In case you measure again, keep an eye on CPU load. > My business goal here is to get > the data in within a day, so about 12 hours or so. If you can partition your data set somehow into N pieces and use N separate repos, you could save some time by running N borgs in parallel (assuming your I/O isn't a bottleneck then). N ~= core count of your CPU At some time in the future, borg might be able to a similar thing by internal multithreading, but that is not ready for production yet. There are also some other optimizations possible in the code (using different hashes, different crypto modes, ...) - we'll try making it much faster. -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From ag at iss-integration.com Sun Aug 23 22:26:01 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Sun, 23 Aug 2015 22:26:01 -0400 Subject: [borgbackup] Chunker params for very large files References: <55D70FFC.2020100@waldmann-edv.de> Message-ID: Hi Thomas, On Fri, Aug 21, 2015 at 7:48 AM, Thomas Waldmann wrote: > If you have enough space and you rather care for good speed, little > management overhead (but not so much about deduplicating with very fine > grained blocks), use a higher value for HASH_MASK_BITS, like 20 or 21, > so it creates larger chunks in the statistical medium. It sounds like > this matches your case. > > If you care for very fine grained deduplication and you maybe don't have > that much data and you can live with the management overhead, use a > small chunksize (small HASH_MASK_BITS, like the default 16). > >> An existing recommendation of 19,23,21,4095 for huge files from >> https://borgbackup.github.io/borgbackup/usage.html appears to >> translate into: >> >> minimum chunk of 512 KiB >> maximum chunk of 8 MiB >> medium chunk of 2 MiB >> >> In a 100GB file we are looking at 51200 chunks. > > You need to take the total amount of your data (~2TB) and compute the > chunk count (1.000.000). Then use the resource formula from the docs and > compute the sizes of the index files (and RAM needs). > > In your case this looks quite reasonable, you could also use 1MB chunks, > but better don't use 64KB chunks. Thank you for the clarification. Is the HASH_WINDOW_SIZE tunable in any way or useful to change? Best regards, Alex > >> beneficial to raise these further? The machine I have doing this has >> plenty of RAM (32 GB) and 8 CPU cores at 2.3 GHz, so RAM/compute is >> not a problem. > > Right. 
But if your index is rather big, it'll need to copy around a lot > of data (for transactions, for resyncing the cache in case you back up > multiple machines to the same repo). > > > Cheers, Thomas > > ---- > > GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 > Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt. From anarcat at debian.org Tue Aug 18 16:46:06 2015 From: anarcat at debian.org (=?utf-8?q?Antoine_Beaupr=C3=A9?=) Date: Tue, 18 Aug 2015 16:46:06 -0400 Subject: registered to gmane In-Reply-To: <87y4h8l3zl.fsf@marcos.anarc.at> References: <87y4h8l3zl.fsf@marcos.anarc.at> Message-ID: <87y4h8l3zl.fsf@marcos.anarc.at> Hi, I have registered this mailing list to the gmane archive. It should start showing up in here: http://dir.gmane.org/gmane.comp.sysutils.backup.borgbackup.general and in your favorite news reader, any time soon. A. -- Sous un gouvernement qui emprisonne injustement, la place de l'homme juste est aussi en prison. - La désobéissance civile, Henry David Thoreau From tw at waldmann-edv.de Wed Aug 26 13:14:52 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 26 Aug 2015 19:14:52 +0200 Subject: [borgbackup] Questions about hardening borg repositories References: <1440603903.28b5a6@strabo.loghyr> <1440603903.28b5a6@strabo.loghyr> Message-ID: <55DDF40C.4090400@waldmann-edv.de> > However, borg's deduplication means that corruption within the > repository would affect all of the archives. Could, yes (if the archive refers to the corrupt chunk). > My plan is to back up to a local disk, then do some scripting to create > par2 parity files (Reed Solomon coding, > https://github.com/Parchive/par2cmdline) for each segment, then sync the > repository and parity files to Amazon S3, providing protection against > individual file corruption and also failure of the local disk. In the FAQ I argued against adding redundancy in borg (see there). > Questions: > - Is it safe to create files with non-numeric parts in the segment > directory (eg "532.par2" and "532.volxxx+nnn.par2" for segment file > 532)? par2cmdline has a heritage as a 'post binaries to Usenet' > utility, so it wants to operate on files in the same directory as the > parity files. Putting stuff into the same directory is a bit unclean of course. Currently, the segment iterator only works on purely numerical directories and files, so I'd guess it doesn't cause an issue now. > - Do segment files ever change content between being first written and > being removed? "create" won't touch full segment files again, just create new ones. I am not totally sure about the very last ("not full" segment file), maybe observe that yourself and tell us. "check" might delete and add segments when repairing a repo/archive. "delete" will delete and add segments. > - Is there any value in generating parity for or syncing the index.%d > and hints.%d files? It looks like they are rewritten on most operations > and can be trivially regenerated by borg check. AFAIK "no". > - Are there any plans to add par2 support to borg? See FAQ. If you can bring up good arguments for it that do not lead to "false promises" and do not require information we do not have, I may reconsider it.
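For anyone experimenting with the per-segment parity idea, a minimal sketch with par2cmdline might look like this (the repo path, segment number and 10% redundancy are made-up example values):

  # create parity data for one segment file:
  par2 create -r10 /vault/repo1/data/0/532.par2 /vault/repo1/data/0/532
  # later, check it:
  par2 verify /vault/repo1/data/0/532.par2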
Cheers, Thomas -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From ed at edgewood.to Wed Aug 26 12:16:29 2015 From: ed at edgewood.to (Ed Blackman) Date: Wed, 26 Aug 2015 12:16:29 -0400 Subject: Questions about hardening borg repositories In-Reply-To: <1440603903.28b5a6@strabo.loghyr> References: <1440603903.28b5a6@strabo.loghyr> Message-ID: <1440603903.28b5a6@strabo.loghyr> I'm evaluating borg as a possible replacement for my current backup system that uses duplicity. Generally, I back up system files for about 90 days, and user files (with subdir-based exclusions) effectively forever. However, borg's deduplication means that corruption within the repository would affect all of the archives. My plan is to back up to a local disk, then do some scripting to create par2 parity files (Reed Solomon coding, https://github.com/Parchive/par2cmdline) for each segment, then sync the repository and parity files to Amazon S3, providing protection against individual file corruption and also failure of the local disk.

Questions:
- Is it safe to create files with non-numeric parts in the segment directory (eg "532.par2" and "532.volxxx+nnn.par2" for segment file 532)? par2cmdline has a heritage as a 'post binaries to Usenet' utility, so it wants to operate on files in the same directory as the parity files. borg check --repair creates %d.beforerecover, so I think the answer is yes except for a negligible chance that borg will want to use those extensions.
- Do segment files ever change content between being first written and being removed?
- Is there any value in generating parity for or syncing the index.%d and hints.%d files? It looks like they are rewritten on most operations and can be trivially regenerated by borg check.
- Are there any plans to add par2 support to borg? Duplicity offers a par2 backend wrapper that creates parity files when duplicity creates files, removes the parity files when duplicity removes the corresponding core duplicity files, etc. I didn't see a Github issue for it, but maybe someone has thoughts along that line.

-- Ed Blackman From ed at edgewood.to Wed Aug 26 17:35:01 2015 From: ed at edgewood.to (Ed Blackman) Date: Wed, 26 Aug 2015 17:35:01 -0400 Subject: [borgbackup] Questions about hardening borg repositories References: <1440603903.28b5a6@strabo.loghyr> <1440603903.28b5a6@strabo.loghyr> <55DDF40C.4090400@waldmann-edv.de> Message-ID: <1440619112.016aa4@strabo.loghyr> On Wed, Aug 26, 2015 at 07:14:52PM +0200, Thomas Waldmann wrote: >> My plan is to back up to a local disk, then do some scripting to create >> par2 parity files (Reed Solomon coding, >> https://github.com/Parchive/par2cmdline) for each segment, then sync the >> repository and parity files to Amazon S3, providing protection against >> individual file corruption and also failure of the local disk. > >In the FAQ I argued against adding redundancy in borg (see there). I read the FAQ, but missed that or forgot it was there. I understand your reasoning even if I wish it were otherwise. >> Questions: >> - Is it safe to create files with non-numeric parts in the segment >> directory (eg "532.par2" and "532.volxxx+nnn.par2" for segment file >> 532)? par2cmdline has a heritage as a 'post binaries to Usenet' >> utility, so it wants to operate on files in the same directory as the >> parity files. > >Putting stuff into the same directory is a bit unclean of course.
>Currently, the segment iterator only works on purely numerical >directories and files, so I'd guess it doesn't cause an issue now. Yeah, I'd prefer that the parity files live on a separate disk too, not just for cleanliness but also for safety. There might be some way to do it with par2cmdline that I haven't figured out. In my experiments I could create the par2 files separate from the data, and with difficulty verify the par2 files separate from the data, but attempting to repair led to the repaired file being created in the CWD, not where the file was. >> - Do segment files ever change content between being first written and >> being removed? > >"create" won't touch full segment files again, just create new ones. >I am not totally sure about the very last ("not full" segment file), >maybe observe that yourself and tell us. Will do and report back. >"check" might delete and add segments when repairing a repo/archive. > >"delete" will delete and add segments. But not change? That is, once data/0/532 is full, can it ever be changed or deleted and later recreated, or will it always have the same content until it's deleted? >If you can bring up good arguments for it that do not lead to "false >promises" and do not require information we do not have, I may >reconsider it. My understanding of your objection is that if a user has sectors go bad in the disk holding the repository, there's no way to prevent the bad sectors from also corrupting the parity blocks. Well, if the parity blocks could be kept in a different directory (set up at init time?), and that directory was mounted on a different disk, then the fact that sectors go bad in the repository wouldn't affect the parity blocks. Alternately, are there plans to implement pluggable backends? Duplicity provides an easy interface for adding different backends, leading to a great number of them. See http://bazaar.launchpad.net/~duplicity-team/duplicity/0.7-series/files/head:/duplicity/backends/ starting with the README. If borg separated the 'create segment file' from the 'store segment file' logic, with the latter in a backend, I could steal logic from duplicity's par2 meta-backend to do it myself, but with considerably higher reliability. -- Ed Blackman From mh+borgbackup at zugschlus.de Mon Aug 31 06:45:54 2015 From: mh+borgbackup at zugschlus.de (Marc Haber) Date: Mon, 31 Aug 2015 12:45:54 +0200 Subject: [borgbackup] borgbackup 0.25.0 released References: <55E1B89B.4050600@waldmann-edv.de> <55E1B89B.4050600@waldmann-edv.de> <20150830203758.GY5060@torres.zugschlus.de> <55E42C1C.1090308@waldmann-edv.de> Message-ID: <20150831104554.GZ5060@torres.zugschlus.de> On Mon, Aug 31, 2015 at 12:27:40PM +0200, Thomas Waldmann wrote: > > thanks for keeping the development running. A few questions about > > compression: Is there any reason why xz compression is not (yet) > > supported? > > It is, see "lzma". At least on Linux, lzma is different from xz. > > Is the compression done on the client or on the server? > > Client-side (must be first compressed, then encrypted, then transmitted). Sounds fair enough. When thinking about it, the server only sees the encrypted data stream and thus cannot compress any more. Stupid me. > > And while I'm asking, are there plans to add a connection scheme that > > allows the TCP connection to go from the server (the machine holding > > the actual backup) to the client (the machine being backed up)? There > > are places with a security policy that says "no connections to the > > backup server".
One possible solution would be a "ssh > > -R10022:localhost:22 client borg create foo" with the repository being > > on "localhost:10022" so that the connection from the client to the > > server is tunneled through the outgoing ssh session from the server. > > Well, sounds interesting. > > Doesn't help 100% against the "hacked production server" issue (see that > ticket in the issue tracker), though, as at specific times, the client > will be able to connect to the server (through localhost:10022) and do > whatever it wants, right? Yes, but there are some things that need to be done. ;-) I trust ssh enough so that a borgbackup account that has its authorized_keys restricted to "borg serve --restrict-to-path" is secure enough. It's just that many security/firewall people will open a can of worms if connections go the wrong direction. Greetings Marc -- ----------------------------------------------------------------------------- Marc Haber | "I don't trust Computers. They | Mailadresse im Header Leimen, Germany | lose things." Winona Ryder | Fon: +49 6224 1600402 Nordisch by Nature | How to make an American Quilt | Fax: +49 6224 1600421 From tw at waldmann-edv.de Mon Aug 31 06:27:40 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Mon, 31 Aug 2015 12:27:40 +0200 Subject: [borgbackup] borgbackup 0.25.0 released References: <55E1B89B.4050600@waldmann-edv.de> <55E1B89B.4050600@waldmann-edv.de> <20150830203758.GY5060@torres.zugschlus.de> Message-ID: <55E42C1C.1090308@waldmann-edv.de> Moin Marc, > thanks for keeping the development running. A few questions about > compression: Is there any reason why xz compression is not (yet) > supported? It is, see "lzma". > Is the compression done on the client or on the server? Client-side (must be first compressed, then encrypted, then transmitted). > And while I'm asking, are there plans to add a connection scheme that > allows the TCP connection to go from the server (the machine holding > the actual backup) to the client (the machine being backed up)? There > are places with a security policy that says "no connections to the > backup server". One possible solution would be a "ssh > -R10022:localhost:22 client borg create foo" with the repository being > on "localhost:10022" so that the connection from the client to the > server is tunneled through the outgoing ssh session from the server. Well, sounds interesting. Doesn't help 100% against the "hacked production server" issue (see that ticket in the issue tracker), though, as at specific times, the client will be able to connect to the server (through localhost:10022) and do whatever it wants, right? -- GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt. From tw at waldmann-edv.de Mon Aug 31 09:22:05 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Mon, 31 Aug 2015 15:22:05 +0200 Subject: [borgbackup] borgbackup 0.25.0 released References: <55E1B89B.4050600@waldmann-edv.de> <55E1B89B.4050600@waldmann-edv.de> <20150830203758.GY5060@torres.zugschlus.de> <55E42C1C.1090308@waldmann-edv.de> <20150831104554.GZ5060@torres.zugschlus.de> Message-ID: <55E454FD.3000504@waldmann-edv.de> >>> Is there any reason why xz compression is not (yet) supported? >> It is, see "lzma". > At least on Linux, lzma is different from xz. https://docs.python.org/3/library/lzma.html Search for XZ on that page. The default format is what we use.
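For illustration, with 0.25 the format and level go into the -C spec; a sketch (the level, repo path and archive name are invented examples):

  borg create -C lzma,6 /vault/repo1::vault1-2015-08-31 /vault/data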
-- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From tw at waldmann-edv.de Mon Aug 31 05:53:13 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Mon, 31 Aug 2015 11:53:13 +0200 Subject: [borgbackup] borgbackup 0.25.0 released References: <55E1B89B.4050600@waldmann-edv.de> <55E1B89B.4050600@waldmann-edv.de> Message-ID: <55E42409.8010504@waldmann-edv.de> > Thank you for adding lz4! You're welcome. :) > Question also on create using lz4, the new output is like this:
>
> A /vjob/snap_vtst1/random.bin
> U /vjob/snap_vtst1/dupel.bck
That's unrelated to lz4, it is just the verbose output (-v). A is added (new file), M is modified, U is unchanged regular file. The lowercase letters are not regular files, but other stuff. Guess this needs to be documented. :) -- GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt. From ag at iss-integration.com Fri Aug 21 02:07:35 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Fri, 21 Aug 2015 02:07:35 -0400 Subject: Chunker params for very large files In-Reply-To: References: Message-ID: Hello, what would be a good chunker setting for a handful (15) files with sizes from 20 GB to 150 GB to a total of 2.3 TB per day? These are database backups that cannot be made incremental. Reading https://borgbackup.github.io/borgbackup/internals.html#chunks

CHUNK_MIN_EXP = 10 (minimum chunk size = 2^10 B = 1 kiB)
CHUNK_MAX_EXP = 23 (maximum chunk size = 2^23 B = 8 MiB)
HASH_MASK_BITS = 16 (statistical medium chunk size ~= 2^16 B = 64 kiB)
HASH_WINDOW_SIZE = 4095 [B] (0xFFF)

An existing recommendation of 19,23,21,4095 for huge files from https://borgbackup.github.io/borgbackup/usage.html appears to translate into:

minimum chunk of 512 KiB
maximum chunk of 8 MiB
medium chunk of 2 MiB

In a 100GB file we are looking at 51200 chunks. Would it be beneficial to raise these further? The machine I have doing this has plenty of RAM (32 GB) and 8 CPU cores at 2.3 GHz, so RAM/compute is not a problem. Main goal is processing speed, then deduplication efficiency. Thank you, Alex From tw at waldmann-edv.de Fri Aug 21 07:48:12 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Fri, 21 Aug 2015 13:48:12 +0200 Subject: [borgbackup] Chunker params for very large files References: Message-ID: <55D70FFC.2020100@waldmann-edv.de> Hi Alex, > Hello, what would be a good chunker setting for a handful (15) files > with sizes from 20 GB to 150 GB to a total of 2.3 TB per day? These > are database backups that cannot be made incremental. That depends a bit on your goals. If you have enough space and you rather care for good speed, little management overhead (but not so much about deduplicating with very fine grained blocks), use a higher value for HASH_MASK_BITS, like 20 or 21, so it creates larger chunks in the statistical medium. It sounds like this matches your case. If you care for very fine grained deduplication and you maybe don't have that much data and you can live with the management overhead, use a small chunksize (small HASH_MASK_BITS, like the default 16). > An existing recommendation of 19,23,21,4095 for huge files from > https://borgbackup.github.io/borgbackup/usage.html appears to > translate into:
>
> minimum chunk of 512 KiB
> maximum chunk of 8 MiB
> medium chunk of 2 MiB
>
> In a 100GB file we are looking at 51200 chunks. You need to take the total amount of your data (~2TB) and compute the chunk count (1.000.000).
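As a back-of-the-envelope sketch in shell (using the ~2300 GiB total and ~2 MiB statistical medium chunk size from this thread):

  # chunk count ~= total data / statistical medium chunk size
  echo $(( 2300 * 1024 / 2 ))   # 2300 GiB / 2 MiB => 1177600, roughly 1.2 million chunks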
Then use the resource formula from the docs and compute the sizes of the index files (and RAM needs). In your case this looks quite reasonable, you could also use 1MB chunks, but better don't use 64KB chunks. > beneficial to raise these further? The machine I have doing this has > plenty of RAM (32 GB) and 8 CPU cores at 2.3 GHz, so RAM/compute is > not a problem. Right. But if your index is rather big, it'll need to copy around a lot of data (for transactions, for resyncing the cache in case you back up multiple machines to the same repo). Cheers, Thomas ---- GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt. From ag at iss-integration.com Sun Aug 30 23:27:29 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Sun, 30 Aug 2015 23:27:29 -0400 Subject: [borgbackup] Borg speed tuning on large files References: <55DC2790.3000005@waldmann-edv.de> <55DCEFF1.1050408@waldmann-edv.de> <55E1B248.1060409@waldmann-edv.de> Message-ID: Hi Thomas, On Sat, Aug 29, 2015 at 9:23 AM, Thomas Waldmann wrote:

>> Tool             Parameters        Data size (apparent)  Repo size   Hrs  Ratio  C Rat  C MB/s
>> gzip             c3                2308843696            560376600   22   24%    4.1    7
>> Attic First Run  default           2251760621            531964928   48   24%    4.2    3
>> Attic Next Run   default           2308843696            234398336   32   10%    9.9    2
>> Borg First Run   C0,19,23,21,4095  2330579192            2354907008  26   101%   1      25
>> Borg Next Run    C0,19,23,21,4095  2270686256            1341393408  18   59%    1.7    21
>> Borg First Run   C3,19,23,21,4095  2270686256            568351360   33   25%    4      5
>> Borg Next Run    C3,19,23,21,4095  2268472600            302165632   23   13%    7.5    4
>> Borg Next Run    C1,19,23,21,4095  2247244128            422037120   24   19%    5.3    5

> > Nice to see confirmation that we are quite a bit faster than Attic. :) > > Hmm, should the last line read "Borg First Run ... C1"? Yes, I switched the [now obsolete] parameter to level 1 for a "next run". > > In general, to evaluate the speed, it might be easier to only do "first > runs", because a specific amount of data (== all input data) always gets > processed. But...in that case gzip beats all :). > > In "next run", the amount of data actually needing processing might vary > widely, depending on how much change there is between first and next run. Understood, though the point of dedup is to save space on shared/unchanged data regions. In my case the data is likely not as similar: with 59% at no compression, it means we only found 41% of "same data", whereas I know that in these databases 10% of change a day is high. So maybe I need to go chunk size hunting. For others this will likely work in a more efficient manner. > BTW, note for other readers: the "Parameters" column can't be given that > way to borg, it needs to be (e.g.): > borg create -C1 --chunker-params 19,23,21,4095 repo::archive data > > Or in 0.25: > borg create -C zlib,1 --chunker-params .... > >> Here is a picture in case the text does not come through well: > > Yeah, that looked better. :) > > BTW, what you currently have in the C MB/s column is how many compressed > MB/s it actually writes to storage (and if that is a limiting factor, it > would be your target storage, not borg). Sorry, I should have commented, C is for computed, i.e. size divided by time. I assume storage is not an issue, as uncompressed data can pump here at 50+ MB/s.
> > Either there is a mistake in your table or your cpu is so fast that > higher compression saves more time by avoiding I/O than it needs for the > better compression. That makes sense, CPU on this box is quite powerful. > > With 0.25.0 you could try:
> - lz4 = superfast, but low compression
> - lzma = slow/expensive, but high compression
> - none = no compression, no overhead (this is not zlib,0 any more)
Started lz4 trials tonight, will update! > >> C0 shows the actual dedup capability of this data. > > Doesn't seem to find significant amounts of "internal" duplication > within a "first run". Historical dedup seems to work and help, though. > > Does that match your expectations considering the contents of your files? It's a big mystery, a highly esoteric database (think MUMPS :), but I know overall change is unlikely to exceed 10% of "business content" per day. So I am not finding the right chunk size yet. > > In case you measure again, keep an eye on CPU load. I see borg taking 99% of one core, load average in the 3-4 range, but other processes are working, so this may be a bit muddled; I will observe at idle times. > >> My business goal here is to get >> the data in within a day, so about 12 hours or so. > > If you can partition your data set somehow into N pieces and use N > separate repos, you could save some time by running N borgs in parallel > (assuming your I/O isn't a bottleneck then). > > N ~= core count of your CPU > > At some time in the future, borg might be able to do a similar thing by > internal multithreading, but that is not ready for production yet. Understood, hard to do and make safe. Thanks. > > There are also some other optimizations possible in the code (using > different hashes, different crypto modes, ...) - we'll try making it > much faster. Much appreciated, I have a good high-stress real-life playground to test this. Alex > > -- > > > GPG ID: FAF7B393 > GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 > From ag at iss-integration.com Sun Aug 30 23:13:19 2015 From: ag at iss-integration.com (Alex Gorbachev) Date: Sun, 30 Aug 2015 23:13:19 -0400 Subject: [borgbackup] borgbackup 0.25.0 released References: <55E1B89B.4050600@waldmann-edv.de> <55E1B89B.4050600@waldmann-edv.de> Message-ID: Thank you for adding lz4! Question also on create using lz4, the new output is like this:

A /vjob/snap_vtst1/random.bin
U /vjob/snap_vtst1/dupel.bck

What do A and U mean in the beginning? Thanks, Alex On Sat, Aug 29, 2015 at 9:50 AM, Thomas Waldmann wrote: > > Hi, > > just wanted to tell that there is a fresh release with a lot of > improvements and fixes: > > https://github.com/borgbackup/borg/blob/0.25.0/CHANGES.rst what's new > > https://pypi.python.org/pypi/borgbackup/0.25.0 pip package > > https://github.com/borgbackup/borg/issues/147 binary wheels (soon) > > All releases are signed by me (like this message), please check the signature.
> > Cheers, > Thomas > -- > > GPG ID: FAF7B393 > GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 > > From mh+borgbackup at zugschlus.de Sun Aug 30 16:37:58 2015 From: mh+borgbackup at zugschlus.de (Marc Haber) Date: Sun, 30 Aug 2015 22:37:58 +0200 Subject: [borgbackup] borgbackup 0.25.0 released References: <55E1B89B.4050600@waldmann-edv.de> <55E1B89B.4050600@waldmann-edv.de> Message-ID: <20150830203758.GY5060@torres.zugschlus.de> Hi, On Sat, Aug 29, 2015 at 03:50:19PM +0200, Thomas Waldmann wrote: > just wanted to tell that there is a fresh release with a lot of > improvements and fixes: > > https://github.com/borgbackup/borg/blob/0.25.0/CHANGES.rst what's new > > https://pypi.python.org/pypi/borgbackup/0.25.0 pip package > > https://github.com/borgbackup/borg/issues/147 binary wheels (soon) thanks for keeping the development running. A few questions about compression: Is there any reason why xz compression is not (yet) supported? Is the compression done on the client or on the server? And while I'm asking, are there plans to add a connection scheme that allows the TCP connection to go from the server (the machine holding the actual backup) to the client (the machine being backed up)? There are places with a security policy that says "no connections to the backup server". One possible solution would be a "ssh -R10022:localhost:22 client borg create foo" with the repository being on "localhost:10022" so that the connection from the client to the server is tunneled through the outgoing ssh session from the server. Greetings Marc -- ----------------------------------------------------------------------------- Marc Haber | "I don't trust Computers. They | Mailadresse im Header Leimen, Germany | lose things." Winona Ryder | Fon: +49 6224 1600402 Nordisch by Nature | How to make an American Quilt | Fax: +49 6224 1600421 From anarcat at debian.org Sat Oct 17 02:47:09 2015 From: anarcat at debian.org (=?utf-8?q?Antoine_Beaupr=C3=A9?=) Date: Sat, 17 Oct 2015 02:47:09 -0400 Subject: summary of my last hacking streak In-Reply-To: <8737xa800i.fsf@marcos.anarc.at> References: <8737xa800i.fsf@marcos.anarc.at> Message-ID: <8737xa800i.fsf@marcos.anarc.at> Hi, This week, I took my laptop out to the countryside, without internet (but with a HAM radio ;), to focus on peace and quiet and maybe work on borg. First off, I must apologise for the mess I created on the master branch. I attempted to create a master branch in my own fork with all the branches merged in and I mistakenly pushed that to the main repo, which closed all the pull requests and put all my code live at once. I have reverted the change and now only the --version pull request is there. I am at a loss for words at how to handle this... I guess I could rebase those branches on top of the new messed up master, or we could reset the master branch again. I am not sure what is the best way to proceed, but I can't fix this cleanly myself, as I cannot force-push to master. :( Again, apologies. Here is the rest of the mail i was hoping to send happily an hour ago...
Pull requests
=============

I ended up with 6 feature branches which i submitted as pull requests:

* i18n: translation support for borg https://github.com/borgbackup/borg/pull/287
* man-builder: failed attempt at writing a manpage formatter for argparse - the latter is such a tangled mess that i can't make head or tail of it, and i gave up: let's wait for click https://github.com/borgbackup/borg/issues/208
* no-inplace: operate on a "copy" during upgrades, the previous "inplace" mode is still available through `--inplace` https://github.com/borgbackup/borg/pull/280
* upgrader-index-fixes: fixes for the upgrader index problems, not sure how to create a unit test for this one, as repository.check() doesn't find the error (only create does) https://github.com/borgbackup/borg/pull/281
* verbosity: lots of tweaks to verbosity levels, --progress and --stats https://github.com/borgbackup/borg/pull/288
* version-arg: a --version flag! (this was mistakenly pushed directly to master, sorry, let me know if i need to revert)
* x-option, add -x to --do-not-cross-mountpoints setting https://github.com/borgbackup/borg/pull/282

Some of those conflict with each other: i18n and verbosity touch similar areas, but the merge is fairly trivial. I have merged all the above (but man-builder) into my master branch: https://github.com/anarcat/borg/commits/master (Update: this was mistakenly pushed to the main master branch, and then reverted.)

Bug reports
===========

* tox seems to run the Attic test suite: https://github.com/borgbackup/borg/issues/283
* can't run tests offline: https://github.com/borgbackup/borg/issues/284
* lock.exclusive left behind: https://github.com/borgbackup/borg/issues/285

Other notes
===========

Upgrade notes
-------------

I ran a new conversion from Attic, this time on a fresh backup of my laptop (i usually work on my workstation). The backup looked like:

# attic create --exclude-caches -e /home/anarcat/mp3/ -e /home/pbuilder/ -e /home/anarcat/video/ -e /home/anarcat/iso/ -e /home/anarcat/books -e /home/anarcat/books-incoming -e .cache --stats --do-not-cross-mountpoints /media/anarcat/calyx/attic-angela::2015-10-15-first /
Initializing cache...
------------------------------------------------------------------------------
Archive name: 2015-10-15-first
Archive fingerprint: 5626925d54fae692e7ca5cb852e6629661d4ac5de00ad6b6494979980ec7822c
Start time: Thu Oct 15 12:03:34 2015
End time: Thu Oct 15 15:28:44 2015
Duration: 3 hours 25 minutes 10.06 seconds
Number of files: 847008

               Original size    Compressed size    Deduplicated size
This archive:  56.78 GB         38.82 GB           35.55 GB
All archives:  56.78 GB         38.82 GB           35.55 GB
------------------------------------------------------------------------------

First borg run after upgrade was fast enough:

------------------------------------------------------------------------------
Archive name: 2015-10-15-post-attic
Archive fingerprint: 44812971c502a25dcad61a7664eab3559405c6f1fd8b38bc50118f4619dbca72
Start time: Thu Oct 15 18:36:02 2015
End time: Thu Oct 15 18:49:15 2015
Duration: 13 minutes 13.69 seconds
Number of files: 847084

               Original size    Compressed size    Deduplicated size
This archive:  56.78 GB         38.85 GB           41.12 MB
All archives:  113.57 GB        77.67 GB           35.59 GB

               Unique chunks    Total chunks
Chunk index:   1430830          3255749
------------------------------------------------------------------------------

So it seems the chunks cache got reused properly this time, maybe because the dataset is smaller.
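(For reference, the conversion step itself was just a borg upgrade run over the attic repo - something along these lines, assuming the same repo path as above; see `borg upgrade -h` for the exact flags:)

  borg upgrade -v /media/anarcat/calyx/attic-angela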
weird integrity failure
-----------------------

When running tests with `-s`, we see some exceptions show up in the log. It's a little confusing because we have the feeling that tests fail because of the exception, while it seems the error is normal. Failing test sample:

borg/testsuite/archiver.py::ArchiverCheckTestCase::test_missing_archive_item_chunk
Exception ignored in: >
Traceback (most recent call last):
  File "/home/anarcat/src/borg/borg/repository.py", line 69, in __del__
    self.close()
  File "/home/anarcat/src/borg/borg/repository.py", line 155, in close
    self.lock.release()
  File "/home/anarcat/src/borg/borg/locking.py", line 281, in release
    self._roster.modify(EXCLUSIVE, REMOVE)
  File "/home/anarcat/src/borg/borg/locking.py", line 203, in modify
    elements.remove(self.id)
KeyError: (('angela', 28267, 2564556544),)
PASSED

Another:

borg/testsuite/archiver.py::RemoteArchiverTestCase::test_corrupted_repository
Segment entry checksum mismatch [segment 2, offset 8]
Index object count mismatch. 147 != 0
PASSED

Maybe those warnings are unimportant and should be ignored, but they certainly make tests more confusing, and if those errors are expected, maybe they shouldn't output those errors?

argv
----

It would be nice if borg would show up as "borg" instead of "python borg" or whatever. bup does this through unpythonize_argv() which does some pretty heavy memset() stuff to clear it up. there has to be a better way (tm). See: https://github.com/borgbackup/borg/issues/286

verbosity and logging
---------------------

it's harder to censor the unchanged files with stderr output: we basically need to shove it back to stdout to grep it. that is annoying. basically, -v is way too verbose - it's unusable. also, as it is, --progress doesn't show *any* progress until the file cache is loaded, which is confusing, as we don't know if borg is waiting for a lock, blocked or what. the basic problem could be that --progress conflicts with the per-file output. to resolve that, i have pushed the file listing down to the DEBUG level for now, but the question of how to handle different verbosity levels still stands. but i do feel confident now that the logging-refactor branch can be merged in, and in fact most branches i have been working on have been based on it... oh, and --progress now knows about the terminal width, so it makes good use of all the space to show file paths. it also shows the number of files found so far. --stats could similarly be improved as well...

format_file_sizes
-----------------

This could be reimplemented to cover more than terabytes, there's some neat code from stackexchange for this... http://stackoverflow.com/a/1094933/1174784

StableDict
----------

Why is that thing necessary, when we have OrderedDict?

From anarcat at debian.org Sat Oct 17 22:13:46 2015 From: anarcat at debian.org (=?utf-8?q?Antoine_Beaupr=C3=A9?=) Date: Sat, 17 Oct 2015 22:13:46 -0400 Subject: public git repo was reset References: <8737xa800i.fsf@marcos.anarc.at> <8737xa800i.fsf@marcos.anarc.at> Message-ID: <87vba42aat.fsf@marcos.anarc.at> On 2015-10-17 02:47:09, Antoine Beaupré wrote: > Hi, > > This week, I took my laptop out to the countryside, without internet > (but with a HAM radio ;), to focus on peace and quiet and maybe work on > borg. > > First off, I must apologise for the mess I created on the master > branch.
I attempted to create a master branch in my own fork with all > the branches merged in and I mistakenly pushed that to the main repo, > which closed all the pull requests and put all my code live at once. I > have reverted the change and now only the --version pull request is > there. > > I am at a loss for words at how to handle this... I guess I could rebase > those branches on top of the new messed up master, or we could reset the > master branch again. I am not sure what is the best way to proceed, but > I can't fix this cleanly myself, as I cannot force-push to master. :( > > Again, apologies. Here is the rest of the mail i was hoping to send > happily an hour ago... So a followup on this, Thomas reset the git repository this morning. Which means that if some of you pulled the repo between last night (2015-10-17T02:47:09-0400) and today (12:11:23-0400), a git remote update will force-update the remote branches, and you will need to reset your local "master" branch, and rebase topic branches.

To reset the master branch:

  git remote update
  git checkout master
  git reset --hard origin/master

To rebase a topic branch:

  git remote update
  git rebase --onto origin/master upstream topic

In the above, "topic" is the name of your topic branch and "upstream" is the original branch point of your topic branch, basically where "master" was when you created the branch. "git log --decorate" is useful to figure that out. In fact, I use the following alias profusely for stuff like this:

[alias]
lg = log --graph --pretty=format:'%Cred%h%Creset %C(green)%G?%Creset%C(yellow)%d%Creset %s %Cgreen(%ar) %C(blue)<%an>%Creset' --abbrev-commit --date=relative

Sorry again for the trouble, and thanks to Thomas for fixing all of this! I have rerolled all my pull requests into new ones, because github believes they were merged... I would still welcome any feedback on the original post as well, of course. :) A. -- Information is not knowledge. Knowledge is not wisdom. Wisdom is not truth. Truth is not beauty. Beauty is not love. Love is not music. Music is the best. - Frank Zappa From tw at waldmann-edv.de Wed Oct 7 08:57:49 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 7 Oct 2015 14:57:49 +0200 Subject: borgbackup 0.27.0 released In-Reply-To: <561516CD.1080501@waldmann-edv.de> References: <561516CD.1080501@waldmann-edv.de> Message-ID: <561516CD.1080501@waldmann-edv.de> See there: https://github.com/borgbackup/borg/releases/tag/0.27.0 It's also available on PyPi (via pip install). Cheers, Thomas -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From leon at tim-online.nl Mon Oct 12 09:21:35 2015 From: leon at tim-online.nl (Leon Bogaert) Date: Mon, 12 Oct 2015 15:21:35 +0200 Subject: one global repository or one repository per host In-Reply-To: <561BB3DF.4000301@tim-online.nl> References: <561BB3DF.4000301@tim-online.nl> Message-ID: <561BB3DF.4000301@tim-online.nl> I'm just setting up borg backup on our servers to test with it. We have about 15 servers with about 100GB to back up per server. Should I create one repository on the backupserver per host or should I create one global repository and prefix every archive with the hostname? Thanks in advance! -- Tim_online B.V.
Axelsestraat 4 4543CJ, Zaamslag tel.: (+31) 115 851 851 www.tim-online.nl From tw at waldmann-edv.de Mon Oct 12 10:57:18 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Mon, 12 Oct 2015 16:57:18 +0200 Subject: [borgbackup] one global repository or one repository per host References: <561BB3DF.4000301@tim-online.nl> <561BB3DF.4000301@tim-online.nl> Message-ID: <561BCA4E.1070707@waldmann-edv.de> > I'm just setting up borg backup on our servers to test with it. We have > about 15 servers with about 100GB to back up per server. > > Should I create one repository on the backupserver per host or should I > create one global repository and prefix every archive with the hostname? If you want to take advantage of deduplication AND you do not want to back up simultaneously AND you can live with cache resyncs happening regularly and taking some time, then you can store into the same repo. Use a different prefix for the archives (e.g. hostname-date) and be careful to use that prefix also when pruning. If the above does not apply (or there isn't much inter-server duplication), use different repos. --- GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt. From tw at waldmann-edv.de Fri Oct 16 19:16:42 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sat, 17 Oct 2015 01:16:42 +0200 Subject: [borgbackup] Renaming an archive References: <56217B5A.3090408@age.bdeb.qc.ca> <56217B5A.3090408@age.bdeb.qc.ca> <56218377.8060808@waldmann-edv.de> <56218431.4050307@age.bdeb.qc.ca> Message-ID: <5621855A.9050201@waldmann-edv.de> On 10/17/2015 01:11 AM, Tech AGEBdB wrote: > Great, now I feel silly :P > > In my defense, the online doc does not mention the 'rename' option: > > https://borgbackup.readthedocs.org/en/latest/search.html?q=rename&check_keywords=yes&area=default Good finding, I filed a bug: https://github.com/borgbackup/borg/issues/279 -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From tech at age.bdeb.qc.ca Fri Oct 16 18:34:02 2015 From: tech at age.bdeb.qc.ca (Tech AGEBdB) Date: Fri, 16 Oct 2015 18:34:02 -0400 Subject: Renaming an archive In-Reply-To: <56217B5A.3090408@age.bdeb.qc.ca> References: <56217B5A.3090408@age.bdeb.qc.ca> Message-ID: <56217B5A.3090408@age.bdeb.qc.ca> Hi! I've been mucking with borg a bit (yay borg!) and I wanted to know if it was possible to rename archives. I did my first backup manually with a test name and I now want to use the 'hostname'-date format to be able to purge. I could always delete the first backup with the 'delete' option but since it's pretty big it seems like a waste of cpu resources... Great job on this program. I'll never use duplicity on my servers again! -- Louis-Philippe Véronneau - Informaticien AGEBdB - Association Générale Étudiante de Bois-de-Boulogne 12555 av. de Bois-de-Boulogne Téléphone: (514) 332-3000 #7580 Site web: http://agebdeb.org From tech at age.bdeb.qc.ca Fri Oct 16 19:11:45 2015 From: tech at age.bdeb.qc.ca (Tech AGEBdB) Date: Fri, 16 Oct 2015 19:11:45 -0400 Subject: [borgbackup] Renaming an archive References: <56217B5A.3090408@age.bdeb.qc.ca> <56217B5A.3090408@age.bdeb.qc.ca> <56218377.8060808@waldmann-edv.de> Message-ID: <56218431.4050307@age.bdeb.qc.ca> Great, now I feel silly :P In my defense, the online doc does not mention the 'rename' option: https://borgbackup.readthedocs.org/en/latest/search.html?q=rename&check_keywords=yes&area=default Thanks!
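For anyone landing here from a search, the invocation is along these lines (a sketch; the repo path and archive names are invented):

  borg rename /path/to/repo::testbackup myhost-2015-10-16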
-- Louis-Philippe Véronneau - Informaticien AGEBdB - Association Générale Étudiante de Bois-de-Boulogne 12555 av. de Bois-de-Boulogne Téléphone: (514) 332-3000 #7580 Site web: http://agebdeb.org On 16/10/15 07:08 PM, Thomas Waldmann wrote: >> I've been mucking with borg a bit (yay borg!) and I wanted to know if it >> was possible to rename archives. > > Sure, try "borg rename --help". > > --- > > GPG ID: FAF7B393 > GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 > From tw at waldmann-edv.de Fri Oct 16 19:08:39 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sat, 17 Oct 2015 01:08:39 +0200 Subject: [borgbackup] Renaming an archive References: <56217B5A.3090408@age.bdeb.qc.ca> <56217B5A.3090408@age.bdeb.qc.ca> Message-ID: <56218377.8060808@waldmann-edv.de> > I've been mucking with borg a bit (yay borg!) and I wanted to know if it > was possible to rename archives. Sure, try "borg rename --help". --- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From anarcat at debian.org Tue Oct 20 03:31:51 2015 From: anarcat at debian.org (=?utf-8?q?Antoine_Beaupr=C3=A9?=) Date: Tue, 20 Oct 2015 03:31:51 -0400 Subject: one to many? In-Reply-To: <87io62572w.fsf@marcos.anarc.at> References: <87io62572w.fsf@marcos.anarc.at> Message-ID: <87io62572w.fsf@marcos.anarc.at> We know more or less how borg behaves in the "many to one" scenario, that is when we try to back up multiple servers to a single backup repository. What about "one to many"? Can a single server be backed up to multiple independent borg repositories? A. -- O gentilshommes, la vie est courte. Si nous vivons, nous vivons pour marcher sur la tête des rois. - William Shakespeare From tw at waldmann-edv.de Tue Oct 20 09:37:21 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Tue, 20 Oct 2015 15:37:21 +0200 Subject: [borgbackup] one to many? References: <87io62572w.fsf@marcos.anarc.at> <87io62572w.fsf@marcos.anarc.at> Message-ID: <56264391.5040203@waldmann-edv.de> > What about "one to many"? Can a single server be backed up to multiple > independent borg repositories? I don't see any problem with that. Make as many backups as you like. :) From tw at waldmann-edv.de Tue Oct 20 12:24:56 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Tue, 20 Oct 2015 18:24:56 +0200 Subject: [borgbackup] one to many? References: <87io62572w.fsf@marcos.anarc.at> <87io62572w.fsf@marcos.anarc.at> <56264391.5040203@waldmann-edv.de> <87fv1562vd.fsf@marcos.anarc.at> Message-ID: <56266AD8.7050405@waldmann-edv.de> >>> What about "one to many"? Can a single server be backed up to multiple >>> independent borg repositories? >> >> I don't see any problem with that. Make as many backups as you like. :) > > I was wondering if there would be an impact on the chunks cache - > wouldn't it be inconsistent when related to the different repositories? See output of this:

ls -l ~/.cache/borg

What you see are the (unique) repoids used for directories to keep the stuff separate, so there are no issues with that. As a side note: space usage for the cache will increase if you back up to multiple repositories, so be careful with free disk space.
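Illustration (a sketch; the directory names below are invented, yours will differ - each one is a repository id):

  $ ls ~/.cache/borg
  1b0d38a9...  7f3c52e0...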
--- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From leon at tim-online.nl Tue Oct 13 05:15:30 2015 From: leon at tim-online.nl (Leon Bogaert) Date: Tue, 13 Oct 2015 11:15:30 +0200 Subject: [borgbackup] one global repository or one repository per host References: <561BB3DF.4000301@tim-online.nl> <561BB3DF.4000301@tim-online.nl> <561BCA4E.1070707@waldmann-edv.de> Message-ID: <561CCBB2.7030007@tim-online.nl> >> I'm just setting up borg backup on our servers to test with it. We have >> about 15 servers with about 100GB to back up per server. >> >> Should I create one repository on the backupserver per host or should I >> create one global repository and prefix every archive with the hostname? > > If you want to take advantage of deduplication AND you do not want to > back up simultaneously AND you can live with cache resyncs happening > regularly and taking some time, then you can store into the same repo. Use a > different prefix for the archives (e.g. hostname-date) and be careful to > use that prefix also when pruning. > > If the above does not apply (or there isn't much inter-server duplication), > use different repos. Thanks for the great explanation Thomas. There is quite a bit of inter-server duplication, but those are usually smaller files. I think I'm going to have to experiment to see what would be most beneficial for us. Maybe use a different repository for each group of servers. -- Tim_online B.V. Axelsestraat 4 4543CJ, Zaamslag tel.: (+31) 115 851 851 www.tim-online.nl From tw at waldmann-edv.de Wed Nov 18 09:00:49 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 18 Nov 2015 15:00:49 +0100 Subject: [Borgbackup] first post + test Message-ID: <564C8491.3000808@waldmann-edv.de> test -- GPG ID: FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From jdc at uwo.ca Fri Dec 4 09:08:55 2015 From: jdc at uwo.ca (Dan Christensen) Date: Fri, 04 Dec 2015 09:08:55 -0500 Subject: [Borgbackup] conversion from attic to borg Message-ID: <87zixqz4p4.fsf@uwo.ca> I've been using attic for a while, and am now thinking of switching to borg. I saw various issues and pull requests about the "borg upgrade" command to convert an attic repo, but I couldn't find any documentation online. I've now found the output of "borg upgrade -h", which is quite helpful. (I've included it below in part to help make it show up in web searches.) Before going ahead with this, I wanted to confirm that it is working and ask if there are any gotchas I should look out for. Also, how fast is it, with and without the --inplace option? If several client machines access the same repo, do I run it for each of them, or just one of them? And presumably it's faster if I run it directly on the machine storing the repo? Thanks, Dan

usage: borg-linux64 upgrade [-h] [-v] [--show-rc] [--no-files-cache]
                            [--umask M] [--remote-path PATH] [-n] [-i]
                            [REPOSITORY]

upgrade a repository from a previous version

positional arguments:
  REPOSITORY          path to the repository to be upgraded

optional arguments:
  -h, --help          show this help message and exit
  -v, --verbose       verbose output
  --show-rc           show/log the return code (rc)
  --no-files-cache    do not load/update the file metadata cache used to
                      detect unchanged files
  --umask M           set umask to M (local and remote, default: 63)
  --remote-path PATH  set remote path to executable (default: "borg")
  -n, --dry-run       do not change repository
  -i, --inplace       rewrite repository in place, with no chance of going
                      back to older versions of the repository.

upgrade an existing Borg repository.
this currently only supports converting an Attic repository, but may eventually be extended to cover major Borg upgrades as well.

it will change the magic strings in the repository's segments to match the new Borg magic strings. the keyfiles found in $ATTIC_KEYS_DIR or ~/.attic/keys/ will also be converted and copied to $BORG_KEYS_DIR or ~/.borg/keys.

the cache files are converted, from $ATTIC_CACHE_DIR or ~/.cache/attic to $BORG_CACHE_DIR or ~/.cache/borg, but the cache layout between Borg and Attic changed, so it is possible the first backup after the conversion takes longer than expected due to the cache resync.

upgrade should be able to resume if interrupted, although it will still iterate over all segments. if you want to start from scratch, use `borg delete` over the copied repository to make sure the cache files are also removed: borg delete borg

unless ``--inplace`` is specified, the upgrade process first creates a backup copy of the repository, in REPOSITORY.upgrade-DATETIME, using hardlinks. this takes longer than in-place upgrades, but is much safer and gives progress information (as opposed to ``cp -al``). once you are satisfied with the conversion, you can safely destroy the backup copy.

WARNING: running the upgrade in place will make the current copy unusable with older versions, with no way of going back to previous versions. this can PERMANENTLY DAMAGE YOUR REPOSITORY! Attic CAN NOT READ BORG REPOSITORIES, as the magic strings have changed. you have been warned.

From jdc at uwo.ca Fri Dec 4 09:28:42 2015 From: jdc at uwo.ca (Dan Christensen) Date: Fri, 04 Dec 2015 09:28:42 -0500 Subject: [Borgbackup] status of cache resync improvements Message-ID: <87a8pqz3s5.fsf@uwo.ca> I read various issues and pull requests about changes to the cache resync process in borg, and was wondering what the current status is. I have multiple machines backing up to one repo, and the resyncs with attic are getting very slow. I also prune fairly regularly. As an example with attic, one repository I have has

               Original size    Compressed size    Deduplicated size
All archives:  1.61 TB          976.84 GB          30.92 GB

This repo contains 188 archives. Rebuilding the cache, over gigabit ethernet, takes 40 minutes, and is CPU bound on the local machine. The local and remote machines both have plenty of ram and the local machine has a fast cpu. (For some reason, a similar but slightly smaller repo I have needs 2 hours for a cache rebuild, on the same machines.) Attic's cache directory is 98MB, and it takes 2.3 *seconds* to copy it from the remote machine to the local machine using scp. Because of this, it seems to me to make sense to keep a copy of the cache in the remote repo, and then copy to the local machine when borg notices that the cache is out of sync. The cache would add 0.3% to the repo size, in this case. But maybe the improvements that borg has made make things fast enough that this isn't needed? Thanks, Dan From anarcat at debian.org Fri Dec 4 09:35:12 2015 From: anarcat at debian.org (Antoine =?utf-8?Q?Beaupr=C3=A9?=) Date: Fri, 04 Dec 2015 09:35:12 -0500 Subject: [Borgbackup] conversion from attic to borg In-Reply-To: <87zixqz4p4.fsf@uwo.ca> References: <87zixqz4p4.fsf@uwo.ca> Message-ID: <87fuzi2sf3.fsf@marcos.anarc.at> On 2015-12-04 09:08:55, Dan Christensen wrote: > I've been using attic for a while, and am now thinking of switching to > borg. I saw various issues and pull requests about the "borg upgrade" > command to convert an attic repo, but I couldn't find any documentation > online. I've now found the output of "borg upgrade -h", which is quite > helpful. (I've included it below in part to help make it show up in web > searches.)
I've now found the output of "borg upgrade -h", which is quite > helpful. (I've included it below in part to help make it show up in web > searches.) The documentation for the upgrade system should be available in the main docs as well. It is actually a bug that it doesn't show up there: http://borgbackup.readthedocs.org/en/latest/usage.html A PR to fix that would of course be welcome. :) > Before going ahead with this, I wanted to confirm that it is working and > ask if there are any gotchas I should look out for. The biggest gotchas is that attic won't be able to run over your converted repository. > Also, how fast is it, with and without the --inplace option? It's pretty fast! `--inplace` just skips the step where it copies (equivalent of `cp -al`) the repository into a backup repo. Once that copy is done, it just needs to write a few bytes to every segment, which takes at most a few minutes. > If several client machines access the same repo, do I run it for > each of them, or just one of them? I didn't think of that: i would assume it's enough to run it once. > And presumably it's faster if I run it directly on the machine storing > the repo? I would assume so as well. In fact, I don't know if it's possible at all to run it remotely. A. -- Ce que les si?cles des grands abatoirs nous aura appris Devrait ?tre inscrit au fond de toutes les ?coles; Voici l'homme: le destructeur des mondes est arriv?. - [no one is innocent] From jdc at uwo.ca Fri Dec 4 09:46:46 2015 From: jdc at uwo.ca (Dan Christensen) Date: Fri, 04 Dec 2015 09:46:46 -0500 Subject: [Borgbackup] conversion from attic to borg In-Reply-To: <87fuzi2sf3.fsf@marcos.anarc.at> References: <87zixqz4p4.fsf@uwo.ca> <87fuzi2sf3.fsf@marcos.anarc.at> Message-ID: <87y4daxodl.fsf@uwo.ca> Antoine Beaupr? writes: > On 2015-12-04 09:08:55, Dan Christensen wrote: > >> If several client machines access the same repo, do I run it for >> each of them, or just one of them? > > I didn't think of that: i would assume it's enough to run it once. I read that the upgrade process upgrades the cache as well, so I guess the other clients that access the same repo won't get their caches upgraded, and will have to do a full rebuild instead? >> And presumably it's faster if I run it directly on the machine storing >> the repo? > > I would assume so as well. In fact, I don't know if it's possible at all > to run it remotely. In many use cases, the repo will never have been accessed directly from the machine storing the repo, so there will be no existing cache on that machine. Presumably that's not a problem, and one could use the --no-files-cache option? Dan From anarcat at debian.org Fri Dec 4 10:45:20 2015 From: anarcat at debian.org (Antoine =?utf-8?Q?Beaupr=C3=A9?=) Date: Fri, 04 Dec 2015 10:45:20 -0500 Subject: [Borgbackup] conversion from attic to borg In-Reply-To: <87y4daxodl.fsf@uwo.ca> References: <87zixqz4p4.fsf@uwo.ca> <87fuzi2sf3.fsf@marcos.anarc.at> <87y4daxodl.fsf@uwo.ca> Message-ID: <8737vi2p67.fsf@marcos.anarc.at> On 2015-12-04 09:46:46, Dan Christensen wrote: > Antoine Beaupr? writes: > >> On 2015-12-04 09:08:55, Dan Christensen wrote: >> >>> If several client machines access the same repo, do I run it for >>> each of them, or just one of them? >> >> I didn't think of that: i would assume it's enough to run it once. > > I read that the upgrade process upgrades the cache as well, so I guess > the other clients that access the same repo won't get their caches > upgraded, and will have to do a full rebuild instead? 
Ah, true, i forgot about those. Then I guess you *could* run the upgrade process multiple times, once per client. The server side is "idempotent", that is: it can be run multiple times without ill effects, and it will skip work already done. >>> And presumably it's faster if I run it directly on the machine storing >>> the repo? >> >> I would assume so as well. In fact, I don't know if it's possible at all >> to run it remotely. > > In many use cases, the repo will never have been accessed directly from > the machine storing the repo, so there will be no existing cache on that > machine. Presumably that's not a problem, and one could use the > --no-files-cache option? I don't think `--no-files-cache` is a valid option for the upgrade command, but it could be added! If you mean on the `create` command, then yes, you could use that option, but borg will re-create the missing cache anyways... a. -- Au nom de l'État, la force s'appelle droit. Au main de l'individu, elle s'appelle crime. - Max Stirner From tw at waldmann-edv.de Fri Dec 4 10:46:06 2015 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Fri, 4 Dec 2015 16:46:06 +0100 Subject: [Borgbackup] status of cache resync improvements In-Reply-To: <87a8pqz3s5.fsf@uwo.ca> References: <87a8pqz3s5.fsf@uwo.ca> Message-ID: <5661B53E.3060804@waldmann-edv.de> Hi Dan, > I read various issues and pull requests about changes to the cache > resync process in borg, and was wondering what the current status is. As it currently looks, the cache merging code will get a LOT faster in the next release (already in github master branch) due to that fix: https://github.com/borgbackup/borg/issues/450 (this also applies to attic, btw.) Previously, there were also some other changes (like local per-archive index caching, which speeds things up, too, but also uses quite a lot of space; also, the index merging code was rewritten in C). > This repo contains 188 archives. Be careful, with borgbackup you will need a LOT of space for ~/.cache/borg (except if you switch off per-archive index caching, see the README about chunks.archive.d). So it is space vs. speed here, the usual problem. > Rebuilding the cache, over gigabit ethernet, takes 40 minutes, and is > CPU bound on the local machine. The local and remote machines both have > plenty of ram and the local machine has a fast cpu. If your client-server connection is that fast and you do not have plenty of space for chunks.archive.d, it may be better to switch the local chunks archive caching off. > (For some reason, a > similar but slightly smaller repo I have needs 2 hours for a cache > rebuild, on the same machines.) Maybe due to some hashtable weirdness, see bug above. > Because of this, it seems to me to make sense to keep a copy of the > cache in the remote repo, and then copy to the local machine when borg > notices that the cache is out of sync. The cache would add 0.3% to the > repo size, in this case. We have to be careful to not disclose information about your stuff (the data in the repo might be encrypted, but the index stuff is not [yet?]). > But maybe the improvements that borg has made make things fast enough > that this isn't needed? Cache resync is a lot of computation. And if one adds AND deletes archives [prune] there is no easy & fast way to re-compute the index. Cheers, Thomas -- GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt.
From tw at waldmann-edv.de  Fri Dec  4 10:54:27 2015
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Fri, 4 Dec 2015 16:54:27 +0100
Subject: [Borgbackup] conversion from attic to borg
In-Reply-To: <87fuzi2sf3.fsf@marcos.anarc.at>
References: <87zixqz4p4.fsf@uwo.ca> <87fuzi2sf3.fsf@marcos.anarc.at>
Message-ID: <5661B733.1050005@waldmann-edv.de>

> The documentation for the upgrade system should be available in the main
> docs as well. It is actually a bug that it doesn't show up there:

https://github.com/borgbackup/borg/issues/464

>> If several client machines access the same repo, do I run it for
>> each of them, or just one of them?

I would assume that as the "user data" in the repo does not change and
(AFAIK) the repo signature stays the same, the cache contents would
still match.

But: they are under .attic and also need to be converted to .borg.

> I didn't think of that: i would assume it's enough to run it once.

It would just create a new cache if there is nothing under .cache/borg.

>> And presumably it's faster if I run it directly on the machine storing
>> the repo?
>
> I would assume so as well. In fact, I don't know if it's possible at all
> to run it remotely.

If you didn't add an RPC call for it in remote.py, it won't work.
Maybe there should be one. :-)

https://github.com/borgbackup/borg/issues/465

--
GPG Fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393
Encrypted E-Mail is preferred / Verschluesselte E-Mail wird bevorzugt.

From anarcat at debian.org  Fri Dec  4 12:25:06 2015
From: anarcat at debian.org (Antoine Beaupré)
Date: Fri, 04 Dec 2015 12:25:06 -0500
Subject: [Borgbackup] conversion from attic to borg
In-Reply-To: <5661B733.1050005@waldmann-edv.de>
References: <87zixqz4p4.fsf@uwo.ca> <87fuzi2sf3.fsf@marcos.anarc.at>
 <5661B733.1050005@waldmann-edv.de>
Message-ID: <87vb8e15zh.fsf@marcos.anarc.at>

On 2015-12-04 10:54:27, Thomas Waldmann wrote:
>> The documentation for the upgrade system should be available in the main
>> docs as well. It is actually a bug that it doesn't show up there:
>
> https://github.com/borgbackup/borg/issues/464
>
>>> If several client machines access the same repo, do I run it for
>>> each of them, or just one of them?
>
> I would assume that as the "user data" in the repo does not change and
> (AFAIK) the repo signature stays the same, the cache contents would
> still match.
>
> But: they are under .attic and also need to be converted to .borg.

That's basically what the upgrade script does.

a.

--
La nature n'a créé ni maîtres ni esclaves
Je ne veux ni donner ni recevoir de lois.
 - Denis Diderot

From jdc at uwo.ca  Fri Dec  4 14:02:07 2015
From: jdc at uwo.ca (Dan Christensen)
Date: Fri, 04 Dec 2015 14:02:07 -0500
Subject: [Borgbackup] conversion from attic to borg
In-Reply-To: <87fuzi2sf3.fsf@marcos.anarc.at>
References: <87zixqz4p4.fsf@uwo.ca> <87fuzi2sf3.fsf@marcos.anarc.at>
Message-ID: <87k2ouxck0.fsf@uwo.ca>

Antoine Beaupré writes:

> On 2015-12-04 09:08:55, Dan Christensen wrote:
>
>> Also, how fast is it, with and without the --inplace option?
>
> It's pretty fast! `--inplace` just skips the step where it copies
> (equivalent of `cp -al`) the repository into a backup repo. Once that
> copy is done, it just needs to write a few bytes to every segment, which
> takes at most a few minutes.

It looks like the upgrade process is changing the files by seeking
into them and writing new bytes. This doesn't break hardlinks, so the
"backup" copy that is made is also upgraded and is unusable by attic.
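A toy demonstration of that effect, with made-up filenames (plain
coreutils, nothing borg-specific):

$ printf 'ATTICsegment' > seg
$ ln seg seg.bak                  # hardlink "backup": two names, one inode
$ printf 'BORG_' | dd of=seg bs=1 conv=notrunc 2>/dev/null
$ cat seg.bak                     # the backup changed too
BORG_segment

An in-place write goes through the shared inode, so both names see the
new bytes.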
But if just about every file in the repo needs to be changed, what
point is there in making hardlinks? The backup and the upgraded repo
aren't going to be able to share files, so borg will have to either
make copies at the start or break the hardlinks by making new copies of
the files during the upgrade. It seems cleanest to just make a true
copy of the repo at the start.

(The only advantage of the hardlinks would be if there are substantial
files that can be shared between the repos, but I don't think that's
the case.)

https://github.com/borgbackup/borg/issues/466

Antoine Beaupré writes:

> I guess you *could* run the upgrade process multiple times, once per
> client. The server side is "idempotent", that is: it can be run
> multiple times without ill effects, and it will skip work already
> done.

But since remote operation isn't yet supported, this is a moot point.
I'll just run the upgrade on the backup server, and then each client
will have to rebuild their cache the first time.

I guess there could be a "--files-cache-only" option to upgrade
for this use case.

> I don't think `--no-files-cache` is a valid option for the upgrade
> command, but it could be added!

Ah, it is shown in the usage information:

usage: borg-linux64 upgrade [-h] [-v] [--show-rc] [--no-files-cache]
                            [--umask M] [--remote-path PATH] [-n] [-i]
                            [REPOSITORY]

It looks like some of the other things shown might not apply either.

Dan

From anarcat at debian.org  Fri Dec  4 14:23:36 2015
From: anarcat at debian.org (Antoine Beaupré)
Date: Fri, 04 Dec 2015 14:23:36 -0500
Subject: [Borgbackup] conversion from attic to borg
In-Reply-To: <87k2ouxck0.fsf@uwo.ca>
References: <87zixqz4p4.fsf@uwo.ca> <87fuzi2sf3.fsf@marcos.anarc.at>
 <87k2ouxck0.fsf@uwo.ca>
Message-ID: <87lh9a10hz.fsf@marcos.anarc.at>

On 2015-12-04 14:02:07, Dan Christensen wrote:
> Antoine Beaupré writes:
>
>> On 2015-12-04 09:08:55, Dan Christensen wrote:
>>
>>> Also, how fast is it, with and without the --inplace option?
>>
>> It's pretty fast! `--inplace` just skips the step where it copies
>> (equivalent of `cp -al`) the repository into a backup repo. Once that
>> copy is done, it just needs to write a few bytes to every segment, which
>> takes at most a few minutes.
>
> It looks like the upgrade process is changing the files by seeking
> into them and writing new bytes. This doesn't break hardlinks, so the
> "backup" copy that is made is also upgraded and is unusable by attic.

I *believe* the open/writing process deliberately breaks hardlinks.
There's a unit test I wrote to confirm this as well:

https://github.com/borgbackup/borg/blob/master/borg/testsuite/upgrader.py#L188

> But if just about every file in the repo needs to be changed, what point
> is there in making hardlinks?

The point is that we get meaningful progress information.

The upgrade command was originally written as "inplace only", that is:
it would change all the files and assume *you* would perform the backup
first, with `cp -a` or `cp -al` or similar. Then i did a bit more
hand-holding and thought of doing the copy as part of the upgrade
command directly.

The problem with doing that copy then is that it's harder to get an
idea of progress, as we copy what seems to be an arbitrary hierarchy of
files recursively, and we don't know when that long process finishes.
By making a forest of hardlinks and rewriting the files one by one, we
effectively do a copy, but we have progress information along the way.
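To sketch what deliberately breaking a hardlink would mean here (toy
filenames again, not borg's actual segment layout; writing a new file
and renaming it over the old name is one way to do it):

$ printf 'ATTICsegment' > seg
$ ln seg seg.bak                  # hardlink copy
$ printf 'BORG_segment' > seg.tmp
$ mv seg.tmp seg                  # rename swaps the name, not the inode
$ cat seg.bak                     # the hardlinked copy keeps the old bytes
ATTICsegment

Each file rewritten that way detaches from the hardlink forest, while
every file not yet processed still shares its inode with the backup --
which is also what gives us a file-by-file progress count.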
> The backup and the upgraded repo aren't going to be able to share
> files, so borg will have to either make copies at the start or break
> the hardlinks by making new copies of the files during the upgrade.
> It seems cleanest to just make a true copy of the repo at the start.

You can still do that: do the "true copy" yourself and use `--inplace`.

> (The only advantage of the hardlinks would be if there are substantial
> files that can be shared between the repos, but I don't think that's
> the case.)
>
> https://github.com/borgbackup/borg/issues/466

Let's move the conversation to that issue: if the hardlinks are not
broken, then this is indeed a bug. I hope the rationale for the
hardlinks was made clearer in the above comment; if not, i'd be happy
to clarify further.

> Antoine Beaupré writes:
>
>> I guess you *could* run the upgrade process multiple times, once per
>> client. The server side is "idempotent", that is: it can be run
>> multiple times without ill effects, and it will skip work already
>> done.
>
> But since remote operation isn't yet supported, this is a moot point.
> I'll just run the upgrade on the backup server, and then each client
> will have to rebuild their cache the first time.

True.

> I guess there could be a "--files-cache-only" option to upgrade
> for this use case.

That would make sense. There's a new "ToggleAction" option that was
added for --progress vs --no-progress recently; maybe we could expand
that to have topical flags like --{no-,}files-cache, --{no-,}segments
and so on...

>> I don't think `--no-files-cache` is a valid option for the upgrade
>> command, but it could be added!
>
> Ah, it is shown in the usage information:
>
> usage: borg-linux64 upgrade [-h] [-v] [--show-rc] [--no-files-cache]
>                             [--umask M] [--remote-path PATH] [-n] [-i]
>                             [REPOSITORY]
>
> It looks like some of the other things shown might not apply either.

Yet another bug, i guess - we should obey --no-files-cache in upgrade.

A.

--
Quis custodiet ipsos custodes?
Who watches the watchmen? Qui police la police? Tu. You. Toi.

From jdc at uwo.ca  Fri Dec  4 14:37:16 2015
From: jdc at uwo.ca (Dan Christensen)
Date: Fri, 04 Dec 2015 14:37:16 -0500
Subject: [Borgbackup] status of cache resync improvements
In-Reply-To: <5661B53E.3060804@waldmann-edv.de>
References: <87a8pqz3s5.fsf@uwo.ca> <5661B53E.3060804@waldmann-edv.de>
Message-ID: <87fuzixaxf.fsf@uwo.ca>

Thomas Waldmann writes:

> As it currently looks, the cache merging code will get a LOT faster in
> the next release (already in the github master branch) due to that fix:
>
> https://github.com/borgbackup/borg/issues/450
>
> (this also applies to attic, btw.)

Ah, nice to hear! I hope to give this a try. Do you have an ETA for
the next release?

>> This repo contains 188 archives.
>
> Be careful: with borgbackup you will need a LOT of space for
> ~/.cache/borg (except if you switch off per-archive index caching, see
> the README about chunks.archive.d).

Thanks for the tip. Can you point me to this README? I may turn this
off if the other changes improve things enough.

>> Because of this, it seems to me to make sense to keep a copy of the
>> cache in the remote repo, and then copy to the local machine when borg
>> notices that the cache is out of sync. The cache would add 0.3% to the
>> repo size, in this case.
>
> We have to be careful to not disclose information about your stuff (the
> data in the repo might be encrypted, but the index stuff is not [yet?]).
Since this is information that all clients need and that is slow to
recompute, I really think that it would make sense to store it in the
repo. When a client's cache is out of date, it could either simply
download it from the server, or an rsync-like method could be used to
only download changed portions.

I understand that this might be a problem for encrypted repos, although
I imagine it would be possible to work around this. For example, the
chunk id_hash and unencrypted size could be encrypted before being
stored in the chunks cache, if they are considered sensitive. Then the
server could keep the reference counts up to date in the chunks cache.

I personally don't mind having my cache stored unencrypted on the
server, so I might make a wrapper around borg that downloads the cache
from the server before each backup, and copies it back after each
backup.

To make sure I understand things fully:

cache/files should never be out of date. It contains only data local
to the client, so it shouldn't need updating if another client updated
the repo.

cache/chunks should contain nothing specific to the client, so it
could be shared between clients.

cache/config: what about this?

Thanks for the help! It's great to see such active development on this
project.

Dan

From jdc at uwo.ca  Sat Dec  5 21:50:55 2015
From: jdc at uwo.ca (Dan Christensen)
Date: Sat, 05 Dec 2015 21:50:55 -0500
Subject: [Borgbackup] status of cache resync improvements
In-Reply-To: <87a8pqz3s5.fsf@uwo.ca>
References: <87a8pqz3s5.fsf@uwo.ca>
Message-ID: <877fksuw6o.fsf@uwo.ca>

Dan Christensen writes:

> As an example with attic, one repository I have has
>
>                Original size  Compressed size  Deduplicated size
> All archives:        1.61 TB        976.84 GB           30.92 GB
>
> This repo contains 188 archives.
>
> Rebuilding the cache, over gigabit ethernet, takes 40 minutes, and is
> CPU bound on the local machine.

I'm embarrassed to say that on the machine where I was having trouble,
msgpack-python was falling back to its pure-Python implementation. I
installed a pre-compiled Ubuntu package, and now the 40 minutes is
reduced to 2 minutes. For another slightly smaller repo, the time is
down from 2 hours to 5.5 minutes. This is still not ideal, but it is
also not a deal breaker.

Sorry for missing this basic point. I wish msgpack gave a loud warning
during "pip install msgpack-python" if it doesn't find what it needs in
order to compile itself...

Dan

From rumpelsepp at sevenbyte.org  Tue Dec  8 04:11:58 2015
From: rumpelsepp at sevenbyte.org (Stefan Tatschner)
Date: Tue, 8 Dec 2015 10:11:58 +0100
Subject: [Borgbackup] 0.29 released?
Message-ID: <56669EDE.10308@sevenbyte.org>

Hi folks,

I am a bit confused about the changes.rst file. There is already a
0.29 section, but there is no 0.29 Git tag yet. Has this version
already been released, and did somebody miss the tag? I would suggest
adding the release date of each release to the changes.rst file (maybe
just in the headline) in order to avoid further confusion.

Stay tuned,
Stefan

From tw at waldmann-edv.de  Tue Dec  8 07:57:14 2015
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Tue, 8 Dec 2015 13:57:14 +0100
Subject: [Borgbackup] 0.29 released?
In-Reply-To: <56669EDE.10308@sevenbyte.org>
References: <56669EDE.10308@sevenbyte.org>
Message-ID: <5666D3AA.3070400@waldmann-edv.de>

Hi Stefan,

> I am a bit confused about the changes.rst file. There is already a
> 0.29 section, but there is no 0.29 Git tag yet.

I am just preparing the 0.29 release notes.
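You can check the tag situation from any clone with plain git (the
describe output below is made up, just for illustration):

$ git tag -l '0.29*'     # prints nothing: the 0.29 tag does not exist yet
$ git describe --tags    # names the tree relative to the newest tag
0.28.2-74-g1a2b3c4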
There can be no release without a git tag, as the release version
number is automatically taken from the git tag.

You can also look at the milestones on github; a release won't happen
until everything in the milestone is done (or has been transferred to
another milestone):

https://github.com/borgbackup/borg/milestones

Releases also show up there (and on pypi):

https://github.com/borgbackup/borg/releases

> I would suggest adding the release date of each release to the
> changes.rst file (maybe just in the headline) in order to avoid further
> confusion.

No, I don't like dates there. But I can add a "(not released yet)" so
no one gets confused.

Cheers, Thomas

--
GPG ID: FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393

From rumpelsepp at sevenbyte.org  Tue Dec  8 08:25:18 2015
From: rumpelsepp at sevenbyte.org (Stefan Tatschner)
Date: Tue, 8 Dec 2015 14:25:18 +0100
Subject: [Borgbackup] 0.29 released?
In-Reply-To: <5666D3AA.3070400@waldmann-edv.de>
References: <56669EDE.10308@sevenbyte.org> <5666D3AA.3070400@waldmann-edv.de>
Message-ID: <5666DA3E.2080208@sevenbyte.org>

Hi Thomas,

>> I am a bit confused about the changes.rst file. There is already a
>> 0.29 section, but there is no 0.29 Git tag yet.
>
> I am just preparing the 0.29 release notes.
>
> There can be no release without a git tag, as the release version
> number is automatically taken from the git tag.
>
> You can also look at the milestones on github; a release won't happen
> until everything in the milestone is done (or has been transferred to
> another milestone):
>
> https://github.com/borgbackup/borg/milestones
>
> Releases also show up there (and on pypi):
>
> https://github.com/borgbackup/borg/releases

thanks for your explanation!

On 08.12.2015 13:57, Thomas Waldmann wrote:
> But I can add a "(not released yet)" so no one gets confused.

That would be nice!

Thanks,
Stefan

From rumpelsepp at sevenbyte.org  Tue Dec  8 08:32:24 2015
From: rumpelsepp at sevenbyte.org (Stefan Tatschner)
Date: Tue, 8 Dec 2015 14:32:24 +0100
Subject: [Borgbackup] [PATCH] Fix wrong installation instructions
Message-ID: <1449581544-29591-1-git-send-email-rumpelsepp@sevenbyte.org>

- On Arch I don't want to perform a full system upgrade when
  installing a new package, so let's drop the "yu" part.

- On Debian it is "apt-get install" instead of "apt install".
---
 docs/installation.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/installation.rst b/docs/installation.rst
index 271fd08..5bb9771 100644
--- a/docs/installation.rst
+++ b/docs/installation.rst
@@ -32,9 +32,9 @@ yet.
 ============ ===================== =======
 Distribution Source                Command
 ============ ===================== =======
-Arch Linux   `[community]`_        ``pacman -Syu borg``
-Debian       `unstable/sid`_       ``apt install borgbackup``
-Ubuntu       `Xenial Xerus 16.04`_ ``apt install borgbackup``
+Arch Linux   `[community]`_        ``pacman -S borg``
+Debian       `unstable/sid`_       ``apt-get install borgbackup``
+Ubuntu       `Xenial Xerus 16.04`_ ``apt-get install borgbackup``
 OS X         `Brew cask`_          ``brew cask install borgbackup``
 ============ ===================== =======
--
2.6.3

From anarcat at debian.org  Tue Dec  8 14:08:55 2015
From: anarcat at debian.org (Antoine Beaupré)
Date: Tue, 08 Dec 2015 14:08:55 -0500
Subject: [Borgbackup] [PATCH] Fix wrong installation instructions
In-Reply-To: <1449581544-29591-1-git-send-email-rumpelsepp@sevenbyte.org>
References: <1449581544-29591-1-git-send-email-rumpelsepp@sevenbyte.org>
Message-ID: <87y4d4zrjs.fsf@marcos.anarc.at>

On 2015-12-08 08:32:24, Stefan Tatschner wrote:
> - On Arch I don't want to perform a full system upgrade when
>   installing a new package, so let's drop the "yu" part.
>
> - On Debian it is "apt-get install" instead of "apt install".

In Debian jessie and above, `apt install` actually works. This was
deliberate.

I didn't know about pacman, so that's a fine change. Can you reroll?

--
The difference between a democracy and a dictatorship is that in a
democracy you vote first and take orders later; in a dictatorship you
don't have to waste your time voting.
 - Charles Bukowski

From rumpelsepp at sevenbyte.org  Tue Dec  8 14:27:06 2015
From: rumpelsepp at sevenbyte.org (Stefan Tatschner)
Date: Tue, 8 Dec 2015 20:27:06 +0100
Subject: [Borgbackup] [PATCH] Fix wrong installation instructions
In-Reply-To: <87y4d4zrjs.fsf@marcos.anarc.at>
References: <1449581544-29591-1-git-send-email-rumpelsepp@sevenbyte.org>
 <87y4d4zrjs.fsf@marcos.anarc.at>
Message-ID: <56672F0A.6000608@sevenbyte.org>

On 08.12.2015 20:08, Antoine Beaupré wrote:
> In Debian jessie and above, `apt install` actually works. This was
> deliberate.
>
> I didn't know about pacman, so that's a fine change. Can you reroll?

I already had a discussion off-list with Thomas which resulted in this
commit:

https://github.com/borgbackup/borg/commit/f1b9b95e0deade9b2260135b169cdd9798d68164

Stefan

From adrian.klaver at aklaver.com  Fri Dec 11 11:05:46 2015
From: adrian.klaver at aklaver.com (Adrian Klaver)
Date: Fri, 11 Dec 2015 08:05:46 -0800
Subject: [Borgbackup] Mailing list visibility
Message-ID: <566AF45A.2020009@aklaver.com>

First, thanks for moving the list.

Per comments in this issue:

https://github.com/borgbackup/borg/issues/430

it seems there are questions about why there is no traffic on the new
list. So here is some. Repeating my comment to this issue:

https://github.com/borgbackup/borg/issues/467

It would help list visibility and traffic if this page:

https://borgbackup.readthedocs.org/en/stable/support.html

was updated to reflect the new list address and where to subscribe to it.
Thanks,

--
Adrian Klaver
adrian.klaver at aklaver.com

From rumpelsepp at sevenbyte.org  Sat Dec 12 11:50:59 2015
From: rumpelsepp at sevenbyte.org (Stefan Tatschner)
Date: Sat, 12 Dec 2015 17:50:59 +0100
Subject: [Borgbackup] Mailing list visibility
In-Reply-To: <566AF45A.2020009@aklaver.com>
References: <566AF45A.2020009@aklaver.com>
Message-ID: <8b5865f1d282e65dd61bb23e5f7722eb@sevenbyte.org>

On 2015-12-11 17:05, Adrian Klaver wrote:
> It would help list visibility and traffic if this page:
>
> https://borgbackup.readthedocs.org/en/stable/support.html
>
> was updated to reflect the new list address and where to subscribe to it.

It has been updated; it is just not visible in the stable revision of
the docs homepage:

http://borgbackup.readthedocs.org/en/latest/support.html

Stefan

From adrian.klaver at aklaver.com  Sat Dec 12 12:33:13 2015
From: adrian.klaver at aklaver.com (Adrian Klaver)
Date: Sat, 12 Dec 2015 09:33:13 -0800
Subject: [Borgbackup] Mailing list visibility
In-Reply-To: <8b5865f1d282e65dd61bb23e5f7722eb@sevenbyte.org>
References: <566AF45A.2020009@aklaver.com>
 <8b5865f1d282e65dd61bb23e5f7722eb@sevenbyte.org>
Message-ID: <566C5A59.6050905@aklaver.com>

On 12/12/2015 08:50 AM, Stefan Tatschner wrote:
> On 2015-12-11 17:05, Adrian Klaver wrote:
>> It would help list visibility and traffic if this page:
>>
>> https://borgbackup.readthedocs.org/en/stable/support.html
>>
>> was updated to reflect the new list address and where to subscribe to it.
>
> It has been updated; it is just not visible in the stable revision of
> the docs homepage:
> http://borgbackup.readthedocs.org/en/latest/support.html

See this issue:

https://github.com/borgbackup/borg/issues/467

> Stefan

--
Adrian Klaver
adrian.klaver at aklaver.com

From tw at waldmann-edv.de  Sun Dec 13 13:23:51 2015
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Sun, 13 Dec 2015 19:23:51 +0100
Subject: [Borgbackup] borgbackup 0.29.0 released
Message-ID: <566DB7B7.7030002@waldmann-edv.de>

See there:

https://github.com/borgbackup/borg/releases/tag/0.29.0

It's also available on PyPI (via pip install).

Cheers, Thomas

--
GPG ID: FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393

From billy at worldofbilly.com  Wed Dec 23 17:10:42 2015
From: billy at worldofbilly.com (Billy Charlton)
Date: Wed, 23 Dec 2015 14:10:42 -0800
Subject: [Borgbackup] borg create: it's really difficult to get wildcard
 expansions correct
Message-ID:

I'm having a lot of difficulty getting a borg create command written
just right for backing up my home directory. What seems like it should
be a simple and common task has taken me hours of fiddling, and it
still isn't correct. I'm wondering if the wildcard expansion logic
needs some work. I've been following
https://github.com/borgbackup/borg/issues/43 but I don't see a
solution yet.

Here's what I want to do:

- I want to back up everything in my home directory, except for my
dotfiles, dot folders, and one folder called "AppData".

- I don't want any of the leading folder names stored in the archive's
file paths, because I may be restoring to a different platform. My
laptop has my home folder in /home/billy, while my Windows desktop's
home is seen as /cygdrive/c/users/Billy. So it's best for me to leave
all that out and just back up from the relative safety of my home
folder.

- I can't rely on shell expansion because many of the folders and
filenames have spaces.

So, I feel like this command should work:

cd ~
borg create -v --exclude "./AppData" --exclude ".*" test::mybackup .
but that second exclude is apparently excluding ALL files. Why is that? Shouldn't it just exclude files and folders that begin with "."? - If I change it to --exclude "./.*" --> it excludes all files again, which makes no sense to me. - If I change it to --exclude "./.*/" --> it almost works: regular folders and files get archived; the *contents* of dot folders are not archived; but all dot files in the home folder as well as the dot folder names themselves (but not their contents) get archived. This all seems befuddling to me. Am I missing something? (This is using borg 0.29.0 on a cygwin install.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdc at uwo.ca Wed Dec 23 17:39:55 2015 From: jdc at uwo.ca (Dan Christensen) Date: Wed, 23 Dec 2015 17:39:55 -0500 Subject: [Borgbackup] borg create: it's really difficult to get wildcard expansions correct In-Reply-To: References: Message-ID: <87bn9gvlfo.fsf@uwo.ca> On Dec 23, 2015, Billy Charlton wrote: ... > Here's what I want to do: > - I want to back up everything in my home directory, except for my > dotfiles, dot folders, and one folder called "AppData". ... > So, I feel like this command should work: > > cd ~ > borg create -v --exclude "./AppData" --exclude ".*" test::mybackup . > > but that second exclude is apparently excluding ALL files. Why is that? > Shouldn't it just exclude files and folders that begin with "."? I believe that pattern is matching the root directory of your backup, so everything is getting excluded. Using --exclude ".[a-zA-Z0-9]*" seems to do the trick, assuming all of the files/dirs you need to exclude have a letter or number as the second character. Actually, this --exclude ".?*" is probably better, as it allows any second character. Incidentally, attic has paths of the form "./path/to/file" in this situation, so this is an incompatibility with attic that should be documented (if it isn't already). Dan From billy at worldofbilly.com Wed Dec 23 17:44:45 2015 From: billy at worldofbilly.com (Billy Charlton) Date: Wed, 23 Dec 2015 14:44:45 -0800 Subject: [Borgbackup] borg create: it's really difficult to get wildcard expansions correct In-Reply-To: <87bn9gvlfo.fsf@uwo.ca> References: <87bn9gvlfo.fsf@uwo.ca> Message-ID: Aha -- so ".*" is excluding "." because * matches zero or more characters. I probably knew that at some point. So, ".?*" should do the trick -- and now after testing, it does. Thank you! ..b On Wed, Dec 23, 2015 at 2:39 PM, Dan Christensen wrote: > On Dec 23, 2015, Billy Charlton wrote: > > ... > > > Here's what I want to do: > > - I want to back up everything in my home directory, except for my > > dotfiles, dot folders, and one folder called "AppData". > > ... > > > So, I feel like this command should work: > > > > cd ~ > > borg create -v --exclude "./AppData" --exclude ".*" test::mybackup . > > > > but that second exclude is apparently excluding ALL files. Why is that? > > Shouldn't it just exclude files and folders that begin with "."? > > I believe that pattern is matching the root directory of your backup, > so everything is getting excluded. > > Using > > --exclude ".[a-zA-Z0-9]*" > > seems to do the trick, assuming all of the files/dirs you need to > exclude have a letter or number as the second character. Actually, > this > > --exclude ".?*" > > is probably better, as it allows any second character. 
> Incidentally, attic has paths of the form "./path/to/file" in this
> situation, so this is an incompatibility with attic that should be
> documented (if it isn't already).
>
> Dan

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matze at trex-tec.de  Sun Dec 27 14:38:07 2015
From: matze at trex-tec.de (Matthias Rieche)
Date: Sun, 27 Dec 2015 20:38:07 +0100
Subject: [Borgbackup] Collect Backups
Message-ID:

Hey,

I have a question. I have the backup archives

/mnt/borg/::mon
/mnt/borg/::die
/mnt/borg/::mi
/mnt/borg/::do
/mnt/borg/::fr

Can I collect these five into, for example, /mnt/borg/::week50, so that
afterwards borg only has the single archive ::week50?

I saw someone with a Borg hoodie at the Congress. That's how I found
this project, and I think it's perfect for backing up my file servers. ;)

Greetings from Hamburg,
Matze