From melkor.lord at gmail.com  Sat Oct  1 14:25:46 2016
From: melkor.lord at gmail.com (Melkor Lord)
Date: Sat, 1 Oct 2016 20:25:46 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
Message-ID:

Hi,

Playing a bit with Borg made me think of something that could be very useful. It's just a rough idea, and it involves extending "stdin" via "borg serve".

We can already do something like:

    someapp | borg create /path/to/repo::name -

It would be really useful to extend that to "borg serve": take the output of an arbitrary command as the data to back up (stdout) and display any errors (stderr) should they occur.

Typically, this would be useful for remote backups of application data such as SQL dumps, LDAP dumps, whatever... Something like:

    borg create --remote-cmd user@host: -- mysqldump [options] db.sql

or

    borg create --remote-cmd user@host: -- somebigscriptdoingstuff

This would connect to "user@host", and "--remote-cmd" would instruct the remote "borg serve" process to execute whatever comes after "--" and retrieve stdout, stderr and also the returncode of the executed app.

If "--remote-cmd" is used, no path would be allowed after the ":" in the "user@host:" part, and "--" would be required. This would allow complex commands to be used without tricky shell escaping.

What do you think?

A nice addition to that would be an option like "--skip-cmd-fail" that would NOT create the backup (or create it and delete it afterwards, if I understand the way Borg currently works) if the returncode of the remote command is non-zero. Even better, for strange and unusual commands, "--skip-cmd-fail=1,2,3" would list the returncodes that trigger the non-creation of the backup.

This kind of feature would be a huge life saver for admins.

PS: I know we can currently do something like:

    ssh user@host "do-stuff" | borg create /path/to/repo::name -

but this lacks the more thorough controls borg could achieve by wrapping the whole process.

-- 
Unix _IS_ user friendly, it's just selective about who its friends are.

From tw at waldmann-edv.de  Sat Oct  1 14:43:54 2016
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Sat, 1 Oct 2016 20:43:54 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To:
References:
Message-ID: <771d30a8-4a03-dc55-2909-233c3d843428@waldmann-edv.de>

> PS: I know we can currently do something like:
>     ssh user@host "do-stuff" | borg create /path/to/repo::name -
> but this lacks the more thorough controls borg could achieve by wrapping
> the whole process.

I just wanted to suggest you do exactly that.

In general, borg does not try to be a shell / scripting thing / scheduler / etc.

Just using a shell, a scripting language, cron, etc. is usually better for this than reinventing it all inside borg.

-- 
GPG ID: 9F88FB52FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393

From wtraylor at areyouthinking.org  Sat Oct  1 14:45:39 2016
From: wtraylor at areyouthinking.org (Walker Traylor)
Date: Sun, 2 Oct 2016 01:45:39 +0700
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To:
References:
Message-ID: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>

"PS: I know we can currently do something like:
ssh user@host "do-stuff" | borg create /path/to/repo::name -
but this lacks the more thorough controls borg could achieve by wrapping
the whole process."

What more thorough controls do you require?
Walker Traylor
walker at walkertraylor.com

> On Oct 2, 2016, at 1:25 AM, Melkor Lord wrote:
> [...]

From tw at waldmann-edv.de  Sat Oct  1 17:30:43 2016
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Sat, 1 Oct 2016 23:30:43 +0200
Subject: [Borgbackup] borgbackup beta 1.1.0b2 released
Message-ID: <90fe9cd3-3fa6-4a1d-c36f-a36201ccf9b3@waldmann-edv.de>

https://github.com/borgbackup/borg/releases/tag/1.1.0b2

More details: see the URL above.

Cheers, Thomas

-- 
GPG ID: 9F88FB52FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393

From melkor.lord at gmail.com  Sat Oct  1 19:38:28 2016
From: melkor.lord at gmail.com (Melkor Lord)
Date: Sun, 2 Oct 2016 01:38:28 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
Message-ID:

On Sat, Oct 1, 2016 at 8:45 PM, Walker Traylor wrote:

> What more thorough controls do you require?

Quite simple: if "do-stuff" fails for some reason, there's a lot of work to determine how to suppress the "failed" backup that Borg is going to make anyway.

    ssh user@host "echo '' ; /bin/false" | borg create /path/to/repo::test -
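Checking afterwards shows nothing wrong, because the pipeline's exit status is borg's (illustrative session, not a real run):

    $ echo $?
    0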
See? Now there's a "test" backup which is empty and serves no purpose at all. Because of the pipe, I don't know that the ssh command failed. Borg didn't fail, but I have no way to distinguish a "legit but no new data" kind of "test" backup from a "failed, empty, useless data" one.

As I see it coming: NO, I will NOT use Bash for scripting just to get the $PIPESTATUS variable, because Bash isn't available everywhere. I love Bash for my everyday shell usage, but I never use it in scripts; I require /bin/sh (dash on Debian and derivatives) because POSIX scripts work the same way everywhere, without side effects (dash, busybox ash, whatever POSIX-compliant shell).
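For the record, the failure CAN be caught in plain POSIX sh with a status file - a rough, untested sketch ("do-stuff" and the paths are placeholders; mktemp is assumed to be available):

    st=$(mktemp) || exit 1
    { ssh user@host "do-stuff"; echo $? > "$st"; } | borg create /path/to/repo::name -
    if [ "$(cat "$st")" -ne 0 ]; then
        borg delete /path/to/repo::name    # drop the useless archive
    fi
    rm -f "$st"

But note the archive still gets created first and has to be deleted afterwards, which is exactly the dance a "--skip-cmd-fail" option would avoid.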
Borg (serve) wrapping the command could detect the failure (and all the cases, with --skip-cmd-fail=x,y,z) and notify the calling Borg to not create the backup.

Hope it's clear.

-- 
Unix _IS_ user friendly, it's just selective about who its friends are.

From melkor.lord at gmail.com  Sat Oct  1 19:42:39 2016
From: melkor.lord at gmail.com (Melkor Lord)
Date: Sun, 2 Oct 2016 01:42:39 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To: <771d30a8-4a03-dc55-2909-233c3d843428@waldmann-edv.de>
References: <771d30a8-4a03-dc55-2909-233c3d843428@waldmann-edv.de>
Message-ID:

On Sat, Oct 1, 2016 at 8:43 PM, Thomas Waldmann wrote:

> In general, borg does not try to be a shell / scripting thing /
> scheduler / etc.
>
> Just using a shell, a scripting language, cron, etc. is usually better
> for this than reinventing it all inside borg.

I know, and I like Borg for that. In this case it is more about "wrapping a process and monitoring its correct execution" than about providing scripting abilities or anything else out of Borg's scope. That's why I don't think this is reinventing the wheel - just trying to have "snow tires" for the winter :-)

-- 
Unix _IS_ user friendly, it's just selective about who its friends are.

From sitaramc at gmail.com  Sat Oct  1 22:17:11 2016
From: sitaramc at gmail.com (Sitaram Chamarty)
Date: Sun, 2 Oct 2016 07:47:11 +0530
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To:
References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
Message-ID: <02b02db8-6469-fb85-78c6-5408a9defef4@gmail.com>

On 10/02/2016 05:08 AM, Melkor Lord wrote:
> Quite simple: if "do-stuff" fails for some reason, there's a lot of work
> to determine how to suppress the "failed" backup that Borg is going to
> make anyway. [...]
> Borg (serve) wrapping the command could detect the failure (and all the
> cases, with --skip-cmd-fail=x,y,z) and notify the calling Borg to not
> create the backup.

Speaking as a borg user (i.e., not a borg developer), it seems to me that the output of the command, which could be arbitrarily large, needs to be preserved somewhere, somehow, until the exit happens, and then -- if the exit is good -- create the archive. I'm not a borg dev, and I can't speak for them, but it seems to me that is not something borg should be doing.

If I needed something like this, I'd make a wrapper that calls the command needed and uses a TMPDIR to buffer its output, *then* calls borg depending on the exit.

Another way is to write a wrapper that creates the archive anyway, but deletes it if the pipe fails. Here's something I cooked up in a few minutes. Sure it's kludgey, but it gets the job done.

(As for the bash rant; to each his own. I would suggest that any environment that *requires* such complex features should be able to install the right tools, but that's just my opinion. In any case, it sounds weird that you can install borg but not bash.)

    #!/bin/bash

    repo="$1"
    archive="$2"    # cannot be '{now}' or similar; sorry!

    if [ "$3" = "abort" ]
    then
        # second invocation: the pipe failed, delete the bad archive
        borg delete "$repo::$archive"
    else
        cmd="$3"
        # if the command fails, schedule a delayed self-call to delete
        # the archive that "borg create" is going to make anyway
        ( eval "$cmd" || ( ( sleep 1; "$0" "$repo" "$archive" abort ) >&2 & ) ) \
            | borg create "$repo::$archive" -
    fi
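Usage would be something like this (sketch; assumes the script above was saved as pipe-backup.sh and made executable):

    ./pipe-backup.sh /path/to/repo test 'ssh user@host "do-stuff"'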
"borg serve" running on "host" would then notify "borg create" if the returncode is non-zero (or another returncode if supported via the adequate option) to NOT create the backup and optionally return the contents of stderr of the failed process to display or mail to the admin. I hope it's clearer now. > If I needed something like this, I'd make a wrapper that calls the > command needed and uses a TMPDIR to buffer it's output, *then* call borg > depending on the exit. > This approach is impossible in the case of "borg serve". > Another way is to write a wrapper that creates the archive anyway, but > if the pipe fails, deletes it. That would be necessary in some cases yes. I have some in mind like making backups of MSSQL servers. There's no way to create the SQL dump and output to stdout so, a script would first create the SQL backup to a file and if everything is fine, "cat" the file to stdout for "borg create" to capture. > (As for the bash rant; to each his own. I would suggest that any > environment that *requires* such complex features should be able to > install the right tools, but that's just my opinion. In any case, it > sounds weird that you can install borg but not bash). > Where did you see any rant? I use and love Bash everyday on interactive shells but ALL scripts I write are in pure POSIX shell and I have yet to find a case where Bash is required over a pure POSIX shell. Believe me, I wrote highly complex apps in pure Shell script for the past 2 decades. As of the weirdness you imply in your last sentence, please don't draw conclusions when you don't have all the details. There are numerous situations where you can't install software on servers. They can be "certified" servers where no installation is allowed. They can be proprietary or limited systems/appliances where you have minimal shell support via SSH. Ever tried to install Bash on a VMware ESXi host? In all those cases, we could not install Borg either (well, using some cxfreeze magic could make it possible though) but these are special cases. -- Unix _IS_ user friendly, it's just selective about who its friends are. -------------- next part -------------- An HTML attachment was scrubbed... URL: From public at enkore.de Sun Oct 2 04:03:15 2016 From: public at enkore.de (Marian Beermann) Date: Sun, 2 Oct 2016 10:03:15 +0200 Subject: [Borgbackup] [IDEA] Add support for external commands In-Reply-To: References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org> Message-ID: On 02.10.2016 01:38, Melkor Lord wrote: > On Sat, Oct 1, 2016 at 8:45 PM, Walker Traylor > > wrote: > > "PS: I know we currently can do something like : > ssh user at host "do-stuff" | borg create /path/to/repo::name > but this lacks the more thorough controls borg could achieve by > wrapping all the process." > > What more thorough controls do you require? > > > Quite simple : if "do-stuff" fails for some reason, there's a lot of > work to determine how to suppress the "failed" backup that Borg's going > to make anyway. > > ssh user at host "echo '' ; /bin/false" | borg create /path/to/repo::test - > > see? Now there's a "test" backup which is empty and serves no purpose at > all. Because of the pipe, I don't know that the ssh command failed. Borg > didn't fail but I have no way to distinguish "test" between "legit but > no new data" from "failed empty useless data" type of backup. > That's a good point. I could imagine something like borg create /path/to/repo::test --from-command "ssh user at host ..." 
> (As for the bash rant; to each his own. I would suggest that any
> environment that *requires* such complex features should be able to
> install the right tools, but that's just my opinion. In any case, it
> sounds weird that you can install borg but not bash.)

Where did you see any rant? I use and love Bash every day in interactive shells, but ALL the scripts I write are in pure POSIX shell, and I have yet to find a case where Bash is required over a pure POSIX shell. Believe me, I have written highly complex apps in pure shell script over the past two decades.

As for the weirdness you imply in your last sentence, please don't draw conclusions when you don't have all the details. There are numerous situations where you can't install software on servers. They can be "certified" servers where no installation is allowed. They can be proprietary or limited systems/appliances where you have only minimal shell support via SSH. Ever tried to install Bash on a VMware ESXi host? In all those cases we could not install Borg either (well, some cxfreeze magic could make it possible, though), but these are special cases.

-- 
Unix _IS_ user friendly, it's just selective about who its friends are.

From public at enkore.de  Sun Oct  2 04:03:15 2016
From: public at enkore.de (Marian Beermann)
Date: Sun, 2 Oct 2016 10:03:15 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To:
References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
Message-ID:

On 02.10.2016 01:38, Melkor Lord wrote:
>     ssh user@host "echo '' ; /bin/false" | borg create /path/to/repo::test -
>
> See? Now there's a "test" backup which is empty and serves no purpose at
> all. Because of the pipe, I don't know that the ssh command failed. [...]

That's a good point.

I could imagine something like

    borg create /path/to/repo::test --from-command "ssh user@host ..."

which would execute that command (in a local shell) and put stdout into an archive, while leaving stderr connected to stderr. If the command/shell exits with a nonzero status, a rollback is made and no backup is created.

Cheers, Marian

From public at enkore.de  Sun Oct  2 04:07:08 2016
From: public at enkore.de (Marian Beermann)
Date: Sun, 2 Oct 2016 10:07:08 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To:
References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
Message-ID: <5b275fc8-15fb-ae6e-2c38-5076169b5a37@enkore.de>

On 02.10.2016 10:03, Marian Beermann wrote:
> I could imagine something like
>
>     borg create /path/to/repo::test --from-command "ssh user@host ..."
> [...]

What I don't quite understand yet is where borg serve comes into play. If I do, e.g., "ssh someone@somewhere false", then the error code is propagated to the host and ssh itself exits with it.
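For example (illustrative session):

    $ ssh someone@somewhere /bin/false; echo $?
    1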
So no specific handling for SSH would be needed, unless I'm overlooking something here.

Cheers, Marian

From melkor.lord at gmail.com  Sun Oct  2 11:17:22 2016
From: melkor.lord at gmail.com (Melkor Lord)
Date: Sun, 2 Oct 2016 17:17:22 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To: <5b275fc8-15fb-ae6e-2c38-5076169b5a37@enkore.de>
References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
	<5b275fc8-15fb-ae6e-2c38-5076169b5a37@enkore.de>
Message-ID:

On Sun, Oct 2, 2016 at 10:07 AM, Marian Beermann wrote:

> What I don't quite understand yet is where borg serve comes into play.
> If I do, e.g., "ssh someone@somewhere false", then the error code is
> propagated to the host and ssh itself exits with it. So no specific
> handling for SSH would be needed, unless I'm overlooking something here.

Isn't it the job of "borg serve" to perform the backup itself on the remote host and just feed "borg create" (the caller) back with backed-up data ready for storage? Or do I not understand the purpose of "borg serve" clearly? This way the strain is put on the hosts executing "borg serve", thus allowing the backup server running "borg create" to perform parallel backups (on different repos, of course) without hogging the CPU. That's why I was thinking this is a job for "borg serve".

Another benefit: when "borg serve" executes the command but it "fails" (non-zero returncode, or --skip-cmd-fail=1,2,n...), it can "notify" the caller (borg create) to not create the backup in the first place, instead of creating one and then rolling back. I bet the rollback is quite I/O-costly on already big repos.

-- 
Unix _IS_ user friendly, it's just selective about who its friends are.

From sitaramc at gmail.com  Sat Oct  8 07:06:39 2016
From: sitaramc at gmail.com (Sitaram Chamarty)
Date: Sat, 8 Oct 2016 16:36:39 +0530
Subject: [Borgbackup] bypassing "Cache newer than repo" error
Message-ID:

Hi

Some of my directories are backed up to two (in one case three) different external (USB) hard disks. When I finish with the first USB drive, unmount it, mount the next one, and try the backup, borg tells me the cache is newer.

I could not find anything about how to bypass this in the docs. I have now created a complicated system of separately maintaining the cache directories for each external disk (labelled in some way that correlates with the physical disk in question) and manually shuffling them around.

Any pointers would be appreciated.

regards
sitaram

PS: Yes, removing ~/.cache/borg works and is generally harmless. But that makes ALL the files show up as "A ...", whereas I like to eyeball the list to see if something got updated which I did not expect to be updated, based on what I have been working on since the last backup.

From public at enkore.de  Sat Oct  8 07:26:46 2016
From: public at enkore.de (Marian Beermann)
Date: Sat, 8 Oct 2016 13:26:46 +0200
Subject: [Borgbackup] bypassing "Cache newer than repo" error
In-Reply-To:
References:
Message-ID:

Hi Sitaram

This sounds like you created one repository and copied it to multiple drives/locations? In that case this is to be expected - Borg distinguishes different repositories by their ID, which is independent of the location and would be the same for these copied repositories.

You can change the repository ID in the "config" file of the repository (it's hex; keep it the same length), which separates the repositories.
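For example (sketch; the id value is made up, GNU coreutils/sed are assumed, and the exact config layout may differ between borg versions):

    $ cat /path/to/repo/config
    [repository]
    version = 1
    id = 3c4090f4...          <- 64 hex digits; change some of them
    ...

    # or generate a fresh random ID of the same length:
    $ new=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')
    $ sed -i "s/^id = .*/id = $new/" /path/to/repo/config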
Note: for encrypted repositories it's a very unsafe thing to have multiple independently updated copies of a repository; if they diverge (minutely different contents) and an attacker gains access to more than one copy, the privacy of the repository contents may be compromised.

Cheers, Marian

FAQ: http://borgbackup.readthedocs.io/en/stable/faq.html#can-i-copy-or-synchronize-my-repo-to-another-location

From sitaramc at gmail.com  Sat Oct  8 08:17:22 2016
From: sitaramc at gmail.com (Sitaram Chamarty)
Date: Sat, 8 Oct 2016 17:47:22 +0530
Subject: [Borgbackup] bypassing "Cache newer than repo" error
In-Reply-To:
References:
Message-ID: <777b7ccd-cb62-45d6-1c99-1bb9efb12730@gmail.com>

Hi Marian

Thanks for your quick reply; much appreciated.

On 10/08/2016 04:56 PM, Marian Beermann wrote:
> This sounds like you created one repository and copied it to multiple
> drives/locations?

I was going to say "no way", but it appears that is what I did. Now I also understand why it's happening only with one specific repo (a rather large one I got lazy about creating the first time, and simply did a copy of). I'd clean forgotten!

> You can change the repository ID in the "config" file of the repository
> (it's hex; keep it the same length), which separates the repositories.

I assume it has no other semantics, so I can just randomly change some hex digits into others?

> Note: for encrypted repositories it's a very unsafe thing to have
> multiple independently updated copies of a repository; [...]

I do have multiple independently updated copies of a repo, but -- other than this one, where #2 was created by a filesystem-level copy of #1 -- all the others were "borg init"-ed independently on each disk.

The passphrase I use is the same, but I assume the internal key was randomly generated each time and would not be the same, so the attack you speak of should not happen for those repos even if someone got hold of them.

Is that understanding correct?

Thanks again and best regards
sitaram
From public at enkore.de  Sat Oct  8 08:43:39 2016
From: public at enkore.de (Marian Beermann)
Date: Sat, 8 Oct 2016 14:43:39 +0200
Subject: [Borgbackup] bypassing "Cache newer than repo" error
In-Reply-To: <777b7ccd-cb62-45d6-1c99-1bb9efb12730@gmail.com>
References: <777b7ccd-cb62-45d6-1c99-1bb9efb12730@gmail.com>
Message-ID:

Hi Sitaram

On 08.10.2016 14:17, Sitaram Chamarty wrote:
> I assume it has no other semantics, so I can just randomly change some
> hex digits into others?

Yes.

Caveat: if you use key-file mode, you'll have to make the same change in the key files you use. The default location is ~/.config/borg/keys/. Every key file starts with "BORG_KEY <repository-id>"; that's where you need to make the change.

In repokey or unencrypted mode this doesn't matter.
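For example (sketch; the key file name and id are made up, GNU sed is assumed, $new is the value generated above):

    $ head -n 1 ~/.config/borg/keys/myhost_myrepo
    BORG_KEY 3c4090f4...      <- same 64 hex digits as in the repo config

    # keep the key file in sync with the id written into the repo config:
    $ sed -i "1s/^BORG_KEY .*/BORG_KEY $new/" ~/.config/borg/keys/myhost_myrepo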
> I do have multiple independently updated copies of a repo, but -- other
> than this one, where #2 was created by a filesystem-level copy of #1 --
> all the others were "borg init"-ed independently on each disk.
> [...]
> Is that understanding correct?

Repositories independently "borg init"-ed are really independent; no problems there.

But in the case where you "borg init" and then copy (cp -r / rsync / etc.) the repo, the keys will also be the same. This possibly leads to the situation where different data is encrypted with the same key, which is highly problematic.

Cheers, Marian

From sitaramc at gmail.com  Sat Oct  8 09:58:47 2016
From: sitaramc at gmail.com (Sitaram Chamarty)
Date: Sat, 8 Oct 2016 19:28:47 +0530
Subject: [Borgbackup] bypassing "Cache newer than repo" error
In-Reply-To:
References: <777b7ccd-cb62-45d6-1c99-1bb9efb12730@gmail.com>
Message-ID: <3693aa79-c2ed-edf7-303b-2f84662b8039@gmail.com>

Hi Marian,

On 10/08/2016 06:13 PM, Marian Beermann wrote:
> Caveat: if you use key-file mode, you'll have to make the same change in
> the key files you use. [...]

Thanks. I don't use keyfile mode, but it's good to know.

> But in the case where you "borg init" and then copy (cp -r / rsync /
> etc.) the repo, the keys will also be the same. This possibly leads to
> the situation where different data is encrypted with the same key, which
> is highly problematic.

Thanks for confirming my understanding. (Yeah, that one repo is at risk; I'll delete the two copies and start new, separate repos on each disk.)

Thanks again!
regards
sitaram

From tw at waldmann-edv.de  Mon Oct 17 12:03:28 2016
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Mon, 17 Oct 2016 18:03:28 +0200
Subject: [Borgbackup] borgbackup 1.0.8rc1
Message-ID: <7dc7552e-879d-a119-8b08-a0fad9730bac@waldmann-edv.de>

Released borgbackup 1.0.8rc1 just now.

https://github.com/borgbackup/borg/releases/tag/1.0.8rc1
https://github.com/borgbackup/borg/blob/1.0.8rc1/docs/changes.rst#version-108rc1-2016-10-17

It would be helpful if you tested this in practice, so anything not discovered by the unit tests can be fixed. The final 1.0.8 release is scheduled for 2016-10-29, so be quick.

-- 
GPG ID: 9F88FB52FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393

From public at enkore.de  Thu Oct 20 06:15:09 2016
From: public at enkore.de (Marian Beermann)
Date: Thu, 20 Oct 2016 12:15:09 +0200
Subject: [Borgbackup] What don't you like about Borg?
Message-ID:

It's easy to lose track of what's important and what annoys people, or doesn't work for them.

We all know about the good stuff Borg does, where it shines. I want to know about the bad stuff: where it's annoying, where it doesn't work like one wanted it to...

Cheers, Marian

-

Things that annoy me:

- No good desktop GUI (I'm not a good designer; my own attempt kinda failed).
- Sometimes it's slow, and it's hard to tell why without knowing a lot about the internals.
- Error messages are often kinda obscure.
- When used on the command line, progress output is often missing (in the current beta), even with --progress.

From pschiffe at redhat.com  Thu Oct 20 06:40:00 2016
From: pschiffe at redhat.com (Peter Schiffer)
Date: Thu, 20 Oct 2016 12:40:00 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References:
Message-ID:

I'm currently moving from borg to burp because I'm missing centralized management - managing clients from the server (what to back up, when, etc.).
Burp enables this, and with burp-ui it's possible to manage multiple burp servers and all of their clients from a single web UI: https://git.ziirish.me/ziirish/burp-ui

In borg I also miss native support for various remote locations, like S3, samba, FTP...

And a desktop GUI as well. Even the simple Deja Dup is just fine for regular users.

peter

From heiko.helmle at horiba.com  Thu Oct 20 07:40:43 2016
From: heiko.helmle at horiba.com (heiko.helmle at horiba.com)
Date: Thu, 20 Oct 2016 13:40:43 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References:
Message-ID:

> Things that annoy me:
>
> - No good desktop GUI [...]
> - When used on the command line, progress output is often missing (in
>   the current beta), even with --progress

Well, I guess borg is best run by a cron job, so a progress indicator and/or GUI doesn't matter much to me.

On the basic feature side, I'm missing an indicator that checks whether a file was changed during the backup (like tar has).

And different storage backends would be very nice - ssh/borg serve is good, but sometimes all you have is FTP (and ftpfs is ugly...).

Best Regards
Heiko

From public at enkore.de  Thu Oct 20 07:57:10 2016
From: public at enkore.de (Marian Beermann)
Date: Thu, 20 Oct 2016 13:57:10 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References:
Message-ID: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de>

There is something for changed files; IIRC it has been improved since 1.0.x. It looks like this in 1.0.x:

    $ mkdir files
    $ touch files/1
    $ touch files/2
    $ borg create repo::a1 files --list --filter AME -v
    A files/1
    A files/2
    $ touch files/2
    $ echo "change" > files/2
    $ touch files/3
    $ borg create repo::a2 files --list --filter AME -v
    A files/2
    A files/3

(A, M => added/modified, E => error, U => unchanged; see http://borgbackup.readthedocs.io/en/stable/usage.html#item-flags for all flags, and http://borgbackup.readthedocs.io/en/stable/faq.html#a-status-oddity in the FAQ for why files/2 is listed as "A" rather than "M".)

Cheers, Marian
From heiko.helmle at horiba.com  Thu Oct 20 07:59:57 2016
From: heiko.helmle at horiba.com (heiko.helmle at horiba.com)
Date: Thu, 20 Oct 2016 13:59:57 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de>
References: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de>
Message-ID:

Marian Beermann wrote on 20.10.2016 13:57:10:
> There is something for changed files; IIRC it has been improved since
> 1.0.x. [...]

That's not what I meant. tar prints a warning if the file changed _during_ the backup.

If another process writes to a file during the backup, the backed-up file is probably corrupt. That's why tar gives a warning in that case.

From public at enkore.de  Thu Oct 20 08:04:47 2016
From: public at enkore.de (Marian Beermann)
Date: Thu, 20 Oct 2016 14:04:47 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de>
Message-ID: <2650d102-fc67-e6ca-553f-5c52f24a13a9@enkore.de>

On 20.10.2016 13:59, heiko.helmle at horiba.com wrote:
> That's not what I meant. tar prints a warning if the file changed
> _during_ the backup.
>
> If another process writes to a file during the backup, the backed-up
> file is probably corrupt. That's why tar gives a warning in that case.

That would be very useful indeed. I looked up how tar does it:

    if ((timespec_cmp (get_stat_ctime (&final_stat), original_ctime) != 0
         /* Original ctime will change if the file is a directory and
            --remove-files is given */
         && !(remove_files_option && is_dir))
        || original_size < final_stat.st_size)
      {
        WARNOPT (WARN_FILE_CHANGED, (0, 0,
                 _("%s: file changed as we read it"),
                 quotearg_colon (p)));
        set_exit_status (TAREXIT_DIFFERS);
      }

It stores the ctime before reading the file and compares it again after it is done with the file. This works on *nix, since the c(hange)time is always updated by the kernel and can't be set by user-space processes.

Cheers, Marian

From wtraylor at areyouthinking.org  Thu Oct 20 08:22:33 2016
From: wtraylor at areyouthinking.org (Walker Traylor)
Date: Thu, 20 Oct 2016 19:22:33 +0700
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References:
Message-ID: <4CC6B42F-B0CD-440A-8C0F-FE47B8BA2B80@areyouthinking.org>

- Documentation: the getting-started guide could use improvement. I wanted to learn how to set borg up as fast as possible and had to learn a lot more about internals and command switches than should be necessary on a first try, for a 1.0 release and a normal use case.

- Repaired-files notice: I would like "borg list" to show which archives have been "repaired" (zero'd out) by borg check --repair.

- Improved ability to cancel and resume. I have to repair files a lot for some reason, probably on archives which were interrupted, which is troubling.

Walker Traylor
walker at walkertraylor.com
m: +1.703.389.4507  skype: wtraylor  linkedin.com/in/walkertraylor

From gait at ATComputing.nl  Thu Oct 20 08:02:49 2016
From: gait at ATComputing.nl (Gerrit A. Smit)
Date: Thu, 20 Oct 2016 14:02:49 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de>
Message-ID: <11fabffc-7a72-9c63-a6e7-04d98b21f875@ATComputing.nl>

On 20-10-16 at 13:59, heiko.helmle at horiba.com wrote:
> If another process writes to a file during the backup, the backed-up
> file is probably corrupt. That's why tar gives a warning in that case.

That's why I like to dump snapshots (ZFS, BtrFS).

Gerrit

From nl at nachtgeist.net  Thu Oct 20 08:52:52 2016
From: nl at nachtgeist.net (Daniel Reichelt)
Date: Thu, 20 Oct 2016 14:52:52 +0200
Subject: [Borgbackup] Fwd: Re: What don't you like about Borg?
In-Reply-To:
References:
Message-ID: <27f1c625-d6c1-3640-e498-e586c14cfae6@nachtgeist.net>

(manual forward, since I previously replied only to Marian...)

-------- Forwarded Message --------
Subject: Re: [Borgbackup] What don't you like about Borg?
Date: Thu, 20 Oct 2016 12:55:19 +0200
From: Marian Beermann
To: Daniel Reichelt

That annoyed me as well, but I couldn't come up with a good solution. Your idea looks like a very good match to me.

Cheers, Marian

On 20.10.2016 12:38, Daniel Reichelt wrote:
>> We all know about the good stuff Borg does, where it shines. I want to
>> know about the bad stuff. [...]
>
> Nice thinking :-)
>
> My $0.02 about annoyances:
>
> I think the handling of restores is pretty cumbersome.
Cheers, Marian From wtraylor at areyouthinking.org Thu Oct 20 08:22:33 2016 From: wtraylor at areyouthinking.org (Walker Traylor) Date: Thu, 20 Oct 2016 19:22:33 +0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: <4CC6B42F-B0CD-440A-8C0F-FE47B8BA2B80@areyouthinking.org> -Documentation, getting started guide could use improvement. I wanted to learn how to set it up as fast as possible and had to learn a lot more about internals and command switches than should be necessary on a first try for a 1.0 release with normal use case. -Repair files notice. I would like to see in "borg list" the archives which have been ?repaired? by borg repair (zero?d out.) -Improved ability to cancel and resume. I have to repair files a lot for some reason, probably on archives which were interrupted, which is troubling. Walker Traylor walker at walkertraylor.com m: +1.703.389.4507 skype: wtraylor linkedin.com/in/walkertraylor > On Oct 20, 2016, at 5:15 PM, Marian Beermann wrote: > > It's easy to lose track of what's important and what annoys people, or > doesn't work for them. > > We all know about the good stuff Borg does, where it shines. I want to > know about the bad stuff. Where it's annoying, doesn't work like one > wanted to... > > > Cheers, Marian > > - > > Things that annoy me: > > - No good desktop GUI (I'm not a good designer, my own attempt kinda > failed). > - Sometimes it's slow and it's hard to tell why without knowing a lot > about internals > - Error messages are often kinda obscure > - When used on the command line progress output is often missing (in > current beta) even with --progress > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup -------------- next part -------------- An HTML attachment was scrubbed... URL: From gait at ATComputing.nl Thu Oct 20 08:02:49 2016 From: gait at ATComputing.nl (Gerrit A. Smit) Date: Thu, 20 Oct 2016 14:02:49 +0200 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de> Message-ID: <11fabffc-7a72-9c63-a6e7-04d98b21f875@ATComputing.nl> Op 20-10-16 om 13:59 schreef heiko.helmle at horiba.com: > If another process writes to a file during backup the backupped file is probably corrupt. That's why tar gives a warning in that case. That's why I like to dump snapshots (ZFS, BtrFS). Gerrit From nl at nachtgeist.net Thu Oct 20 08:52:52 2016 From: nl at nachtgeist.net (Daniel Reichelt) Date: Thu, 20 Oct 2016 14:52:52 +0200 Subject: [Borgbackup] Fwd: Re: What don't you like about Borg? In-Reply-To: References: Message-ID: <27f1c625-d6c1-3640-e498-e586c14cfae6@nachtgeist.net> (manual fwrd since I previously replied only to Marian...) -------- Forwarded Message -------- Subject: Re: [Borgbackup] What don't you like about Borg? Date: Thu, 20 Oct 2016 12:55:19 +0200 From: Marian Beermann To: Daniel Reichelt That annoyed me as well, but I couldn't come up with a good solution. Your idea looks like a very good match to me. Cheers, Marian On 20.10.2016 12:38, Daniel Reichelt wrote: >> We all know about the good stuff Borg does, where it shines. I want to >> know about the bad stuff. Where it's annoying, doesn't work like one >> wanted to... > > Nice thinking :-) > > > My $0.02 about annoyances: > > I think the handling of restores is a pretty cumbersome. 
Most of the > time I know how old a version of a file/sub-tree I need to restore, but > most certainly I do not know the exact name of the archive that stuff is > stored in. > > Before I switched to borg, I used to restore from rdiff-backup's > "repository" (to stay with borg's terminology) with s.th. like > > rdiff-back -r 5d /path/to/repo/path/to/file > > which restored the subtree /path/to/file to CWD in the state is was > known to rdiff-backup 5 days ago. > > > Now with borg, I have to do a borg list /path/to/repo | grep > $someYear-$someMonth followed by mouse-selection, borg restore > /path/to/repo/$middleMouseClick and so on. > > Of course rdiff-backup and borg differ profoundly in the sense that > rdiff-backup sees a repository logically anchored to a fixed path which > is backed up whereas borg stores whatever was specified on the cmdline > to an archive. > > With that in mind it would really be nice to have something like > > borg restore --arch-prefix user-homes-host-0815 --as-of 5d /path/to/repo > /path/to/file-to-restore-1 /path2/to2/file-to-restore-2 > > which then would restore from repo the files file-to-restore-1 [and so > on] from the archives prefixed with user-homes-host-0815 and doing the > final selection of the archive to use as source by a time match, in this > case the latest archive that precedes the point in time [now - 5 days]. > > > What do you think? > > > Cheers > Daniel > > > > > >> >> >> Cheers, Marian >> >> - >> >> Things that annoy me: >> >> - No good desktop GUI (I'm not a good designer, my own attempt kinda >> failed). >> - Sometimes it's slow and it's hard to tell why without knowing a lot >> about internals >> - Error messages are often kinda obscure >> - When used on the command line progress output is often missing (in >> current beta) even with --progress >> >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup >> > From dsjstc at gmail.com Thu Oct 20 10:24:16 2016 From: dsjstc at gmail.com (DS Jstc) Date: Thu, 20 Oct 2016 07:24:16 -0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: <1ee06473-f530-1a6d-6245-f85cbc05097d@gmail.com> I'm a recent convert to borg from Crashplan. I'm really impressed with the simplicity and usability, particularly when paired with borgmatic (something like it should be made part of the core distribution). The only thing I'm really missing is a restore GUI. When all goes well, I don't interact with my backup system for months at a time. That means I'll have to re-learn the command syntax when the time comes to find and restore a file. Which is unfortunate -- when I'm restoring a file, I'm usually stressed and under time pressure. My ideal restore gui would have the following features: - very fast and simple pathname string filtering - filter with an OR list of several directories - filter with a date range - easy to show change history for file - easy to find other files sharing chunks with all prior versions of this file - drag to restore From dac at conceptual-analytics.com Thu Oct 20 10:39:33 2016 From: dac at conceptual-analytics.com (Dave Cottingham) Date: Thu, 20 Oct 2016 10:39:33 -0400 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: My use of borg has been limited to testing and playing with it, so my comments may be ill informed. But the reason I haven't used it for my backups is the lack of multiple backends. 
ssh is great in principle, but when it comes to buying online space for the backups, none of the more affordable options do ssh. Perhaps a documented backend API would enable the user community to help out with this. Thanks, Dave Cottingham On Thu, Oct 20, 2016 at 6:15 AM, Marian Beermann wrote: > It's easy to lose track of what's important and what annoys people, or > doesn't work for them. > > We all know about the good stuff Borg does, where it shines. I want to > know about the bad stuff. Where it's annoying, doesn't work like one > wanted to... > > > Cheers, Marian > > - > > Things that annoy me: > > - No good desktop GUI (I'm not a good designer, my own attempt kinda > failed). > - Sometimes it's slow and it's hard to tell why without knowing a lot > about internals > - Error messages are often kinda obscure > - When used on the command line progress output is often missing (in > current beta) even with --progress > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anarcat at debian.org Thu Oct 20 11:06:58 2016 From: anarcat at debian.org (Antoine =?utf-8?Q?Beaupr=C3=A9?=) Date: Thu, 20 Oct 2016 11:06:58 -0400 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: <87insnrre5.fsf@angela.anarc.at> On 2016-10-20 10:39:33, Dave Cottingham wrote: > My use of borg has been limited to testing and playing with it, so my > comments may be ill informed. > > But the reason I haven't used it for my backups is the lack of multiple > backends. ssh is great in principle, but when it comes to buying online > space for the backups, none of the more affordable options do ssh. > > Perhaps a documented backend API would enable the user community to help > out with this. A while back, I have tried to document the inner workings of Attic (later borg) in: http://borgbackup.readthedocs.io/en/latest/internals.html Arbitrary backend support is a hard problem, because the client-server architecture of borg is deeply coupled with the borg internals. I have looked at the RPC interface here: https://github.com/borgbackup/borg/issues/102#issuecomment-145749103 And it's obvious to me there is a lot of "intelligence" on the server-side, in fact, SSH is not merely a transport as much as a RPC conduit to allow borg to call itself on the remote end. See also: https://github.com/borgbackup/borg/issues/1070 This is also a frustrating blocker for me: there are very cheap backups providers out there that could be leveraged to provide virtually unlimited, secure backups to borg. Backblaze, for example, has ridiculous prices (50$/machine/year for unlimited backups, business use). In comparison, rsync.net, which supports borg, is 2.40$/GB/year, so you would get a measly 20GB a year with 50$... 
On top of this one, the things missing from borg are, for me: * a complete GUI (no restore, no desktop automation): https://github.com/borgbackup/borg/issues/314 * API stability commitment (there was a huge discussion about this, but it's still unclear if things will change under our feet, which makes it hard to commit to borg for larger deployments): https://github.com/borgbackup/borg/issues/26 * internationalization: https://github.com/borgbackup/borg/pull/305 * extensible snapshot support: https://github.com/borgbackup/borg/issues/983#issuecomment-222513148 * config files: https://github.com/borgbackup/borg/issues/315 See also this issue for a broader usability review: https://github.com/borgbackup/borg/issues/326 I recently did consulting for a community group here and couldn't honestly recommend using Borg because they would not be autonomous in restoring their backups, because they are not familiar with the command line. I am also worried about long-term stability for them and they needed low-cost offsite backups (that means not having to manage a server). Another example: in my previous job, config files, snapshot support and API stability would have been the issues. I still use borg for my personal use, but it would be great to push it forward to a greater public. I know this is a huge commitment and that brings a lot of support requests and further issues, but I believe the benefits are worth it. I wish I would have the feeling I could contribute to this within the borg project, but my efforts, so far, have been mostly met with refusal. A. -- Les plus beaux chants sont les chants de revendications Le vers doit faire l'amour dans la t?te des populations. ? l'?cole de la po?sie, on n'apprend pas: on se bat! - L?o Ferr?, "Pr?face" From public at enkore.de Thu Oct 20 11:51:59 2016 From: public at enkore.de (Marian Beermann) Date: Thu, 20 Oct 2016 17:51:59 +0200 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <87insnrre5.fsf@angela.anarc.at> References: <87insnrre5.fsf@angela.anarc.at> Message-ID: <9593548c-05c9-4337-f222-38a1bf91434a@enkore.de> On 20.10.2016 17:06, Antoine Beaupr? wrote: > On 2016-10-20 10:39:33, Dave Cottingham wrote: >> My use of borg has been limited to testing and playing with it, so my >> comments may be ill informed. >> >> But the reason I haven't used it for my backups is the lack of multiple >> backends. ssh is great in principle, but when it comes to buying online >> space for the backups, none of the more affordable options do ssh. >> >> Perhaps a documented backend API would enable the user community to help >> out with this. > > A while back, I have tried to document the inner workings of Attic > (later borg) in: > > http://borgbackup.readthedocs.io/en/latest/internals.html > > Arbitrary backend support is a hard problem, because the client-server > architecture of borg is deeply coupled with the borg internals. I have > looked at the RPC interface here: > > https://github.com/borgbackup/borg/issues/102#issuecomment-145749103 > > And it's obvious to me there is a lot of "intelligence" on the > server-side, in fact, SSH is not merely a transport as much as a RPC > conduit to allow borg to call itself on the remote end. I'm not sure if I ever wrote it on GitHub; personally I think the Repository API is a relatively good API to implement other backends, when an "named object API" is added, which would be used for special cases like repokey and manifest storage, since these would usually require special handling by backends. 
Then something like pluggy could be used to provide a simple plug-and-play mechanism for repository backends with named (versioned) APIs. These should still be very stable. Cheers, Marian > See also: > > https://github.com/borgbackup/borg/issues/1070 > > This is also a frustrating blocker for me: there are very cheap backup > providers out there that could be leveraged to provide virtually > unlimited, secure backups to borg. Backblaze, for example, has > ridiculously low prices ($50/machine/year for unlimited backups, business > use). In comparison, rsync.net, which supports borg, is $2.40/GB/year, > so you would get a measly 20GB a year with $50... rsync.net's price is $0.36/GB/year: http://www.rsync.net/products/attic.html It's still not the cheapest option, but personally I'm satisfied with their service. TN: I'm not paid / I don't receive any benefits for talking about rsync.net Cheers, Marian From anarcat at debian.org Thu Oct 20 12:14:53 2016 From: anarcat at debian.org (Antoine Beaupré) Date: Thu, 20 Oct 2016 12:14:53 -0400 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <9593548c-05c9-4337-f222-38a1bf91434a@enkore.de> References: <87insnrre5.fsf@angela.anarc.at> <9593548c-05c9-4337-f222-38a1bf91434a@enkore.de> Message-ID: <87funrro8y.fsf@angela.anarc.at> On 2016-10-20 11:51:59, Marian Beermann wrote: >> And it's obvious to me there is a lot of "intelligence" on the >> server side; in fact, SSH is not so much a transport as an RPC >> conduit that allows borg to call itself on the remote end. > > I'm not sure if I ever wrote it on GitHub; personally I think the > Repository API is a relatively good API for implementing other backends, > once a "named object API" is added, which would be used for special > cases like repokey and manifest storage, since these usually > require special handling by backends. Yeah, I was wondering if there was a way to separate the storage API from the RemoteRepository API... but I am not sure it would work. > Then something like pluggy could be used to provide a simple > plug-and-play mechanism for repository backends with named (versioned) APIs. > > These should still be very stable. I am not sure I would bother with a third-party plugin module... Just class derivation and module discovery should be sufficient. >> See also: >> >> https://github.com/borgbackup/borg/issues/1070 >> >> This is also a frustrating blocker for me: there are very cheap backup >> providers out there that could be leveraged to provide virtually >> unlimited, secure backups to borg. Backblaze, for example, has >> ridiculously low prices ($50/machine/year for unlimited backups, business >> use). In comparison, rsync.net, which supports borg, is $2.40/GB/year, >> so you would get a measly 20GB a year with $50... > > rsync.net's price is $0.36/GB/year: > http://www.rsync.net/products/attic.html > > It's still not the cheapest option, but personally I'm satisfied with > their service. That's pretty cool, I didn't know that! I was just looking at: http://rsync.net/pricing.html > TN: I'm not paid / I don't receive any benefits for talking about rsync.net The same goes for me and other providers. A. -- Premature optimization is the root of all evil - Donald Knuth From aperucchi at jahia.com Thu Oct 20 12:33:41 2016 From: aperucchi at jahia.com (Alessandro Perucchi) Date: Thu, 20 Oct 2016 18:33:41 +0200 Subject: [Borgbackup] What don't you like about Borg? Message-ID: Hello, for me that's simple: handling of locks.
When I am doing a backup, I should still be able to do read-only operations. At least on other backups, not on the one currently being made. Or simply get the list of backups already done. Or the list of files in a backup... even if a backup/restore is currently running. What would be great is a finer-grained lock, so we could do a restore at the same time as a backup is running. At the moment, every time I do a backup, I check it afterwards... and sometimes the check takes a long, long time... and if I need to do something, then I need to kill the process. :-/ Kind regards, Alessandro From lists.borg at pjw.xsmail.com Thu Oct 20 19:25:20 2016 From: lists.borg at pjw.xsmail.com (lists.borg at pjw.xsmail.com) Date: Thu, 20 Oct 2016 17:25:20 -0600 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <9593548c-05c9-4337-f222-38a1bf91434a@enkore.de> References: <87insnrre5.fsf@angela.anarc.at> <9593548c-05c9-4337-f222-38a1bf91434a@enkore.de> Message-ID: <1477005920.1826288.762551913.7F26BAFC@webmail.messagingengine.com> On Thu, Oct 20, 2016, at 09:51 AM, Marian Beermann wrote: > On 2016-10-20 10:39:33, Dave Cottingham wrote: > > This is also a frustrating blocker for me: there are very cheap backup > > providers out there that could be leveraged to provide virtually > > unlimited, secure backups to borg. Backblaze, for example, has > > ridiculously low prices ($50/machine/year for unlimited backups, business > > use). In comparison, rsync.net, which supports borg, is $2.40/GB/year, > > so you would get a measly 20GB a year with $50... > > rsync.net's price is $0.36/GB/year: > http://www.rsync.net/products/attic.html That's 3 cents per GB/mo. 50GB costs me $18/yr. Do be aware rsync.net defaults to borg v0.29. For current borg (1.0.7) use --remote-path /usr/local/bin/borg1/borg1 or set BORG_REMOTE_PATH to the same. From tmhikaru at gmail.com Fri Oct 21 00:49:18 2016 From: tmhikaru at gmail.com (tmhikaru at gmail.com) Date: Thu, 20 Oct 2016 21:49:18 -0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: <20161021044918.GA5752@raspberrypi> The lack of centralized management of borg was a big problem for me as well, though I tried working around it in two different ways. The first was to have the server that ran the backup script log in to the remote client as root with an ssh key and run borg there to connect back to the server. While this allowed me to centrally script the backup of all machines remotely, it still relied on the remote client's CPU and memory to do the majority of the work, and required that the client machine have borg installed. Of my machines, my server is many times stronger than the clients it is backing up, and this is unfortunately exactly backwards from the way borg seems to be intended to be used. The other way I attempted to work around it was to use sshfs mounts on the server to access all of the filesystems on the remote machines. This worked nearly perfectly, except that I had to manually blacklist mountpoints on the remote machines such as /proc/. One major disadvantage is that sshfs did not, and may still not, support xattrs, so for instance this method cannot be used to back up an SELinux system properly.
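A sketch of this pull-over-sshfs approach (hostnames and paths hypothetical; as noted, xattrs will not survive the sshfs hop):

    # mount the client read-only, back it up with the server's borg, unmount
    mkdir -p /mnt/client
    sshfs -o ro root@client:/ /mnt/client
    borg create --exclude /mnt/client/proc --exclude /mnt/client/sys \
        /srv/backups/client::$(date +%Y-%m-%d) /mnt/client
    fusermount -u /mnt/client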
This had the huge advantage of using the CPU and RAM of the server that stored the backup to compress the data being read from the client side, and it also didn't require the client to have borg installed - both of the machines I used had no distro support for borg, and one of them didn't even have Python 3. Although sshfs adds a bit of overhead, the result was a 2-3x faster backup of the same data from one machine, and another machine that was having trouble running the borg client at all was able to use this method to back up its data flawlessly. Yet another plus in its favor is that it did not require the client to be able to log in to the server's storage; it only required that the client allow a keyed ssh login as root, which let the server mount its root via sshfs. Unfortunately, although this second approach worked MUCH better, the showstopper for me was realizing that the machine that was having trouble running the client at all would likely be unable to restore files from a backup made to the server. Even though I was able to do the backups using sshfs on the server to get past the problem of the borg client hanging forever, this would not help in the case where I'd want to connect the backup drive directly to the weak machine and restore files. Its hardware is simply incapable of running borg reliably. And that's the other elephant in the room - borg requires too much memory and CPU on the client end compared to the other backup solutions I have used. In the past I used rsync to do incremental backups of my multiple machines. Since my time with borg, I've returned to rsync. I really liked borg's deduplication, as it saved quite a lot of space with multiple machines sharing a single repository. Given this experience, I investigated and have since been using a hardlink program that handles things rsync does not in incremental backups, such as file renames, and that also finds duplicates across backups from separate machines. Rsync is an imperfect solution - for example, it deliberately does not support system-level SELinux xattrs, and fairly annoying xattr bugs crop up from time to time - but aside from that it works across all of my machines and is supported in each distro. Compared to borg, rsync requires nearly no CPU and memory, and can be used from a central server to orchestrate backups with just an ssh key login to the remote client. Restoring files from an rsynced backup is often literally as simple as mounting the backup drive and running cp -a... All in all, I simply do not meet the requirements of using borg and gave up trying to find a way. Tim McGrath On Thu, Oct 20, 2016 at 12:40:00PM +0200, Peter Schiffer wrote: > I'm currently moving from borg to burp because I'm missing centralized > management - managing clients from the server (what to backup, when, etc). Burp > enables this, and with burp-ui it's possible to manage multiple burp servers > and all of their clients from a single web ui: > https://git.ziirish.me/ziirish/burp-ui > > In borg I also miss native support of various remote locations, like S3, > samba, ftp.. > > And a desktop GUI as well. Even simple deja dup is just fine for regular > users.. > > peter > -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 465 bytes Desc: Digital signature URL: From jungleboogie0 at gmail.com Fri Oct 21 11:44:54 2016 From: jungleboogie0 at gmail.com (jungle Boogie) Date: Fri, 21 Oct 2016 08:44:54 -0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <87insnrre5.fsf@angela.anarc.at> References: <87insnrre5.fsf@angela.anarc.at> Message-ID: On 20 October 2016 at 08:06, Antoine Beaupré wrote: > I recently did consulting for a community group here and couldn't > honestly recommend Borg: they are not familiar with the command line, > so they would not be autonomous in restoring their backups. I am also > worried about long-term stability for them, and they needed low-cost > offsite backups (which means not having to manage a server). Another > example: in my previous job, config files, snapshot support and API > stability would have been the issues. > > I still use borg personally, but it would be great to push it > forward to a greater public. I know this is a huge commitment that > brings a lot of support requests and further issues, but I believe the > benefits are worth it. I wish I felt I could contribute to this within > the borg project, but my efforts, so far, have been mostly met with > refusal. As far as I'm concerned, the greater public now means people who do their computing on tablets and smartphones. Borg would never work on those devices. I don't think borg is to blame for the lack of integration with services like dropbox, jungle disk, amazon s3, backblaze, of which the greater public is likely only familiar with dropbox. Why would those companies want someone else's code running on their infrastructure? It's nice that rsync.net offers a service with borg, but what version is it? How fast will they update it to get the latest features and important bug fixes? I had jungledisk for about three years and the client was never once updated; I think towards the end of my service with them they were updating their website to support stronger TLS ciphers and then were going to roll out a new client. The greater public doesn't care about that. If we're assuming the greater public are folks who use computers and laptops, you're right, they probably won't want command-line stuff and setting things up with cronjobs. But again, they don't care what the app is as long as it helps them feel their backups are safe. They likely won't care about encrypted backups and strong ciphers. They'll likely choose a backup plan that has the best prices on Black Friday, and they may not even renew the following year. That said, I look forward to the improvements that come to borg! -- ------- inum: 883510009027723 sip: jungleboogie at sip2sip.info From jungleboogie0 at gmail.com Fri Oct 21 11:57:22 2016 From: jungleboogie0 at gmail.com (jungle Boogie) Date: Fri, 21 Oct 2016 08:57:22 -0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: On 20 October 2016 at 03:15, Marian Beermann wrote: > Things that annoy me: > > - No good desktop GUI (I'm not a good designer, my own attempt kinda > failed). > - Sometimes it's slow and it's hard to tell why without knowing a lot > about internals > - Error messages are often kinda obscure > - When used on the command line progress output is often missing (in > current beta) even with --progress I think a desktop UI is more practical than a web UI.
If borg were to have a web UI and some kind of centralized management, I wouldn't want it without TLS, especially for public internet usage. So that means a self-signed cert or bundling with letsencrypt. This means the web UI would be running as root to access those lower ports. Some people may be fine with borg running as root. Can borg output its updates to files? I have it set up as a cron job, and the output sent to me via email is a little hard to read. What does this mean: U /var/unbound/unbound.conf I haven't updated that file in a while, but I have it backed up all the time. -- ------- inum: 883510009027723 sip: jungleboogie at sip2sip.info From public at enkore.de Fri Oct 21 12:10:47 2016 From: public at enkore.de (Marian Beermann) Date: Fri, 21 Oct 2016 18:10:47 +0200 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: On 21.10.2016 17:57, jungle Boogie wrote: > On 20 October 2016 at 03:15, Marian Beermann wrote: >> Things that annoy me: >> >> - No good desktop GUI (I'm not a good designer, my own attempt kinda >> failed). >> - Sometimes it's slow and it's hard to tell why without knowing a lot >> about internals >> - Error messages are often kinda obscure >> - When used on the command line progress output is often missing (in >> current beta) even with --progress > > I think a desktop UI is more practical than a web UI. If borg were to > have a web UI and some kind of centralized management, I wouldn't want > it without TLS, especially for public internet usage. So that means a > self-signed cert or bundling with letsencrypt. This means the web UI > would be running as root to access those lower ports. Some people may > be fine with borg running as root. > > Can borg output its updates to files? I have it set up as a cron job, > and the output sent to me via email is a little hard to read. What > does this mean: > U /var/unbound/unbound.conf > > I haven't updated that file in a while, but I have it backed up all the time. > You can use --filter to filter out Unchanged files. Since Borg accepts external logging configurations (BORG_LOGGING_CONF, https://docs.python.org/3/library/logging.config.html#configuration-file-format ) it's relatively easy to separate its output. In 1.1.x, things like file listings go to a different, named logger than normal output does, for example. This can be used to largely avoid monkeypatching for a GUI. Cheers, Marian
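For reference, a minimal sketch of such a logging configuration (the path and log file name are hypothetical; the format is Python's standard fileConfig, per the link above):

    # contents of ~/.config/borg/logging.conf
    [loggers]
    keys=root
    [handlers]
    keys=file
    [formatters]
    keys=plain
    [logger_root]
    level=INFO
    handlers=file
    [handler_file]
    class=FileHandler
    args=('/var/log/borg.log',)
    formatter=plain
    [formatter_plain]
    format=%(asctime)s %(levelname)s %(name)s %(message)s

and then, in the cron job's environment:

    export BORG_LOGGING_CONF=~/.config/borg/logging.conf

With that in place, borg's output lands in the log file instead of the mailed cron output.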
From public at enkore.de Fri Oct 21 12:12:55 2016 From: public at enkore.de (Marian Beermann) Date: Fri, 21 Oct 2016 18:12:55 +0200 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: <87insnrre5.fsf@angela.anarc.at> Message-ID: <25ad8422-919a-f5f4-bf42-1f012f5a1c35@enkore.de> rsync.net: "borg" still refers to 0.xx, but "borg1" is the latest 1.0.x release. I'm not sure how fast exactly they updated it to 1.0.7, but I think less than a week. AFAIK they don't use the binaries Thomas makes, but compile their own (the file size differs from what's released). BORG_REMOTE_PATH=borg1 and it works. > As far as I'm concerned, the greater public now means people who do > their computing on tablets and smartphones. Borg would never work on > those devices. I think textshell managed to run it on Android and even back up the root FS, but I somewhat doubt the viability of a working restore. It might work for documents and stuff like that, but probably not for a system backup. In either case it's cumbersome to use compared to an Android app (which I feel are also cumbersome to use, but that's another matter). Cheers, Marian From jungleboogie0 at gmail.com Wed Oct 26 19:51:26 2016 From: jungleboogie0 at gmail.com (jungle Boogie) Date: Wed, 26 Oct 2016 16:51:26 -0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <25ad8422-919a-f5f4-bf42-1f012f5a1c35@enkore.de> References: <87insnrre5.fsf@angela.anarc.at> <25ad8422-919a-f5f4-bf42-1f012f5a1c35@enkore.de> Message-ID: On 21 October 2016 at 09:12, Marian Beermann wrote: > rsync.net: "borg" still refers to 0.xx, but "borg1" is the latest 1.0.x > release. I'm not sure how fast exactly they updated it to 1.0.7, but I > think less than a week. AFAIK they don't use the binaries Thomas makes, > but compile their own (the file size differs from what's released). > > BORG_REMOTE_PATH=borg1 and it works. > That's good to hear about rsync. That's worth a shot with them! What does that variable do? > >> As far as I'm concerned, the greater public now means people who do >> their computing on tablets and smartphones. Borg would never work on >> those devices. > > I think textshell managed to run it on Android and even back up the root > FS, but I somewhat doubt the viability of a working restore. > It might work for documents and stuff like that, but probably not for a > system backup. In either case it's cumbersome to use compared to an > Android app (which I feel are also cumbersome to use, but that's another > matter). > So borg can work on those devices. I shouldn't have said never (sorry for the double negative ;)) > Cheers, Marian -- ------- inum: 883510009027723 sip: jungleboogie at sip2sip.info From public at enkore.de Thu Oct 27 12:38:53 2016 From: public at enkore.de (Marian Beermann) Date: Thu, 27 Oct 2016 18:38:53 +0200 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: <87insnrre5.fsf@angela.anarc.at> <25ad8422-919a-f5f4-bf42-1f012f5a1c35@enkore.de> Message-ID: <26bb7618-6e2e-16ab-50aa-22fcae66ede7@enkore.de> On 27.10.2016 01:57, jungle Boogie wrote: > On 26 October 2016 at 16:39, jungle Boogie > wrote: >>> You can use --filter to filter out Unchanged files. >>> >> >> Can you give an example of that? I don't know what STATUSCHARS I need >> to list files that have recently changed. > > > I think I got this working with --filter=AME > > What does AME represent and where can I see that? > Sorry, forgot to add a link: http://borgbackup.readthedocs.io/en/stable/usage.html#item-flags > That's good to hear about rsync. That's worth a shot with them! > What does that variable do? It has Borg use the 1.x release installed on rsync.net's servers; the command "borg" refers to 0.xx on their servers. Cheers, Marian
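Putting those pieces together, a cron-friendly invocation (repo path hypothetical) that lists only the interesting items and drops the U lines:

    borg create --list --filter=AME /path/to/repo::$(date +%Y-%m-%d) /home /etc

Per the item-flags page linked above, A marks an added file, M a modified one, E an error, and U an unchanged file.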
From gmatht at gmail.com Fri Oct 28 04:30:17 2016 From: gmatht at gmail.com (John McCabe-Dansted) Date: Fri, 28 Oct 2016 16:30:17 +0800 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <26bb7618-6e2e-16ab-50aa-22fcae66ede7@enkore.de> References: <87insnrre5.fsf@angela.anarc.at> <25ad8422-919a-f5f4-bf42-1f012f5a1c35@enkore.de> <26bb7618-6e2e-16ab-50aa-22fcae66ede7@enkore.de> Message-ID: Personal Peeves: 1. I often end up with stale lock files. 2. Not well suited to multiple clients backing up to a single deduplicated repository. 3. No Partclone integration for storing sparse disk images. I think (3) is more the responsibility of Partclone, since if Partclone had a way of dd'ing images or mounting partclone backups, it would be trivial for Borg to support it. On 28 October 2016 at 00:38, Marian Beermann wrote: > On 27.10.2016 01:57, jungle Boogie wrote: > > On 26 October 2016 at 16:39, jungle Boogie > wrote: > >>> You can use --filter to filter out Unchanged files. > >>> > >> > >> Can you give an example of that? I don't know what STATUSCHARS I need > >> to list files that have recently changed. > > > > > > I think I got this working with --filter=AME > > > > What does AME represent and where can I see that? > > > > Sorry, forgot to add a link: > http://borgbackup.readthedocs.io/en/stable/usage.html#item-flags > > > That's good to hear about rsync. That's worth a shot with them! > > What does that variable do? > > It has Borg use the 1.x release installed on rsync.net's servers; the > command "borg" refers to 0.xx on their servers. > > Cheers, Marian > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > -- John C. McCabe-Dansted -------------- next part -------------- An HTML attachment was scrubbed... URL: From tw at waldmann-edv.de Sat Oct 29 07:37:02 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sat, 29 Oct 2016 13:37:02 +0200 Subject: [Borgbackup] borgbackup 1.0.8 released Message-ID: https://github.com/borgbackup/borg/releases/tag/1.0.8 Bug fixes, please upgrade. More details: see URL above. Cheers, Thomas -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From giovanni at panozzo.it Sun Oct 30 14:27:17 2016 From: giovanni at panozzo.it (Giovanni Panozzo) Date: Sun, 30 Oct 2016 19:27:17 +0100 Subject: [Borgbackup] Chunk size experiments Message-ID: Hello to all, I'm new to this ML and to borg. I started using it on small servers; I liked it, and now I would like to use it on bigger fileservers too. The problem: I'm running a FreeNAS (FreeBSD 10) 5.2 TB fileserver, and I'm testing borg on it. I started with only a subdirectory of 2.1 TB of 500868 files, with --chunker-params 19,23,21,4095 and here are the results: - Time taken: 20h 44mins 52sec - Data usage after dedup: 1.76 TB - Chunks: 1193475 Time is OK, and deduplication is also very good. But chunks... a million chunks is twice the number of files. Is that OK? It took half an hour to delete a repository with 119347 files. I know, deleting a repo is not a common operation, so it may not be very important. I tried to raise some chunker-params values, but it seems that 23 is the maximum value I can have as CHUNK_MAX_EXP... Any suggestions? Thank you. Borg backup is a great tool! From public at enkore.de Sun Oct 30 15:55:27 2016 From: public at enkore.de (Marian Beermann) Date: Sun, 30 Oct 2016 20:55:27 +0100 Subject: [Borgbackup] Chunk size experiments In-Reply-To: References: Message-ID: <340e5620-bfea-5df8-932c-6ca10cdeef3e@enkore.de> On 30.10.2016 19:27, Giovanni Panozzo wrote: > Hello to all, I'm new to this ML and to borg. Hi :) > I started using it on small servers; I liked it, and now I would like to > use it on bigger fileservers too. > > The problem: > I'm running a FreeNAS (FreeBSD 10) 5.2 TB fileserver, and I'm testing > borg on it.
> > I started with only a subdirectory of 2.1 TB of 500868 files, > with --chunker-params 19,23,21,4095 > and here are the results: > > - Time taken: 20h 44mins 52sec > - Data usage after dedup: 1.76 TB > - Chunks: 1193475 > > Time is OK, and deduplication is also very good. > > But chunks... a million chunks is twice the number of files. > Is that OK? With 19,23,21 the target average chunk size is 2^21 bytes = 2 MB. 2.1 TB / 500k files is ~4.2 MB, so that's totally fine. > It took half an hour to delete a repository with 119347 > files. I know, deleting a repo is not a common operation, so it may > not be very important. Thanks for that information. I looked into it and it's a defect in the Python standard library. For now I created a Borg ticket: https://github.com/borgbackup/borg/issues/1776 > I tried to raise some chunker-params values, but it seems that 23 is the > maximum value I can have as CHUNK_MAX_EXP... Correct. This is a conscious limitation, because it makes it easier to know a hard upper bound on the chunk sizes that can appear. Larger chunk sizes would make very little difference, if any. > Any suggestions? A change we're making for 1.1, which also mostly works OK with 1.0, is to increase the segment size in the repository config from 5 MB to 500 MB. Things like prune/delete will go slower in 1.0, though. But writing should be a fair amount faster, if it's not CPU-limited (but your numbers above look like it is). Cheers, Marian From giovanni at panozzo.it Sun Oct 30 16:07:40 2016 From: giovanni at panozzo.it (Giovanni Panozzo) Date: Sun, 30 Oct 2016 21:07:40 +0100 Subject: [Borgbackup] Chunk size experiments In-Reply-To: <340e5620-bfea-5df8-932c-6ca10cdeef3e@enkore.de> References: <340e5620-bfea-5df8-932c-6ca10cdeef3e@enkore.de> Message-ID: > > With 19,23,21 the target average chunk size is 2^21 bytes = 2 MB. 2.1 TB > / 500k files is ~4.2 MB, so that's totally fine. Thank you. > >> It took half an hour to delete a repository with 119347 >> files. I know, deleting a repo is not a common operation, so it may >> not be very important. My fault: I forgot one digit during copy&paste :( they are 1193475 I think the slowness could also be a filesystem issue; I'm using ZFS, and a million files to delete can be a big deal. But feel free to improve it on the Python side, and, as we agree, deleting a repository is not a common operation. > A change we're making for 1.1, which also mostly works OK with 1.0, is > to increase the segment size in the repository config from 5 MB to 500 > MB. Things like prune/delete will go slower in 1.0, though. But writing > should be a fair amount faster, if it's not CPU-limited (but your > numbers above look like it is). Thank you. I will experiment with 1.1 when available :) From archi.laurent at gmail.com Fri Nov 11 05:18:28 2016 From: archi.laurent at gmail.com (Laurent Archi) Date: Fri, 11 Nov 2016 11:18:28 +0100 Subject: [Borgbackup] Borg return status (RC in mode full) Message-ID: Hi, In the documentation (official site), borg with the option "--show-rc" returns 0, 1 or 2... and also the PID. OK so far. But when I exec the command from a Perl script, the execution returns 512, which seems to correspond to rc = 2. To me, 512 means "file already exists", so my question is: is there a full list of return codes for more precision? (sorry for my english) Best regards and thanks -- Laurent Archambault Under Linux
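A note on the 512: in Perl, system() and $? give the raw wait status, with the child's exit code in the high byte, so borg's rc 2 arrives as 2 << 8 = 512. It is not a separate "file already exists" code. A quick check in shell arithmetic:

    echo $(( 512 >> 8 ))   # prints 2, borg's documented error rc

Shift right by 8 (or divide by 256) to recover the rc values the documentation lists.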
-------------- next part -------------- An HTML attachment was scrubbed... URL: From nospam at kota.moe Wed Nov 16 07:57:45 2016 From: nospam at kota.moe (小太) Date: Wed, 16 Nov 2016 23:57:45 +1100 Subject: [Borgbackup] Backup over sneakernet? Message-ID: Hello Suppose I have the following situation: - 500 GB of (incompressible, unique) data to back up to an untrusted offsite location - Slow (~1 Mb/s) and data-capped upload speed Clearly uploading the backup over the internet will be too slow. The typical solution to this problem is to copy everything to a hard drive, mail it to the offsite location and have them plug it in to the server, where you can then copy off it - AKA the sneakernet. So far, this works fine with Borg - create a new repo on the hard drive, back up everything there and just mail that. The offsite location never gets to see the unencrypted data either. But now say I've generated another 100 GB of incompressible and unique data, and now want to update the remote repo with this backup - but it's still too large to upload over the internet. One possible solution would be to just update the repo on the hard drive, and once it's offsite, copy it across. But there's a potential point of failure here - if the repo on the hard drive gets corrupted in the meantime, copying it will also corrupt the offsite copy. Are there any solutions to this problem that Borg supports? -------------- next part -------------- An HTML attachment was scrubbed... URL: From public at enkore.de Wed Nov 16 09:08:42 2016 From: public at enkore.de (Marian Beermann) Date: Wed, 16 Nov 2016 15:08:42 +0100 Subject: [Borgbackup] Backup over sneakernet? In-Reply-To: References: Message-ID: <3050935a-5166-caf8-ca96-0241646343d0@enkore.de> Hi 小太, a viable stopgap would be to enable append-only mode and use some synchronization tool on both ends that only copies new segments to / from the disk. Then a "borg check" can be used after the process to verify that the repository arrived intact. Note that, generally speaking, even in non-append-only mode Borg only creates or deletes files in data/, never modifies them. There is still a corruption vector in that if a file in data/ is deleted, some of its data can be copied to a new file. If that new file were to become corrupted, it could damage an existing archive. Replicating archives across multiple repositories with full cryptographic integrity has been requested a few times but is not yet implemented. In your case it wouldn't really work, though (perhaps if the disk always carries a full repository?). Cheers, Marian
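A sketch of that stopgap (paths hypothetical; append-only can be enabled by adding append_only = 1 to the repository's config file, or by running the server side with "borg serve --append-only"):

    # copy only segment files that don't exist yet on the target, then verify
    rsync -av --ignore-existing /mnt/courier-disk/repo/ /srv/offsite/repo/
    borg check /srv/offsite/repo

Since an append-only repository never rewrites existing segment files, a copy restricted to new files should not drag corruption of old data along with it, and the final check catches anything damaged in transit.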
On 16.11.2016 13:57, 小太 wrote: > Hello > > Suppose I have the following situation: > - 500 GB of (incompressible, unique) data to back up to an untrusted offsite > location > - Slow (~1 Mb/s) and data-capped upload speed > > Clearly uploading the backup over the internet will be too slow. > The typical solution to this problem is to copy everything to a hard > drive, mail it to the offsite location and have them plug it in to the > server, where you can then copy off it - AKA the sneakernet. > > So far, this works fine with Borg - create a new repo on the hard drive, > back up everything there and just mail that. The offsite location never > gets to see the unencrypted data either. > > But now say I've generated another 100 GB of incompressible and unique > data, and now want to update the remote repo with this backup - but it's > still too large to upload over the internet. > > One possible solution would be to just update the repo on the hard > drive, and once it's offsite, copy it across. But there's a potential > point of failure here - if the repo on the hard drive gets corrupted in > the meantime, copying it will also corrupt the offsite copy. > > Are there any solutions to this problem that Borg supports? From tw at waldmann-edv.de Wed Nov 23 17:49:21 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 23 Nov 2016 23:49:21 +0100 Subject: [Borgbackup] do not run borg check --repair on old attic archives Message-ID: <560c54e6-0b1c-d29d-8cd3-ae2e306e9f7d@waldmann-edv.de> PSA: do not run borg check --repair on repos that have archives made with attic <= 0.13. See: https://github.com/borgbackup/borg/issues/1837 for more details. The issue will be fixed in borg 1.0.9. -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From tw at waldmann-edv.de Sun Nov 27 00:27:17 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sun, 27 Nov 2016 06:27:17 +0100 Subject: [Borgbackup] borgbackup 1.0.9rc1 Message-ID: Released borgbackup 1.0.9rc1 right now. https://github.com/borgbackup/borg/releases/tag/1.0.9rc1 https://github.com/borgbackup/borg/blob/1.0.9rc1/docs/changes.rst It would be helpful if you test this in practice, so anything not discovered by unit tests can be fixed. The final 1.0.9 release is scheduled for December, so be quick. -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From dpierceprice at gmail.com Sun Nov 27 07:27:57 2016 From: dpierceprice at gmail.com (Douglas Pierce-Price) Date: Sun, 27 Nov 2016 13:27:57 +0100 Subject: [Borgbackup] BorgBackup Mac compatibility, e.g. for Photos library Message-ID: Hello I'm thinking of using BorgBackup on a Mac. Is there anything I need to be careful of in terms of compatibility with the Mac file system, or will it just work OK? (e.g. metadata, resource forks, whatever) Probably the most important thing I want to back up is the Photos library. Will BorgBackup handle this correctly? Does the fact that this is a Package make any difference? Or, would I run into problems if the Photos application is running and the Photos library is therefore open when I run BorgBackup? Would I need to close the library before running? Many thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From public at enkore.de Sun Nov 27 08:29:30 2016 From: public at enkore.de (Marian Beermann) Date: Sun, 27 Nov 2016 14:29:30 +0100 Subject: [Borgbackup] BorgBackup Mac compatibility, e.g. for Photos library In-Reply-To: References: Message-ID: <4d1ceffd-6f32-a66c-86b4-f9d031155cc3@enkore.de> Hi Douglas, (reply inline) On 27.11.2016 13:27, Douglas Pierce-Price wrote: > Hello > > I'm thinking of using BorgBackup on a Mac. Is there anything I need to > be careful of in terms of compatibility with the Mac file system, or > will it just work OK? (e.g. metadata, resource forks, whatever) ACLs, resource forks and flags are supported on OSX (cf. http://borgbackup.readthedocs.io/en/latest/installation.html#features-platforms ). AFAIK these are all the filesystem specialities OSX has. > Probably the most important thing I want to back up is the Photos > library. Will BorgBackup handle this correctly? Probably. I don't know much about OSX; if all data the application needs are in the backed up repositories, then there shouldn't be a problem. > Does the fact that this is a Package make any difference? > > Or, would I run into problems if the Photos application is running and > the Photos library is therefore open when I run BorgBackup?
Would I need > to close the library before running? It probably has some kind of embedded database (perhaps sqlite or something similar?), so it likely needs to be closed / not actively used during the backup. As with any backup, it is a good idea to try a *real* restore and see if everything works out as expected. > Many thanks! > Cheers, Marian From public at enkore.de Sun Nov 27 08:30:20 2016 From: public at enkore.de (Marian Beermann) Date: Sun, 27 Nov 2016 14:30:20 +0100 Subject: [Borgbackup] BorgBackup Mac compatibility, e.g. for Photos library In-Reply-To: <4d1ceffd-6f32-a66c-86b4-f9d031155cc3@enkore.de> References: <4d1ceffd-6f32-a66c-86b4-f9d031155cc3@enkore.de> Message-ID: <38c233da-3ea0-d9bf-25ff-33a39d6d70c6@enkore.de> On 27.11.2016 14:29, Marian Beermann wrote: > Probably. I don't know much about OSX; if all data the application needs > are in the backed up repositories, then there shouldn't be a problem. ... are in the backed up directories (not repositories) From tkpapp at gmail.com Sun Nov 27 09:41:19 2016 From: tkpapp at gmail.com (Tamas Papp) Date: Sun, 27 Nov 2016 15:41:19 +0100 Subject: [Borgbackup] script for desktop notification Message-ID: <87mvglasts.fsf@gmail.com> Hi, I am a new borgbackup user. First, thanks for this fantastic tool! I have a question: on a laptop, I would like to initiate backups manually, so that I can decide whether I am on a fast connection that has no data charges. However, I don't want to forget my daily backup either. So I would like to write a small script that checks for today's backup and keeps nagging me (I put it in crontab). Here is my first attempt:

--8<---------------cut here---------------start------------->8---
#!/bin/bash
set -e # exit when no internet

REPOSITORY=[...my repo...]

LIST=$(borgbackup list "$REPOSITORY")
DATE=$(date +%Y-%m-%d)

case $LIST in
    *"$DATE"* )
        # found today's backup, OK
        ;;
    * )
        notify-send "no backup today - run borg-backup"
        ;;
esac
--8<---------------cut here---------------end--------------->8---

I am wondering if there is anything more idiomatic I could use though. Best, Tamas
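One possible tightening, keeping the same logic and the repo placeholder; like the original, this assumes the archive names contain the ISO date:

    #!/bin/bash
    set -e # exit when no internet
    REPOSITORY=[...my repo...]
    # set -e still aborts here if the repo is unreachable
    LIST=$(borgbackup list "$REPOSITORY")
    echo "$LIST" | grep -q "$(date +%Y-%m-%d)" \
        || notify-send "no backup today - run borg-backup"

grep -q replaces the case statement and exits quietly either way.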
From alainm at bonseletrons.com.br Wed Nov 30 10:54:50 2016 From: alainm at bonseletrons.com.br (Alain Mouette) Date: Wed, 30 Nov 2016 13:54:50 -0200 Subject: [Borgbackup] File history Message-ID: <583EF64A.4080504@bonseletrons.com.br> Hi, I am searching for a backup system and Borg is currently the most attractive, but... I read many docs, but I didn't find this: Is there a way to view all previous versions of a specific file, or any equivalent method of finding past versions of that file? Thanks, -- Alain Mouette === Projetos especiais: === From adrian.klaver at aklaver.com Wed Nov 30 11:05:29 2016 From: adrian.klaver at aklaver.com (Adrian Klaver) Date: Wed, 30 Nov 2016 08:05:29 -0800 Subject: [Borgbackup] File history In-Reply-To: <583EF64A.4080504@bonseletrons.com.br> References: <583EF64A.4080504@bonseletrons.com.br> Message-ID: <1126a865-4d37-5e5b-d9c6-103a0fcc1376@aklaver.com> On 11/30/2016 07:54 AM, Alain Mouette wrote: > Hi, I am searching for a backup system and Borg is currently the most > attractive, but... > I read many docs, but I didn't find this: > > Is there a way to view all previous versions of a specific file, or any > equivalent method of finding past versions of that file? The only thing I can think of is coming in version 1.1.0: http://borgbackup.readthedocs.io/en/1.1.0b2/usage.html#borg-diff Not sure if that meets your needs or not. > > Thanks, > -- Adrian Klaver adrian.klaver at aklaver.com From public at enkore.de Wed Nov 30 11:24:59 2016 From: public at enkore.de (Marian Beermann) Date: Wed, 30 Nov 2016 17:24:59 +0100 Subject: [Borgbackup] File history In-Reply-To: <1126a865-4d37-5e5b-d9c6-103a0fcc1376@aklaver.com> References: <583EF64A.4080504@bonseletrons.com.br> <1126a865-4d37-5e5b-d9c6-103a0fcc1376@aklaver.com> Message-ID: <455b90ec-3107-e62c-ffa8-8ae7cf1720da@enkore.de> mount in 1.1.x (currently beta) has a versions view, where you have a directory for every file with every version of that file Borg knows. Cheers, Marian On 30.11.2016 17:05, Adrian Klaver wrote: > On 11/30/2016 07:54 AM, Alain Mouette wrote: >> Hi, I am searching for a backup system and Borg is currently the most >> attractive, but... >> I read many docs, but I didn't find this: >> >> Is there a way to view all previous versions of a specific file, or any >> equivalent method of finding past versions of that file? > > The only thing I can think of is coming in version 1.1.0: > > http://borgbackup.readthedocs.io/en/1.1.0b2/usage.html#borg-diff > > Not sure if that meets your needs or not. > >> >> Thanks, >> > > From alainm at bonseletrons.com.br Wed Nov 30 11:30:45 2016 From: alainm at bonseletrons.com.br (Alain Mouette) Date: Wed, 30 Nov 2016 14:30:45 -0200 Subject: [Borgbackup] File history In-Reply-To: <455b90ec-3107-e62c-ffa8-8ae7cf1720da@enkore.de> References: <583EF64A.4080504@bonseletrons.com.br> <1126a865-4d37-5e5b-d9c6-103a0fcc1376@aklaver.com> <455b90ec-3107-e62c-ffa8-8ae7cf1720da@enkore.de> Message-ID: <583EFEB5.7070300@bonseletrons.com.br> Yes, Marian Beermann, that seems to fit perfectly with what I want/need. Is there already any doc for it? If so, could you point me to it, please? thanks, Alain Mouette === Projetos especiais: === On 30-11-2016 14:24, Marian Beermann wrote: > mount in 1.1.x (currently beta) has a versions view, where you have a > directory for every file with every version of that file Borg knows. > > Cheers, Marian > > On 30.11.2016 17:05, Adrian Klaver wrote: >> On 11/30/2016 07:54 AM, Alain Mouette wrote: >>> Hi, I am searching for a backup system and Borg is currently the most >>> attractive, but... >>> I read many docs, but I didn't find this: >>> >>> Is there a way to view all previous versions of a specific file, or any >>> equivalent method of finding past versions of that file? >> The only thing I can think of is coming in version 1.1.0: >> >> http://borgbackup.readthedocs.io/en/1.1.0b2/usage.html#borg-diff >> >> Not sure if that meets your needs or not. >> >>> Thanks, >>> >> > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From tkpapp at gmail.com Wed Nov 30 11:37:01 2016 From: tkpapp at gmail.com (Tamas Papp) Date: Wed, 30 Nov 2016 17:37:01 +0100 Subject: [Borgbackup] exclusion patterns question Message-ID: <87r35shqky.fsf@gmail.com> Hi, I need a bit of help with patterns. 1. How do I escape spaces? pp:/home/tamas/VirtualBox\ VMs did not work. 2. Is there a way I can simplify the specification of paths in my home directory? E.g. instead of pp:/home/tamas/.cache something like pp:~/.cache or pp:$HOME/.cache This would also make the files more portable across users. 3. Finally, could someone share an exclusion patterns file for a Linux desktop that one can use to get started?
I found https://github.com/rubo77/rsync-homedir-excludes which I could modify, but if there is already one with the syntax of borgbackup, that would be nice. Best, Tamas From public at enkore.de Wed Nov 30 11:37:25 2016 From: public at enkore.de (Marian Beermann) Date: Wed, 30 Nov 2016 17:37:25 +0100 Subject: [Borgbackup] File history In-Reply-To: <583EFEB5.7070300@bonseletrons.com.br> References: <583EF64A.4080504@bonseletrons.com.br> <1126a865-4d37-5e5b-d9c6-103a0fcc1376@aklaver.com> <455b90ec-3107-e62c-ffa8-8ae7cf1720da@enkore.de> <583EFEB5.7070300@bonseletrons.com.br> Message-ID: Only a short synopsis in the docs: > Additional mount options supported by borg: > > versions: when used with a repository mount, this gives a merged, > versioned view of the files in the archives. > EXPERIMENTAL, layout may change in future. http://borgbackup.readthedocs.io/en/latest/usage.html#id31 On 30.11.2016 17:30, Alain Mouette wrote: > Yes, Marian Beermann, that seems to fit perfectly with what I want/need. > Is there already any doc for it? If so, could you point me to it, please? > > thanks, > > Alain Mouette > === Projetos especiais: === > > On 30-11-2016 14:24, Marian Beermann wrote: >> mount in 1.1.x (currently beta) has a versions view, where you have a >> directory for every file with every version of that file Borg knows. >> >> Cheers, Marian >> >> On 30.11.2016 17:05, Adrian Klaver wrote: >>> On 11/30/2016 07:54 AM, Alain Mouette wrote: >>>> Hi, I am searching for a backup system and Borg is currently the most >>>> attractive, but... >>>> I read many docs, but I didn't find this: >>>> >>>> Is there a way to view all previous versions of a specific file, or any >>>> equivalent method of finding past versions of that file? >>> The only thing I can think of is coming in version 1.1.0: >>> >>> http://borgbackup.readthedocs.io/en/1.1.0b2/usage.html#borg-diff >>> >>> Not sure if that meets your needs or not. >>> >>>> Thanks, >>>> >>> >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup > > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From tw at waldmann-edv.de Fri Dec 2 12:13:48 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Fri, 2 Dec 2016 18:13:48 +0100 Subject: [Borgbackup] exclusion patterns question In-Reply-To: <87r35shqky.fsf@gmail.com> References: <87r35shqky.fsf@gmail.com> Message-ID: Moin Tamas, > 1. How do I escape spaces? > > pp:/home/tamas/VirtualBox\ VMs If you have this inside an exclude file: do not escape them at all. Escaping or quoting is only needed on the shell / in shell scripts. > 2. Is there a way I can simplify the specification of paths in my home > directory? E.g. instead of > > pp:/home/tamas/.cache > > something like > > pp:~/.cache > > or > > pp:$HOME/.cache The patterns in an exclude file are just taken "as is": no env vars expanded, no shell expansion. Whether a pattern is expanded when used from the shell command line depends on the expansion rules of your shell; try it. > 3. Finally, could someone share an exclusion patterns file for a > Linux desktop that one can use to get started?
I have these (among others, for a full system backup):

--exclude-caches \
--exclude "$SRC_MOUNT/home/*/.thunderbird/*/ImapMail/*" \
--exclude "$SRC_MOUNT/home/*/.cache/*" \
--exclude "$SRC_MOUNT/home/*/.local/share/zeitgeist/*" \
--exclude "$SRC_MOUNT/home/*/.local/share/Trash/*" \
--exclude "$SRC_MOUNT/var/cache/*" \
--exclude "$SRC_MOUNT/var/lib/apt/lists/*" \
--exclude "$SRC_MOUNT/var/tmp/*"

Below SRC_MOUNT, I only have on-disk filesystems mounted (no /sys /proc /tmp etc.). Cheers, Thomas -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393
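The same patterns also work from an exclude file, where, as noted above, no shell escaping is needed even for spaces. A hypothetical ~/.config/borg/excludes:

    /home/*/VirtualBox VMs
    /home/*/.cache/*
    /var/cache/*
    /var/tmp/*

used as: borg create --exclude-from ~/.config/borg/excludes ...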
From dpierceprice at gmail.com Thu Dec 8 11:29:25 2016 From: dpierceprice at gmail.com (Douglas Pierce-Price) Date: Thu, 8 Dec 2016 17:29:25 +0100 Subject: [Borgbackup] BorgBackup Mac compatibility, e.g. for Photos library In-Reply-To: <38c233da-3ea0-d9bf-25ff-33a39d6d70c6@enkore.de> References: <4d1ceffd-6f32-a66c-86b4-f9d031155cc3@enkore.de> <38c233da-3ea0-d9bf-25ff-33a39d6d70c6@enkore.de> Message-ID: Thank you! I'll give it a try... On Sun, Nov 27, 2016 at 2:30 PM, Marian Beermann wrote: > On 27.11.2016 14:29, Marian Beermann wrote: > > Probably. I don't know much about OSX; if all data the application needs > > are in the backed up repositories, then there shouldn't be a problem. > > ... are in the backed up directories > (not repositories) > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > -------------- next part -------------- An HTML attachment was scrubbed... URL: From billk at iinet.net.au Fri Dec 9 23:10:50 2016 From: billk at iinet.net.au (Bill Kenworthy) Date: Sat, 10 Dec 2016 12:10:50 +0800 Subject: [Borgbackup] turn off warning? Message-ID: I have borgbackup in use in a number of scripts. The stable version (1.09 on gentoo) and earlier spit out a warning: Please upgrade to borg version 1.1+ on the server for safer AES-CTR nonce handling. Is there a way to turn it off? I did try a git version which works fine, except that it can't be used to back up a system with a .gvfs (all my GUI systems). BillK From billk at iinet.net.au Sat Dec 10 00:25:08 2016 From: billk at iinet.net.au (Bill Kenworthy) Date: Sat, 10 Dec 2016 13:25:08 +0800 Subject: [Borgbackup] turn off warning? In-Reply-To: References: Message-ID: <8cad9735-d3f1-f5b6-9f41-203c7efefe88@iinet.net.au> On 10/12/16 12:10, Bill Kenworthy wrote: > I have borgbackup in use in a number of scripts. The stable version > (1.09 on gentoo) and earlier spit out a warning: > > Please upgrade to borg version 1.1+ on the server for safer AES-CTR > nonce handling. > > Is there a way to turn it off? > > I did try a git version which works fine, except that it can't be used to > back up a system with a .gvfs (all my GUI systems). > > BillK > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > Please ignore - I realised my git instances were a couple of weeks old, and the latest just gives a brief unreadable-file message without the messy crash/exit. BillK From public at enkore.de Sat Dec 10 03:51:42 2016 From: public at enkore.de (Marian Beermann) Date: Sat, 10 Dec 2016 09:51:42 +0100 Subject: [Borgbackup] turn off warning? In-Reply-To: References: Message-ID: That warning isn't included in any 1.0.x client/server. The client that gives you this message must be from git / from 1.1.x (beta). It'd be great if you have details on that .gvfs crash / error; it sounds like a regression of some sort. Cheers, Marian On 10.12.2016 05:10, Bill Kenworthy wrote: > I have borgbackup in use in a number of scripts. The stable version > (1.09 on gentoo) and earlier spit out a warning: > > Please upgrade to borg version 1.1+ on the server for safer AES-CTR > nonce handling. > > Is there a way to turn it off? > > I did try a git version which works fine, except that it can't be used to > back up a system with a .gvfs (all my GUI systems). > > BillK From billk at iinet.net.au Sat Dec 10 22:00:39 2016 From: billk at iinet.net.au (Bill Kenworthy) Date: Sun, 11 Dec 2016 11:00:39 +0800 Subject: [Borgbackup] turn off warning? In-Reply-To: References: Message-ID: <7a177377-090d-dedb-73b6-5719614d882c@iinet.net.au> On 10/12/16 16:51, Marian Beermann wrote: > That warning isn't included in any 1.0.x client/server. The client that > gives you this message must be from git / from 1.1.x (beta). > > It'd be great if you have details on that .gvfs crash / error; it sounds > like a regression of some sort. > > Cheers, Marian > > On 10.12.2016 05:10, Bill Kenworthy wrote: >> I have borgbackup in use in a number of scripts. The stable version >> (1.09 on gentoo) and earlier spit out a warning: >> >> Please upgrade to borg version 1.1+ on the server for safer AES-CTR >> nonce handling. >> >> Is there a way to turn it off? >> >> I did try a git version which works fine, except that it can't be used to >> back up a system with a .gvfs (all my GUI systems). >> >> BillK > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > It's gone in current git - I just get: /home/config/home/wdk: [Errno 13] Permission denied: '/home/config/home/wdk/.gvfs' On gentoo, .gvfs is readable by the user only - root does not get access. This is what caused the crash - it would hang for a few seconds then exit, sometimes requiring a "borg check --repair" on the backup. The current version just prints the above and continues merrily on, as it should. BillK From jgoerzen at complete.org Thu Dec 15 13:38:21 2016 From: jgoerzen at complete.org (John Goerzen) Date: Thu, 15 Dec 2016 12:38:21 -0600 Subject: [Borgbackup] "borg check" without reading every byte Message-ID: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> Hi folks, I have a question about borg check. I'm anticipating storing a backup on a remote host to which I do not have the ability to install borg (think sshfs or webdav or so). From the manpage description, it looks as if borg check will read every bit of data in the repo at least twice. Is there a way for it to check the consistency of the metadata trees without checking the CRC of every segment or reading every bit of file data? (I am pondering a move from obnam, which does have this feature) Thanks, John From public at enkore.de Thu Dec 15 14:06:20 2016 From: public at enkore.de (Marian Beermann) Date: Thu, 15 Dec 2016 20:06:20 +0100 Subject: [Borgbackup] "borg check" without reading every byte In-Reply-To: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> References: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> Message-ID: <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> Hi John from the check help page: --repository-only only perform repository checks --archives-only only perform archives checks Cheers, Marian On 15.12.2016 19:38, John Goerzen wrote: > Hi folks, > > I have a question about borg check.
I'm anticipating storing a backup > on a remote host to which I do not have the ability to install borg > (think sshfs or webdav or so). From the manpage description, it looks > as if borg check will read every bit of data in the repo at least > twice. Is there a way for it to check the consistency of the metadata > trees without checking the CRC of every segment or reading every bit of > file data? > > (I am pondering a move from obnam, which does have this feature) > > Thanks, > > John > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From jgoerzen at complete.org Thu Dec 15 14:35:58 2016 From: jgoerzen at complete.org (John Goerzen) Date: Thu, 15 Dec 2016 13:35:58 -0600 Subject: [Borgbackup] "borg check" without reading every byte In-Reply-To: <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> References: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> Message-ID: Yes, I read that, but the documentation also says this: For the repository check, "For all objects stored in the segments, all metadata... and all data is read." Although now that I reread the description of the archive check, perhaps there is a case where it does not have to reread all data. I'll see if I can validate this. Thanks, John On 12/15/2016 01:06 PM, Marian Beermann wrote: > Hi John > > from the check help page: > > --repository-only only perform repository checks > --archives-only only perform archives checks > > Cheers, Marian > > On 15.12.2016 19:38, John Goerzen wrote: >> Hi folks, >> >> I have a question about borg check. I'm anticipating storing a backup >> on a remote host to which I do not have the ability to install borg >> (think sshfs or webdav or so). From the manpage description, it looks >> as if borg check will read every bit of data in the repo at least >> twice. Is there a way for it to check the consistency of the metadata >> trees without checking the CRC of every segment or reading every bit of >> file data? >> >> (I am pondering a move from obnam, which does have this feature) >> >> Thanks, >> >> John >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From public at enkore.de Thu Dec 15 14:54:13 2016 From: public at enkore.de (Marian Beermann) Date: Thu, 15 Dec 2016 20:54:13 +0100 Subject: [Borgbackup] "borg check" without reading every byte In-Reply-To: References: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> Message-ID: <7a5bad05-da0d-7a17-1147-8a06784cabd7@enkore.de> On 15.12.2016 20:35, John Goerzen wrote: > Yes, I read that, but the documentation also says this: > > For the repository check, "For all objects stored in the segments, all > metadata... and all data is read." > > Although now that I reread the description of the archive check, perhaps > there is a case where it does not have to reread all data. I'll see if > I can validate this. Archives check only works with metadata. 
> Second, the consistency and correctness of the > archive metadata is verified: Cheers, Marian > Thanks, > > John > > > On 12/15/2016 01:06 PM, Marian Beermann wrote: >> Hi John >> >> from the check help page: >> >> --repository-only only perform repository checks >> --archives-only only perform archives checks >> >> Cheers, Marian >> >> On 15.12.2016 19:38, John Goerzen wrote: >>> Hi folks, >>> >>> I have a question about borg check. I'm anticipating storing a backup >>> on a remote host to which I do not have the ability to install borg >>> (think sshfs or webdav or so). From the manpage description, it looks >>> as if borg check will read every bit of data in the repo at least >>> twice. Is there a way for it to check the consistency of the metadata >>> trees without checking the CRC of every segment or reading every bit of >>> file data? >>> >>> (I am pondering a move from obnam, which does have this feature) >>> >>> Thanks, >>> >>> John >>> _______________________________________________ >>> Borgbackup mailing list >>> Borgbackup at python.org >>> https://mail.python.org/mailman/listinfo/borgbackup >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup > From jgoerzen at complete.org Sat Dec 17 15:08:22 2016 From: jgoerzen at complete.org (John Goerzen) Date: Sat, 17 Dec 2016 14:08:22 -0600 Subject: [Borgbackup] "borg check" without reading every byte In-Reply-To: <7a5bad05-da0d-7a17-1147-8a06784cabd7@enkore.de> References: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> <7a5bad05-da0d-7a17-1147-8a06784cabd7@enkore.de> Message-ID: <32941c3e-4972-2bd6-7756-44b0c1342699@complete.org> So I did an experiment: rm data/21/219999 borg check -v --archives-only This did not flag any errors. Now it's possible that the segment I deleted was not holding metadata, and therefore never visited. However, I find it odd that it didn't at least notice it was missing. It would be nice to check that without reading every single bit of data to recheck CRCs, I guess. John On 12/15/2016 01:54 PM, Marian Beermann wrote: > On 15.12.2016 20:35, John Goerzen wrote: >> Yes, I read that, but the documentation also says this: >> >> For the repository check, "For all objects stored in the segments, all >> metadata... and all data is read." >> >> Although now that I reread the description of the archive check, perhaps >> there is a case where it does not have to reread all data. I'll see if >> I can validate this. > Archives check only works with metadata. > >> Second, the consistency and correctness of the >> archive metadata is verified: > Cheers, Marian > >> Thanks, >> >> John >> >> >> On 12/15/2016 01:06 PM, Marian Beermann wrote: >>> Hi John >>> >>> from the check help page: >>> >>> --repository-only only perform repository checks >>> --archives-only only perform archives checks >>> >>> Cheers, Marian >>> >>> On 15.12.2016 19:38, John Goerzen wrote: >>>> Hi folks, >>>> >>>> I have a question about borg check. I'm anticipating storing a backup >>>> on a remote host to which I do not have the ability to install borg >>>> (think sshfs or webdav or so). From the manpage description, it looks >>>> as if borg check will read every bit of data in the repo at least >>>> twice. Is there a way for it to check the consistency of the metadata >>>> trees without checking the CRC of every segment or reading every bit of >>>> file data? 
>>>> >>>> (I am pondering a move from obnam, which does have this feature) >>>> >>>> Thanks, >>>> >>>> John >>>> _______________________________________________ >>>> Borgbackup mailing list >>>> Borgbackup at python.org >>>> https://mail.python.org/mailman/listinfo/borgbackup >>> _______________________________________________ >>> Borgbackup mailing list >>> Borgbackup at python.org >>> https://mail.python.org/mailman/listinfo/borgbackup From public at enkore.de Sun Dec 18 10:18:05 2016 From: public at enkore.de (Marian Beermann) Date: Sun, 18 Dec 2016 16:18:05 +0100 Subject: [Borgbackup] "borg check" without reading every byte In-Reply-To: <32941c3e-4972-2bd6-7756-44b0c1342699@complete.org> References: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> <7a5bad05-da0d-7a17-1147-8a06784cabd7@enkore.de> <32941c3e-4972-2bd6-7756-44b0c1342699@complete.org> Message-ID: On 17.12.2016 21:08, John Goerzen wrote: > So I did an experiment: > > rm data/21/219999 > borg check -v --archives-only > > This did not flag any errors. Now it's possible that the segment I > deleted was not holding metadata, and therefore never visited. However, > I find it odd that it didn't at least notice it was missing. It would > be nice to check that without reading every single bit of data to > recheck CRCs, I guess. That's kinda expected, since the repository check and the archives check work on different layers... one could implement a 'shallow' repository check like you say, but it begs the question whether that's really a useful thing to have; corruption of the data itself is more likely and the transaction logic ensures (on typical hardware, anyway) that the trailing end of the repository is intact (this is checked on every access of the repository). Cheers, Marian > John > > On 12/15/2016 01:54 PM, Marian Beermann wrote: >> On 15.12.2016 20:35, John Goerzen wrote: >>> Yes, I read that, but the documentation also says this: >>> >>> For the repository check, "For all objects stored in the segments, all >>> metadata... and all data is read." >>> >>> Although now that I reread the description of the archive check, perhaps >>> there is a case where it does not have to reread all data. I'll see if >>> I can validate this. >> Archives check only works with metadata. >> >>> Second, the consistency and correctness of the >>> archive metadata is verified: >> Cheers, Marian >> >>> Thanks, >>> >>> John >>> >>> >>> On 12/15/2016 01:06 PM, Marian Beermann wrote: >>>> Hi John >>>> >>>> from the check help page: >>>> >>>> --repository-only only perform repository checks >>>> --archives-only only perform archives checks >>>> >>>> Cheers, Marian >>>> >>>> On 15.12.2016 19:38, John Goerzen wrote: >>>>> Hi folks, >>>>> >>>>> I have a question about borg check. I'm anticipating storing a backup >>>>> on a remote host to which I do not have the ability to install borg >>>>> (think sshfs or webdav or so). From the manpage description, it looks >>>>> as if borg check will read every bit of data in the repo at least >>>>> twice. Is there a way for it to check the consistency of the metadata >>>>> trees without checking the CRC of every segment or reading every bit of >>>>> file data? 
>>>>> >>>>> (I am pondering a move from obnam, which does have this feature) >>>>> >>>>> Thanks, >>>>> >>>>> John >>>>> _______________________________________________ >>>>> Borgbackup mailing list >>>>> Borgbackup at python.org >>>>> https://mail.python.org/mailman/listinfo/borgbackup >>>> _______________________________________________ >>>> Borgbackup mailing list >>>> Borgbackup at python.org >>>> https://mail.python.org/mailman/listinfo/borgbackup > From tw at waldmann-edv.de Mon Dec 19 23:10:02 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Tue, 20 Dec 2016 05:10:02 +0100 Subject: [Borgbackup] borgbackup 1.0.9 released Message-ID: https://github.com/borgbackup/borg/releases/tag/1.0.9 Security and Bug fixes, please upgrade. More details: see URL above. Cheers, Thomas -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From sitaramc at gmail.com Tue Dec 20 07:29:22 2016 From: sitaramc at gmail.com (Sitaram Chamarty) Date: Tue, 20 Dec 2016 17:59:22 +0530 Subject: [Borgbackup] borgbackup 1.0.9 released In-Reply-To: References: Message-ID: <20161220122922.GA16294@sita-lt.atc.tcs.com> On Tue, Dec 20, 2016 at 05:10:02AM +0100, Thomas Waldmann wrote: > https://github.com/borgbackup/borg/releases/tag/1.0.9 > > Security and Bug fixes, please upgrade. > > More details: see URL above. Hi Quick question about this instruction: borg upgrade --tam do I have to upgrade the version of borg on the server also before I run this? Or is it just the *client* that matters? (I know the instructions on that page only say "upgrade all clients" but I just want to be sure.) regards sitaram From public at enkore.de Tue Dec 20 07:48:01 2016 From: public at enkore.de (Marian Beermann) Date: Tue, 20 Dec 2016 13:48:01 +0100 Subject: [Borgbackup] borgbackup 1.0.9 released In-Reply-To: <20161220122922.GA16294@sita-lt.atc.tcs.com> References: <20161220122922.GA16294@sita-lt.atc.tcs.com> Message-ID: <78cdbd8d-7f01-18bd-74b6-89af02f1b2e6@enkore.de> Hi Sitaram, this only affects the clients, not the server. All 1.0.x releases are compatible with any 1.0.x server. Cheers, Marian On 20.12.2016 13:29, Sitaram Chamarty wrote: > On Tue, Dec 20, 2016 at 05:10:02AM +0100, Thomas Waldmann wrote: >> https://github.com/borgbackup/borg/releases/tag/1.0.9 >> >> Security and Bug fixes, please upgrade. >> >> More details: see URL above. > > Hi > > Quick question about this instruction: > > borg upgrade --tam > > do I have to upgrade the version of borg on the server also > before I run this? Or is it just the *client* that matters? > > (I know the instructions on that page only say "upgrade all > clients" but I just want to be sure.) > > regards > sitaram > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > From jgoerzen at complete.org Tue Dec 20 09:28:37 2016 From: jgoerzen at complete.org (John Goerzen) Date: Tue, 20 Dec 2016 08:28:37 -0600 Subject: [Borgbackup] Why does borg delete/prune write a bunch of new data? Message-ID: Hi folks, So I'm doing some testing of Borg. My ultimate aim is to rsync the backups to a dumb (WebDAV or S3-type) host. I made a run of borg over a real subset of my data, about 80GB worth. I then cleaned up and deleted a good chunk of data throughout that area, and made another archive with borg create. So far so good. Now I ran borg delete to remove the archive with all the extra data. Sure enough, about 2GB freed up on the disk after. 
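One rough way to watch what a delete does to the segment files (repo path and archive name are placeholders):

    du -s /path/to/repo                              # total size before
    ls /path/to/repo/data/*/* | sort > before.txt
    borg delete /path/to/repo::some-big-archive
    du -s /path/to/repo                              # total size after
    ls /path/to/repo/data/*/* | sort > after.txt
    comm -13 before.txt after.txt                    # segment files the delete wrote
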
However, watching the process with strace and examining the filesystem, I observed it wrote a considerable amount of new segments to the data directory. A little analysis with ls and du shows it wrote right around 2GB of new segments. (It also, of course, unlinked a considerable number of segments.) Having to rsync 2GB of new data every time I delete data is going to be rather sub-optimal on my poor DSL. Any ideas why it's doing this? FWIW the index file is only a few tens of MBs. I'm using encryption and lzma compression. I did double the max_segment_size from 5MB to 10MB (a lot of experience with obnam suggested this would improve the performance over the rsync situation) Thanks, John From sitaramc at gmail.com Tue Dec 20 13:30:35 2016 From: sitaramc at gmail.com (Sitaram Chamarty) Date: Wed, 21 Dec 2016 00:00:35 +0530 Subject: [Borgbackup] borgbackup 1.0.9 released In-Reply-To: <78cdbd8d-7f01-18bd-74b6-89af02f1b2e6@enkore.de> References: <20161220122922.GA16294@sita-lt.atc.tcs.com> <78cdbd8d-7f01-18bd-74b6-89af02f1b2e6@enkore.de> Message-ID: <20161220183035.GA19822@sita-lt.atc.tcs.com> On Tue, Dec 20, 2016 at 01:48:01PM +0100, Marian Beermann wrote: > Hi Sitaram, > > this only affects the clients, not the server. > > All 1.0.x releases are compatible with any 1.0.x server. thanks! One of my servers is only accessible via borg until next month so I needed to be clear. regards sitaram > > Cheers, Marian > > On 20.12.2016 13:29, Sitaram Chamarty wrote: > > On Tue, Dec 20, 2016 at 05:10:02AM +0100, Thomas Waldmann wrote: > >> https://github.com/borgbackup/borg/releases/tag/1.0.9 > >> > >> Security and Bug fixes, please upgrade. > >> > >> More details: see URL above. > > > > Hi > > > > Quick question about this instruction: > > > > borg upgrade --tam > > > > do I have to upgrade the version of borg on the server also > > before I run this? Or is it just the *client* that matters? > > > > (I know the instructions on that page only say "upgrade all > > clients" but I just want to be sure.) > > > > regards > > sitaram > > _______________________________________________ > > Borgbackup mailing list > > Borgbackup at python.org > > https://mail.python.org/mailman/listinfo/borgbackup > > > From jgoerzen at complete.org Tue Dec 20 18:09:48 2016 From: jgoerzen at complete.org (John Goerzen) Date: Tue, 20 Dec 2016 17:09:48 -0600 Subject: [Borgbackup] Why does borg delete/prune write a bunch of new data? In-Reply-To: References: Message-ID: I've done some digging into this, and it seems the reason is compact_segments() in repository.py. It both deletes the segments that are completely unused, and also (if I'm understanding correctly), takes segments containing some objects that are unused and some objects that are still used and writes new segments containing only the used objects. The end result is some space savings, at the cost of a lot of I/O. I wonder how hard it would be to support deleting unused segments without bothering to rewrite segments that are partially used? thanks, John On 12/20/2016 08:28 AM, John Goerzen wrote: > Hi folks, > > So I'm doing some testing of Borg. My ultimate aim is to rsync the > backups to a dumb (WebDAV or S3-type) host. > > I made a run of borg over a real subset of my data, about 80GB worth. > I then cleaned up and deleted a good chunk of data throughout that > area, and made another archive with borg create. > > So far so good. Now I ran borg delete to remove the archive with all > the extra data. Sure enough, about 2GB freed up on the disk after. 
> > However, watching the process with strace and examining the > filesystem, I observed it wrote a considerable amount of new segments > to the data directory. A little analysis with ls and du shows it > wrote right around 2GB of new segments. (It also, of course, unlinked > a considerable number of segments.) > > Having to rsync 2GB of new data every time I delete data is going to > be rather sub-optimal on my poor DSL. Any ideas why it's doing this? > FWIW the index file is only a few tens of MBs. > > I'm using encryption and lzma compression. I did double the > max_segment_size from 5MB to 10MB (a lot of experience with obnam > suggested this would improve the performance over the rsync situation) > > Thanks, > > John > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From mario at emmenlauer.de Wed Dec 21 06:55:05 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Wed, 21 Dec 2016 12:55:05 +0100 Subject: [Borgbackup] faster / better deletion, for a bounty? Message-ID: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> Dear Borg developers, thanks for the awesome tool! I've been using borgbackup now for more than 6 months and created 533 backups successfully! But now the disk is ~98% full and suddenly I have some troubles :-) I've received great feedback already in IRC but some questions still trouble me: (1) My archive is now 3.4 TB (reported with 'du'), but borg list says the deduplicated archive size is 1.82 TB. Why are the two numbers off by 50%? Below the full output of my borg list. (2) In the last months, my backup size went up quite a lot, even though I did not change anything in borg. So I'd like to reverse engineer which archives (or which files) contribute to the sudden increase in size. I tried "borg list" on all archives, but only 7 have ~3 GB of deduplicated space, and all others have less than 1 GB of dedup space! I assumed 533 archives of ~1 GB dedup size = 533 GB total, but my math must be quite wrong? I saw the documentation of "borg list" but it does not help me understand :-( How would I find the archives that free most space when deleted? (3) borg delete was incredibly slow for me. I killed it after two hours, and it had read 500GB of the archive by then (reported with iotop). I understood from IRC discussion that both prune and delete would require reading the full 3.4 TB once per run, to sanitize some index? That would break borg usage for me, since this will very much wear the disk and also takes ~8hrs on my encrypted drive! Am I doing something wrong? Are there tricks or workarounds, for example when deleting only from localhost? I'd like to offer a bounty of ~€20-€25 for a better solution, or a generally much faster delete and/or much faster prune. If possible I'd rather not have borg read the full 3.4TB archive! PS: My preferred deletion pattern would keep an increasing number of archives over time, like monthly backups from the past 10 years, weekly from the past year, and daily from the past month. I can build this list of deletions with bash easily!
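For instance, something along these lines (assuming archives are named like host-YYYY-MM-DD; all names and paths are made up):

    borg list --short /path/to/repo > all.txt
    cutoff=$(date -d '1 year ago' +%Y-%m-%d)
    # past the cutoff, mark every other weekly backup for deletion
    awk -F- -v c="$cutoff" '{ d = $2 "-" $3 "-" $4;
        if (d < c && ++n % 2 == 0) print }' all.txt > remove.txt
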
But borg delete or prune are currently *way* too slow to be used this way :-(

#> borg list archive::somebackup
Number of files: 1796064
                 Original size   Compressed size   Deduplicated size
This archive:         95.27 GB          70.53 GB           178.00 MB
All archives:         78.26 TB          65.13 TB             1.82 TB
                 Unique chunks      Total chunks
Chunk index:           9733154         414693364

Thanks a lot and all the best, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/ From tw at waldmann-edv.de Wed Dec 21 08:31:06 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 21 Dec 2016 14:31:06 +0100 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> Message-ID: <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> Hi Mario, > But now the disk is ~98% full Avoid that it fills up completely, borg needs free space, even for delete. > (1) My archive is now 3.4 TB (reported with 'du'), but borg list says > the deduplicated archive size is 1.82 TB. Why are the two numbers > off by 50%? Below the full output of my borg list. Did you activate append-only mode for the repo? While append-only is set, borg prune/delete will not be able to really remove data. > (2) In the last months, my backup size went up quite a lot, even though > I did not change anything in borg. So I'd like to reverse engineer > which archives (or which files) contribute to the sudden increase in > size. I tried "borg list" on all archives, but only 7 have ~3 GB of > deduplicated space, and all others have less than 1 GB of dedup space! > I assumed 533 archives of ~1 GB dedup size = 533 GB total, No, that is only the sum of the space ONLY used by a single archive. As soon as the same chunks are used by more than 1 archive, it does not show up as "unique chunks" any more. > How would I find the archives that free most space when deleted? For a single archive deletion, that is the unique chunks space ("deduplicated size") of that archive. For multiple archive deletion there is no easy way to see beforehand. > (3) borg delete was incredibly slow for me. I killed it after two hours, > and it had read 500GB of the archive by then (reported with iotop). > I understood from IRC discussion that both prune and delete would > require reading the full 3.4 TB once per run, to sanitize some index? No, they usually do not need to read all your data. The worst case might be that, though. > Are there tricks or workarounds, for example when > deleting only from localhost? If you use borg with encryption (default), you'd need to use the encryption key on the repo machine. It depends on how much you trust that machine whether you want to do that or not. > I'd like to offer a bounty of ~€20-€25 for a better solution, or a > generally much faster delete and/or much faster prune. Some improvements will come with borg 1.1 (which is currently still in beta, so be very careful). > PS: My preferred deletion pattern would keep an increasing number of > archives over time, like monthly backups from the past 10 years, > weekly from the past year, and daily from past month. That's how borg prune works. > I can build > this list of deletions with bash easily! But borg delete or prune > are currently *way* to slow to be used this way :-( borg prune (when it deletes more than 1 archive per run) is faster than borg delete. It uses delete internally, but doing multiple deletes at once is a bit more efficient.
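For a schedule close to the one described above, a single run could look like this (repo path is a placeholder):

    borg prune -v --keep-daily 30 --keep-weekly 52 --keep-monthly 120 /path/to/repo
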
Cheers, Thomas -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From tw at waldmann-edv.de Wed Dec 21 08:57:45 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 21 Dec 2016 14:57:45 +0100 Subject: [Borgbackup] bounty advice In-Reply-To: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> Message-ID: <9340d4ab-98b0-324f-19f6-969b8121c53c@waldmann-edv.de> > I'd like to offer a bounty of ~€20-€25 for a better solution, ... Some words about bounties: They are very welcome and give additional motivation for a task. Especially useful if the task itself isn't that interesting, but somehow needs to be done. Or to push some issue to better visibility (bountysource label) and attract more attention to it. Bounties for clear and small tasks work better than for complex or unclear goals. If the task is very complex or very unclear or even impossible, it will take a long time or will never get done. Bounties for work that totally lacks the fundamentals in the current codebase are also going to take rather long, possibly forever. -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From jgoerzen at complete.org Wed Dec 21 08:58:51 2016 From: jgoerzen at complete.org (John Goerzen) Date: Wed, 21 Dec 2016 07:58:51 -0600 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> Message-ID: <6f48fa2a-706b-1fd3-ff28-84abebe964cc@complete.org> On 12/21/2016 07:31 AM, Thomas Waldmann wrote: > >> (3) borg delete was incredibly slow for me. I killed it after two hours, >> and it had read 500GB of the archive by then (reported with iotop). >> I understood from IRC discussion that both prune and delete would >> require reading the full 3.4 TB once per run, to sanitize some index? > No, they usually do not need to read all your data. > > The worst case might be that, though. Hi Thomas, Can you elaborate on this? It may potentially be a pretty big problem for me storing backups on remote dumb storage. BTW, thanks to you and everyone for all the work you've done on Borgbackup. It has made incredible progress since I last looked at Attic (which was right around the time of the Borg fork.) I'm seriously evaluating a switch from Obnam. John From tw at waldmann-edv.de Wed Dec 21 09:16:27 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 21 Dec 2016 15:16:27 +0100 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: <6f48fa2a-706b-1fd3-ff28-84abebe964cc@complete.org> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> <6f48fa2a-706b-1fd3-ff28-84abebe964cc@complete.org> Message-ID: >>> (3) borg delete was incredibly slow for me. I killed it after two hours, >>> and it had read 500GB of the archive by then (reported with iotop). >>> I understood from IRC discussion that both prune and delete would >>> require reading the full 3.4 TB once per run, to sanitize some index? >> No, they usually do not need to read all your data. >> >> The worst case might be that, though. > > Can you elaborate on this? Well, I think it is very rare / synthetic, but one could imagine an archive referencing one chunk in every segment file.
If at time of deletion of that archive these references are the last references to these chunks, the chunks get unused. borg 1.0 behaviour is to compact segments to free up space (unconditionally, iirc). So, if there is a unused chunk in every segment (unused, but allocated disk space), it would read all the still used chunks from these segments and create new, compact segments from them. borg 1.1 introduces a threshold, so it won't compact segments if there is only little gain. > BTW, thanks to you and everyone for all the work you've done on > Borgbackup. It has made incredible progress since I last looked at > Attic (which was right around the time of the Borg fork.) I'm seriously > evaluating a switch from Obnam. Great. :) Your blog post from 2015 always shows in top search results when searching for a comparison of obnam, attic (borg), ... Would be great to have a refresh of that at some time. -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From jgoerzen at complete.org Wed Dec 21 09:36:40 2016 From: jgoerzen at complete.org (John Goerzen) Date: Wed, 21 Dec 2016 08:36:40 -0600 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> <6f48fa2a-706b-1fd3-ff28-84abebe964cc@complete.org> Message-ID: On 12/21/2016 08:16 AM, Thomas Waldmann wrote: > > Well, I think it is very rare / synthetic, but one could imagine an > archive referencing one chunk in every segment file. > > If at time of deletion of that archive these references are the last > references to these chunks, the chunks get unused. > > borg 1.0 behaviour is to compact segments to free up space > (unconditionally, iirc). > > So, if there is a unused chunk in every segment (unused, but allocated > disk space), it would read all the still used chunks from these segments > and create new, compact segments from them. > > borg 1.1 introduces a threshold, so it won't compact segments if there > is only little gain. Ah ha. So that would address the issue I raised in my other email as well. Great! I think I found this code in repository.py:494, hard-coded at rewriting the segment if it frees at least 15% of it. Could that threshold be made configurable? For my use case, I would probably set it to 80% or even 90%. Also, a question on decrementing the segment/chunk reference counts. Is that information kept solely in the cache, or is it also in the repo somewhere? If the latter, where does it get written? >> BTW, thanks to you and everyone for all the work you've done on >> Borgbackup. It has made incredible progress since I last looked at >> Attic (which was right around the time of the Borg fork.) I'm seriously >> evaluating a switch from Obnam. > Great. :) Your blog post from 2015 always shows in top search results > when searching for a comparison of obnam, attic (borg), ... > > Would be great to have a refresh of that at some time. Already planning on it, yes! The thing that prompted the re-eval was the discovery that Obnam somehow wound up with a lot of missing chunks in my repo, and that obnam fsck detects but does not correct this issue. There was apparently a bug in obnam forget that may have led to this awhile back (since fixed). Due to some annoying interactions between davfs2 and the webdav server, it occasionally returned EPERM on operations that would have been permitted with a retry. 
This caused it to crash in the middle of a forget, with annoying consequences. The new default chunker params are the big thing that addresses the issue I had with Attic. Storing multiple chunks in segments should also lead to a big performance benefit compared to Obnam. John From mario at emmenlauer.de Wed Dec 21 17:25:30 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Wed, 21 Dec 2016 23:25:30 +0100 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> Message-ID: <4397537a-b0eb-09d1-a154-76cff2ccf792@emmenlauer.de> Hi Thomas, On 21.12.2016 14:31, Thomas Waldmann wrote: >> (1) My archive is now 3.4 TB (reported with 'du'), but borg list says >> the deduplicated archive size is 1.82 TB. Why are the two numbers >> off by 50%? Below the full output of my borg list. > > Did you activate append-only mode for the repo? > > While append-only is set, borg prune/delete will not be able to really > remove data. This is actually before I performed any deletions. The disk usage is reported as 3.4 TB by du and df, whereas borg reports the total dedup size as "only" 1.8TB (so approx. 50% of the actual usage). Is this a typical overhead, or is something fishy in my setup? >> (2) In the last months, my backup size went up quite a lot, even though >> I did not change anything in borg. So I'd like to reverse engineer >> which archives (or which files) contribute to the sudden increase in >> size. I tried "borg list" on all archives, but only 7 have ~3 GB of >> deduplicated space, and all others have less than 1 GB of dedup space! >> I assumed 533 archives of ~1 GB dedup size = 533 GB total, > > No, that is only the sum of the space ONLY used by a single archive. > > As soon as the same chunks are used by more than 1 archive, it does not > show up as "unique chunks" any more. > >> How would I find the archives that free most space when deleted? > > For a single archive deletion, that is the unique chunks space > ("deduplicated size") of that archive. > > For multiple archive deletion there is no easy way to see beforehands. Would it be possible to somehow change this reporting in borg? I think I (possibly accidentally) backed up a few huge files for a few days, that now use up 50% of my archive space. Since the chunks are shared, I have no way of knowing which archives are the "bad guys". My only option seems to prune with a shotgun-approach until eventually I get lucky and free significant disk space. If I'm unlucky I can prune a lot before freeing any significant space... I think for example 'du' when used on hard links reports the shared disk usage on the first directory it encounters, and does not duplicate the size of hard links on subsequent directories. Would this be a sane behaviour for borg too? Or add a new field for "shared chunks size"? Thanks a lot for the help, and all the best, Mario Emmenlauer From jgoerzen at complete.org Wed Dec 21 19:37:22 2016 From: jgoerzen at complete.org (John Goerzen) Date: Wed, 21 Dec 2016 18:37:22 -0600 Subject: [Borgbackup] faster / better deletion, for a bounty?
In-Reply-To: <4397537a-b0eb-09d1-a154-76cff2ccf792@emmenlauer.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> <4397537a-b0eb-09d1-a154-76cff2ccf792@emmenlauer.de> Message-ID: On 12/21/2016 04:25 PM, Mario Emmenlauer wrote: > Hi Thomas, > > On 21.12.2016 14:31, Thomas Waldmann wrote: >>> (1) My archive is now 3.4 TB (reported with 'du'), but borg list says >>> the deduplicated archive size is 1.82 TB. Why are the two numbers >>> off by 50%? Below the full output of my borg list. >> Did you activate append-only mode for the repo? >> >> While append-only is set, borg prune/delete will not be able to really >> remove data. > This is actually before I performed any deletions. The disk usage is > reported as 3.4 TB by du and df, whereas borg reports the total dedup > size as "only" 1.8TB (so approx. 50% of the actual usage). Is this a > typical overhead, or is something fishy in my setup? Hi Mario, On my system (which is zfs-backed for the moment), zfs list and df actually show *less* space used than borg does. I'm still trying to figure that one out ;-) If I understand the dedup size correctly -- and that's an *if* since I have not been using borg for more than a few days -- its meaning is /how much space will be freed if you delete just this one archive/. This makes a lot of sense to me, because it is exactly the same way zfs gives me the size of snapshots. If you have very little change in your datasets but a high number of archives, it would be possible for you to have terabytes of data under management and a sum of the dedup size of almost zero. This would not be an error, given the meaning listed. It is also, therefore, expected that if you remove an archive, the dedup size listed in other archives may increase, since if there was a chunk in common between the deleted archive and the other one, it wouldn't have shown up in the dedup size of either (since deleting /just that one archive/ would not free its space), but once one of the two archives is gone, it would be counted to the other. Does that make sense? How you count up space is a funny business when you have deduplication going on. Same when you have hard links in your filesystem. (du can say you've got 50GB in a directory, but you might find that rm -r on it only frees up 50K if there's a lot of hardlinks to other areas.) I think zfs might have a little clearer terminology on this: "referenced" is how much data is pointed to by a given snapshot, and "used" is how much space would be freed if only that one snapshot were deleted right now. That's like borg's archive size and dedup size. John > > >>> (2) In the last months, my backup size went up quite a lot, even though >>> I did not change anything in borg. So I'd like to reverse engineer >>> which archives (or which files) contribute to the sudden increase in >>> size. I tried "borg list" on all archives, but only 7 have ~3 GB of >>> deduplicated space, and all others have less than 1 GB of dedup space! >>> I assumed 533 archives of ~1 GB dedup size = 533 GB total, >> No, that is only the sum of the space ONLY used by a single archive. >> >> As soon as the same chunks are used by more than 1 archive, it does not >> show up as "unique chunks" any more. >> >>> How would I find the archives that free most space when deleted? >> For a single archive deletion, that is the unique chunks space >> ("deduplicated size") of that archive. >> >> For multiple archive deletion there is no easy way to see beforehands. 
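A tiny made-up example: suppose archives A and B share 10 GB of chunks and each also has 1 GB of unique chunks. Both then show a deduplicated size of 1 GB, because deleting either one alone frees only its own unique 1 GB. Delete A, and B's deduplicated size jumps to 11 GB, even though nothing in B changed.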
> Would it be possible to somehow change this reporting in borg? I > think I (possibly accidentally) backed up a few huge files for a few > days, that now use up 50% of my archive space. Since the chunks are > shared, I have no way of knowing which archives are the "bad guys". > My only option seems to prune with a shotgun-approach until eventually > I get lucky and free significant disk space. If I'm unlucky I can > prune a lot before freeing any significant space... > > I think for example 'du' when used on hard links reports the shared > disk usage on the first directory it encounters, and does not duplicate > the size of hard links on subsequent directories. Would this be a sane > behaviour for borg too? Or add a new field for "shared chunks size"? > > > Thanks a lot for the help, and all the best, > > Mario Emmenlauer > > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup -------------- next part -------------- An HTML attachment was scrubbed... URL: From mario at emmenlauer.de Thu Dec 22 04:14:09 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Thu, 22 Dec 2016 10:14:09 +0100 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> <4397537a-b0eb-09d1-a154-76cff2ccf792@emmenlauer.de> Message-ID: <87916a73-0200-780b-ee92-299f3264fed3@emmenlauer.de> Hi John, On 22.12.2016 01:37, John Goerzen wrote: > On 12/21/2016 04:25 PM, Mario Emmenlauer wrote: >> Hi Thomas, >> >> On 21.12.2016 14:31, Thomas Waldmann wrote: >>>> (1) My archive is now 3.4 TB (reported with 'du'), but borg list says >>>> the deduplicated archive size is 1.82 TB. Why are the two numbers >>>> off by 50%? Below the full output of my borg list. >>> Did you activate append-only mode for the repo? >>> >>> While append-only is set, borg prune/delete will not be able to really >>> remove data. >> This is actually before I performed any deletions. The disk usage is >> reported as 3.4 TB by du and df, whereas borg reports the total dedup >> size as "only" 1.8TB (so approx. 50% of the actual usage). Is this a >> typical overhead, or is something fishy in my setup? > > Hi Mario, > > On my system (which is zfs-backed for the moment), zfs list and df actually show > *less* space used than borg does. I'm still trying to figure that one out ;-) Haha that's interesting! Let me know what you find out :-) > If I understand the dedup size correctly -- and that's an *if* since I have not > been using borg for more than a few days -- its meaning is /how much space will > be freed if you delete just this one archive/. This makes a lot of sense to me, > because it is exactly the same way zfs gives me the size of snapshots. > > If you have very little change in your datasets but a high number of archives, > it would be possible for you to have terabytes of data under management and a > sum of the dedup size of almost zero. This would not be an error, given the > meaning listed. > > It is also, therefore, expected that if you remove an archive, the dedup size > listed in other archives may increase, since if there was a chunk in common > between the deleted archive and the other one, it wouldn't have shown up in the > dedup size of either (since deleting /just that one archive/ would not free its > space), but once one of the two archives is gone, it would be counted to the other. 
> Does that make sense? What you say makes perfect sense for a single archive. But borg also reports numbers for "all archives", which I understood to be the numbers for the full repository. Am I on the wrong track there? Because "all archives" is not the sum of the individual archives, so I assumed it's the repo. For the repo, however, I think the dedup size should be equal to the disk size (except for overheads like metadata, index, etc). Therefore I was surprised to see that for me, it's approx. 50% of disk usage. See here the output of borg list on one of my archives:

Number of files: 1796064
                 Original size   Compressed size   Deduplicated size
This archive:         95.27 GB          70.53 GB           178.00 MB
All archives:         78.26 TB          65.13 TB             1.82 TB
                 Unique chunks      Total chunks
Chunk index:           9733154         414693364

Cheers, Mario > How you count up space is a funny business when you have deduplication going > on. Same when you have hard links in your filesystem. (du can say you've got > 50GB in a directory, but you might find that rm -r on it only frees up 50K if > there's a lot of hardlinks to other areas.) > > I think zfs might have a little clearer terminology on this: "referenced" is how > much data is pointed to by a given snapshot, and "used" is how much space would > be freed if only that one snapshot were deleted right now. That's like borg's > archive size and dedup size. > > John > > >> >> >>>> (2) In the last months, my backup size went up quite a lot, even though >>>> I did not change anything in borg. So I'd like to reverse engineer >>>> which archives (or which files) contribute to the sudden increase in >>>> size. I tried "borg list" on all archives, but only 7 have ~3 GB of >>>> deduplicated space, and all others have less than 1 GB of dedup space! >>>> I assumed 533 archives of ~1 GB dedup size = 533 GB total, >>> No, that is only the sum of the space ONLY used by a single archive. >>> >>> As soon as the same chunks are used by more than 1 archive, it does not >>> show up as "unique chunks" any more. >>> >>>> How would I find the archives that free most space when deleted? >>> For a single archive deletion, that is the unique chunks space >>> ("deduplicated size") of that archive. >>> >>> For multiple archive deletion there is no easy way to see beforehands. >> Would it be possible to somehow change this reporting in borg? I >> think I (possibly accidentally) backed up a few huge files for a few >> days, that now use up 50% of my archive space. Since the chunks are >> shared, I have no way of knowing which archives are the "bad guys". >> My only option seems to prune with a shotgun-approach until eventually >> I get lucky and free significant disk space. If I'm unlucky I can >> prune a lot before freeing any significant space... >> >> I think for example 'du' when used on hard links reports the shared >> disk usage on the first directory it encounters, and does not duplicate >> the size of hard links on subsequent directories. Would this be a sane >> behaviour for borg too? Or add a new field for "shared chunks size"? >> >> >> Thanks a lot for the help, and all the best, >> >> Mario Emmenlauer >> >> >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup > Viele Gruesse, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr.
43 mailto: memmenlauer * biodataanalysis.de D-81669 M?nchen http://www.biodataanalysis.de/ From jgoerzen at complete.org Thu Dec 22 09:44:57 2016 From: jgoerzen at complete.org (John Goerzen) Date: Thu, 22 Dec 2016 08:44:57 -0600 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: <87916a73-0200-780b-ee92-299f3264fed3@emmenlauer.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> <4397537a-b0eb-09d1-a154-76cff2ccf792@emmenlauer.de> <87916a73-0200-780b-ee92-299f3264fed3@emmenlauer.de> Message-ID: <38a54745-2747-709d-ef2b-f2c0d58a4da6@complete.org> On 12/22/2016 03:14 AM, Mario Emmenlauer wrote: > What you say makes perfect sense for a single archive. But borg > reports also > numbers for "all archives", which I understood to be the numbers for the full > repository. Am I on the wrong track there? Because "all archives" is not the > sum of the individual archives, so I assumed its the repo. For the repo, > however, I think the dedup size should be equal to the disk size (except for > overheads like meta data, index, etc). Therefore I was surprised to see that > for me, its approx. 50% of disk usage. Ah, what you're saying there does seem to mesh with what's documented. You've got me then. I wonder, what does du -sh over your repo show? And is it any different if you add --apparent-size to du? John > > See here the output of borg list on one of my archives: > Number of files: 1796064 > Original size Compressed size Deduplicated size > This archive: 95.27 GB 70.53 GB 178.00 MB > All archives: 78.26 TB 65.13 TB 1.82 TB > Unique chunks Total chunks > Chunk index: 9733154 414693364 > > Cheers, > > Mario > > > >> How you count up space is a funny business when you have deduplication going >> on. Same when you have hard links in your filesystem. (du can say you've got >> 50GB in a directory, but you might find that rm -r on it only frees up 50K if >> there's a lot of hardlinks to other areas.) >> >> I think zfs might have a little clearer terminology on this: "referenced" is how >> much data is pointed to by a given snapshot, and "used" is how much space would >> be freed if only that one snapshot were deleted right now. That's like borg's >> archive size and dedup size. >> >> John >> >> >>> >>>>> (2) In the last months, my backup size went up quite a lot, even though >>>>> I did not change anything in borg. So I'd like to reverse engineer >>>>> which archives (or which files) contribute to the sudden increase in >>>>> size. I tried "borg list" on all archives, but only 7 have ~3 GB of >>>>> deduplicated space, and all others have less than 1 GB of dedup space! >>>>> I assumed 533 archives of ~1 GB dedup size = 533 GB total, >>>> No, that is only the sum of the space ONLY used by a single archive. >>>> >>>> As soon as the same chunks are used by more than 1 archive, it does not >>>> show up as "unique chunks" any more. >>>> >>>>> How would I find the archives that free most space when deleted? >>>> For a single archive deletion, that is the unique chunks space >>>> ("deduplicated size") of that archive. >>>> >>>> For multiple archive deletion there is no easy way to see beforehands. >>> Would it be possible to somehow change this reporting in borg? I >>> think I (possibly accidentally) backed up a few huge files for a few >>> days, that now use up 50% of my archive space. Since the chunks are >>> shared, I have no way of knowing which archives are the "bad guys". 
>>> My only option seems to prune with a shotgun-approach until eventually >>> I get lucky and free significant disk space. If I'm unlucky I can >>> prune a lot before freeing any significant space... >>> >>> I think for example 'du' when used on hard links reports the shared >>> disk usage on the first directory it encounters, and does not duplicate >>> the size of hard links on subsequent directories. Would this be a sane >>> behaviour for borg too? Or add a new field for "shared chunks size"? >>> >>> >>> Thanks a lot for the help, and all the best, >>> >>> Mario Emmenlauer >>> >>> >>> _______________________________________________ >>> Borgbackup mailing list >>> Borgbackup at python.org >>> https://mail.python.org/mailman/listinfo/borgbackup > > Viele Gruesse, > > Mario Emmenlauer > > > -- > BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 > Balanstr. 43 mailto: memmenlauer * biodataanalysis.de > D-81669 München http://www.biodataanalysis.de/ From mario at emmenlauer.de Fri Dec 23 17:53:24 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Fri, 23 Dec 2016 23:53:24 +0100 Subject: [Borgbackup] extended prune (was: faster / better deletion, for a bounty?) Message-ID: Dear All, it seems I am pretty lucky this year to have an early Christmas present. First, I found the large archives in my repo by pure chance and could free 50% of disk space with only a few deletions. Now the actual disk usage is at 1.8TB again, which matches borg's report of deduplicated size. Furthermore, it seems that those huge deletions were the only "slow" ones, because later I could prune another ~200 of ~500 archives in just little over 10 minutes, with borg 1.0.9. Finally, I seemed unable to get prune to do exactly what I hoped for. It might be me, but I did not find exactly the right combination of options. I take backups once per week, and if they are older than one year, I'd like to keep only every other week. In any case I was also curious to enable prune to handle a manual selection of archives, so I tried, and got it working pretty easily. I extended archive.py and helper.py with two new prune options --keep-list and --remove-list, where the former takes a list of archives to keep (all others are pruned) and the latter takes a list of archives to prune (all others are kept). My patch against borg 1.0.9 is available here https://github.com/emmenlau/borg/tree/emmenlau_better_prune and I'm happy to make a PR if anyone is interested (sorry for the bold name, it's really just a very minor extension to prune). Finally, thanks a lot again for the very nice borg! Your code was very easy to read, and I found very helpful compile instructions in the readme! This allowed me to get productive within a few minutes! Nice work! It would be awesome to add the pyinstaller instructions to the readme, but they were sufficiently easy to find in a GitHub issue report. Thanks, and happy holidays, Mario Emmenlauer From jgoerzen at complete.org Fri Dec 23 17:59:56 2016 From: jgoerzen at complete.org (John Goerzen) Date: Fri, 23 Dec 2016 16:59:56 -0600 Subject: [Borgbackup] extended prune (was: faster / better deletion, for a bounty?) In-Reply-To: References: Message-ID: Hi Mario, Just a couple quick comments: 1) Would those prune patches better be a 'borg delete' patch allowing specification of multiple archives to zap? 2) You might try a hostname-monthly- or hostname-weekly- pattern in your archive naming to let you achieve what you want with prune.
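For example (all names hypothetical):

    borg prune --prefix myhost-weekly- --keep-weekly 52 /path/to/repo
    borg prune --prefix myhost-monthly- --keep-monthly 24 /path/to/repo
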
John On 12/23/2016 04:53 PM, Mario Emmenlauer wrote: > Dear All, > > it seems I am pretty lucky this year to have an early Christmas > present. First, I found the large archives in my repo by pure > chance and could free 50% of disk space with only a few deletions. > Now the actual disk usage is at 1.8TB again, which matches borg's > report of deduplicated size. > > Furthermore, it seems that those huge deletions where the only > "slow" ones, because later I could prune another ~200 of ~500 > archives in just little over 10 minutes, with borg 1.0.9. > > Finally, I seemed unable to get prune do exactly what I hoped for. > It might be me, but I did not find exactly the right combination of > options. I take backups once per week, and if they are older than > one year, I'd like to keep only every other week. > In any case I was also curious to enable prune to handle a manual > selection of archives, so I tried, and got it working pretty easily. > I extended archive.py and helper.py with two new prune options > --keep-list and --remove-list, where the former takes a list of > archives to keep (all others are pruned) and the latter takes a > list of archives to prune (all others are kept). My patch against > borg 1.0.9 is available here > https://github.com/emmenlau/borg/tree/emmenlau_better_prune > and I'm happy to make a PR if anyone is interested (sorry for the > bold name, its really just a very minor extension to prune). > > > Finally, thanks a lot again for the very nice borg! Your code was > very easy to read, and I found very helpful compile instructions > in the readme! This allowed me to get productive within a few > minutes! Nice work! It would be awesome to add the pyinstaller > instructions to the readme, but they where sufficiently easy to > find in an github issue report. > > Thanks, and happy holidays, > > Mario Emmenlauer > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From mario at emmenlauer.de Fri Dec 23 18:21:00 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Sat, 24 Dec 2016 00:21:00 +0100 Subject: [Borgbackup] extended prune (was: faster / better deletion, for a bounty?) In-Reply-To: References: Message-ID: Hi, wow that was a quick reply! :-) Thanks! :-) On 23.12.2016 23:59, John Goerzen wrote: > Just a couple quick comments: > > 1) Would those prune patches better be a 'borg delete' patch allowing > specification of multiple archives to zap? I'm fully open to that. My logic was that prune and delete are separated by the fact that prune performs multiple deletions in one go, and so my patch would fit prune more than delete. But I'm really new to borg, and any advice is happily accepted! What do others think? Is my patch worthwhile at all? > 2) You might try a hostname-monthly- or hostname-weekly- pattern in your > archive naming to let you achieve what you want with prune. Yes, as long as I'd keep a similar pattern it's true. But I'm very fond of the new option because for my "huge" prune I was able to pick a crude mix of hand-picked archives together with different patterns for different times of different hosts. With some bash-foo that was less than ten minutes of work, and I could pass them all to prune in one go (hopefully making the best use of disk I/O). In fact it would be trivial to pick any other wild selection like the X largest archives or whatever, by combining borg's statistics with the new --remove-list option.
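For illustration, with the patched prune it could look about like this (the exact argument form may differ from my branch; a file of archive names is assumed here, and the grep pattern is invented):

    borg list --short /path/to/repo | grep '^oldhost-2014-' > remove.txt
    borg prune --remove-list remove.txt /path/to/repo
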
But really this is just me, and admittedly I did not invest too much time to try to understand prune's current behaviour :-) Oh and a related note: Thomas mentioned in IRC the idea that borg could use a garbage collection instead of immediate deletions. I very much cherish this idea because deletions could be instant, and disk space can be freed with "borg gc" whenever suitable (i.e. after a long repo re-organization with deletions, renames, new backups etc). Cheers, Mario > John > > On 12/23/2016 04:53 PM, Mario Emmenlauer wrote: >> Dear All, >> >> it seems I am pretty lucky this year to have an early Christmas >> present. First, I found the large archives in my repo by pure >> chance and could free 50% of disk space with only a few deletions. >> Now the actual disk usage is at 1.8TB again, which matches borg's >> report of deduplicated size. >> >> Furthermore, it seems that those huge deletions where the only >> "slow" ones, because later I could prune another ~200 of ~500 >> archives in just little over 10 minutes, with borg 1.0.9. >> >> Finally, I seemed unable to get prune do exactly what I hoped for. >> It might be me, but I did not find exactly the right combination of >> options. I take backups once per week, and if they are older than >> one year, I'd like to keep only every other week. >> In any case I was also curious to enable prune to handle a manual >> selection of archives, so I tried, and got it working pretty easily. >> I extended archive.py and helper.py with two new prune options >> --keep-list and --remove-list, where the former takes a list of >> archives to keep (all others are pruned) and the latter takes a >> list of archives to prune (all others are kept). My patch against >> borg 1.0.9 is available here >> https://github.com/emmenlau/borg/tree/emmenlau_better_prune >> and I'm happy to make a PR if anyone is interested (sorry for the >> bold name, its really just a very minor extension to prune). >> >> >> Finally, thanks a lot again for the very nice borg! Your code was >> very easy to read, and I found very helpful compile instructions >> in the readme! This allowed me to get productive within a few >> minutes! Nice work! It would be awesome to add the pyinstaller >> instructions to the readme, but they where sufficiently easy to >> find in an github issue report. >> >> Thanks, and happy holidays, >> >> Mario Emmenlauer Viele Gruesse, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/ From jdc at uwo.ca Fri Dec 23 19:33:23 2016 From: jdc at uwo.ca (Dan Christensen) Date: Fri, 23 Dec 2016 19:33:23 -0500 Subject: [Borgbackup] extended prune In-Reply-To: (Mario Emmenlauer's message of "Fri, 23 Dec 2016 23:53:24 +0100") References: Message-ID: <87y3z6kvxo.fsf@uwo.ca> On Dec 23, 2016, Mario Emmenlauer wrote: > Finally, I seemed unable to get prune do exactly what I hoped for. > It might be me, but I did not find exactly the right combination of > options. I take backups once per week, and if they are older than > one year, I'd like to keep only every other week. Would "--keep-within 1y --keep-monthly -1" do the trick? I think it would occasionally skip two weeklies, if three happened to fit into one month, but maybe that's close enough? You can test with "-n".
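That is, something like (repo path a placeholder):

    borg prune -n -v --keep-within 1y --keep-monthly -1 /path/to/repo
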
Dan From mario at emmenlauer.de Sat Dec 24 06:48:56 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Sat, 24 Dec 2016 12:48:56 +0100 Subject: [Borgbackup] extended prune In-Reply-To: <87y3z6kvxo.fsf@uwo.ca> References: <87y3z6kvxo.fsf@uwo.ca> Message-ID: Hi Dan, thanks for the reply! More below: On 24.12.2016 01:33, Dan Christensen wrote: > On Dec 23, 2016, Mario Emmenlauer wrote: > >> Finally, I seemed unable to get prune do exactly what I hoped for. >> It might be me, but I did not find exactly the right combination of >> options. I take backups once per week, and if they are older than >> one year, I'd like to keep only every other week. > > Would "--keep-within 1y --keep-monthly -1" do the trick? I think it > would occasionally skip two weeklies, if three happened to fit into one > month, but maybe that's close enough? You can test with "-n". It seems this will keep one monthly archive, is that possible? At least when I checked it seemed it would prune from 2015 everything except one monthly archive. I would prefer to keep two monthly archives. Even better, I would love to keep one archive per month for backups older than 24 months, two archives per month for backups 12 to 24 months old, and weekly archives for the past 12 months. And I have eight hosts in the same repository, so this pattern should apply per host. To make the best use of disk-IO, I'd like to combine all this into a single prune. With the new --remove-list option (and some bash-foo), this is really easy for me to achieve. Thanks and Cheers, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/ From public at enkore.de Sat Dec 24 06:53:01 2016 From: public at enkore.de (Marian Beermann) Date: Sat, 24 Dec 2016 12:53:01 +0100 Subject: [Borgbackup] extended prune (was: faster / better deletion, for a bounty?) In-Reply-To: References: Message-ID: <889a59c7-7fee-6b95-af84-3bd22d72ffc1@enkore.de> Hi Mario and John, Re. 1.) --remove-list feels a bit strange in prune -- usually it's run automatically, why would one need to always remove the same archive [name] over and over? I feel like this would be better suited to 'delete' with a syntax like borg delete repo::archive1 archive2 archive3 ... A bit awkward due to the whole :: thing (can't change that anymore), but ok I guess. What do you think? --keep-list on the other hand could make sense to preserve some important archives forever (without having to rename them outside the --prefix). I like it. Cheers, Marian On 23.12.2016 23:59, John Goerzen wrote: > Hi Mario, > > Just a couple quick comments: > > 1) Would those prune patches better be a 'borg delete' patch allowing > specification of multiple archives to zap? > > 2) You might try a hostname-monthly- or hostname-weekly- pattern in your > archive naming to let you achieve what you want with prune. > > John > > On 12/23/2016 04:53 PM, Mario Emmenlauer wrote: >> Dear All, >> >> it seems I am pretty lucky this year to have an early Christmas >> present. First, I found the large archives in my repo by pure >> chance and could free 50% of disk space with only a few deletions. >> Now the actual disk usage is at 1.8TB again, which matches borg's >> report of deduplicated size.
>> >> Furthermore, it seems that those huge deletions where the only >> "slow" ones, because later I could prune another ~200 of ~500 >> archives in just little over 10 minutes, with borg 1.0.9. >> >> Finally, I seemed unable to get prune do exactly what I hoped for. >> It might be me, but I did not find exactly the right combination of >> options. I take backups once per week, and if they are older than >> one year, I'd like to keep only every other week. >> In any case I was also curious to enable prune to handle a manual >> selection of archives, so I tried, and got it working pretty easily. >> I extended archive.py and helper.py with two new prune options >> --keep-list and --remove-list, where the former takes a list of >> archives to keep (all others are pruned) and the latter takes a >> list of archives to prune (all others are kept). My patch against >> borg 1.0.9 is available here >> https://github.com/emmenlau/borg/tree/emmenlau_better_prune >> and I'm happy to make a PR if anyone is interested (sorry for the >> bold name, its really just a very minor extension to prune). >> >> >> Finally, thanks a lot again for the very nice borg! Your code was >> very easy to read, and I found very helpful compile instructions >> in the readme! This allowed me to get productive within a few >> minutes! Nice work! It would be awesome to add the pyinstaller >> instructions to the readme, but they where sufficiently easy to >> find in an github issue report. >> >> Thanks, and happy holidays, >> >> Mario Emmenlauer >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > From jdc at uwo.ca Sat Dec 24 08:45:30 2016 From: jdc at uwo.ca (Dan Christensen) Date: Sat, 24 Dec 2016 08:45:30 -0500 Subject: [Borgbackup] extended prune In-Reply-To: (Mario Emmenlauer's message of "Sat, 24 Dec 2016 12:48:56 +0100") References: <87y3z6kvxo.fsf@uwo.ca> Message-ID: <87h95tl9tx.fsf@uwo.ca> On Dec 24, 2016, Mario Emmenlauer wrote: > On 24.12.2016 01:33, Dan Christensen wrote: > >> Would "--keep-within 1y --keep-monthly -1" do the trick? I think it >> would occasionally skip two weeklies, if three happened to fit into one >> month, but maybe that's close enough? You can test with "-n". > > It seems this will keep one monthly archive, is that possible? You are right. I don't think you can achieve what you want with the current pruning options. Dan From dastapov at gmail.com Sun Dec 25 17:27:59 2016 From: dastapov at gmail.com (Dmitry Astapov) Date: Sun, 25 Dec 2016 22:27:59 +0000 Subject: [Borgbackup] Why does borg delete/prune write a bunch of new data? In-Reply-To: References: Message-ID: I'm also shipping my backups off to S3, and I support this idea (I also now know why my S3 bills are constantly higher than I expect them to be :) On Tue, Dec 20, 2016 at 11:09 PM, John Goerzen wrote: > I've done some digging into this, and it seems the reason is > compact_segments() in repository.py. > > It both deletes the segments that are completely unused, and also (if > I'm understanding correctly), takes segments containing some objects > that are unused and some objects that are still used and writes new > segments containing only the used objects. > > The end result is some space savings, at the cost of a lot of I/O. 
I > wonder how hard it would be to support deleting unused segments without > bothering to rewrite segments that are partially used? > > thanks, > > John > > On 12/20/2016 08:28 AM, John Goerzen wrote: > > Hi folks, > > > > So I'm doing some testing of Borg. My ultimate aim is to rsync the > > backups to a dumb (WebDAV or S3-type) host. > > > > I made a run of borg over a real subset of my data, about 80GB worth. > > I then cleaned up and deleted a good chunk of data throughout that > > area, and made another archive with borg create. > > > > So far so good. Now I ran borg delete to remove the archive with all > > the extra data. Sure enough, about 2GB freed up on the disk after. > > > > However, watching the process with strace and examining the > > filesystem, I observed it wrote a considerable amount of new segments > > to the data directory. A little analysis with ls and du shows it > > wrote right around 2GB of new segments. (It also, of course, unlinked > > a considerable number of segments.) > > > > Having to rsync 2GB of new data every time I delete data is going to > > be rather sub-optimal on my poor DSL. Any ideas why it's doing this? > > FWIW the index file is only a few tens of MBs. > > > > I'm using encryption and lzma compression. I did double the > > max_segment_size from 5MB to 10MB (a lot of experience with obnam > > suggested this would improve the performance over the rsync situation) > > > > Thanks, > > > > John > > > > _______________________________________________ > > Borgbackup mailing list > > Borgbackup at python.org > > https://mail.python.org/mailman/listinfo/borgbackup > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > -- Dmitry Astapov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jgoerzen at complete.org Sun Dec 25 17:35:08 2016 From: jgoerzen at complete.org (John Goerzen) Date: Sun, 25 Dec 2016 16:35:08 -0600 Subject: [Borgbackup] Why does borg delete/prune write a bunch of new data? In-Reply-To: References: Message-ID: <7b2c9016-d9a6-322b-1b5b-853ab83785e1@complete.org> Out of curiosity, what tool are you using to send them to S3? On 12/25/2016 04:27 PM, Dmitry Astapov wrote: > I'm also shipping my backups off to S3, and I support this idea (I > also now know why my S3 bills are constantly higher than I expect them > to be :) > > On Tue, Dec 20, 2016 at 11:09 PM, John Goerzen > wrote: > > I've done some digging into this, and it seems the reason is > compact_segments() in repository.py. > > It both deletes the segments that are completely unused, and also (if > I'm understanding correctly), takes segments containing some objects > that are unused and some objects that are still used and writes new > segments containing only the used objects. > > The end result is some space savings, at the cost of a lot of I/O. I > wonder how hard it would be to support deleting unused segments > without > bothering to rewrite segments that are partially used? > > thanks, > > John > > On 12/20/2016 08:28 AM, John Goerzen wrote: > > Hi folks, > > > > So I'm doing some testing of Borg. My ultimate aim is to rsync the > > backups to a dumb (WebDAV or S3-type) host. > > > > I made a run of borg over a real subset of my data, about 80GB > worth. > > I then cleaned up and deleted a good chunk of data throughout that > > area, and made another archive with borg create. > > > > So far so good.
Now I ran borg delete to remove the archive > with all > > the extra data. Sure enough, about 2GB freed up on the disk after. > > > > However, watching the process with strace and examining the > > filesystem, I observed it wrote a considerable amount of new > segments > > to the data directory. A little analysis with ls and du shows it > > wrote right around 2GB of new segments. (It also, of course, > unlinked > > a considerable number of segments.) > > > > Having to rsync 2GB of new data every time I delete data is going to > > be rather sub-optimal on my poor DSL. Any ideas why it's doing > this? > > FWIW the index file is only a few tens of MBs. > > > > I'm using encryption and lzma compression. I did double the > > max_segment_size from 5MB to 10MB (a lot of experience with obnam > > suggested this would improve the performance over the rsync > situation) > > > > Thanks, > > > > John > > > > _______________________________________________ > > Borgbackup mailing list > > Borgbackup at python.org > > https://mail.python.org/mailman/listinfo/borgbackup > > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > > > > > > -- > Dmitry Astapov -------------- next part -------------- An HTML attachment was scrubbed... URL: From dastapov at gmail.com Sun Dec 25 17:36:08 2016 From: dastapov at gmail.com (Dmitry Astapov) Date: Sun, 25 Dec 2016 22:36:08 +0000 Subject: [Borgbackup] Why does borg delete/prune write a bunch of new data? In-Reply-To: <7b2c9016-d9a6-322b-1b5b-853ab83785e1@complete.org> References: <7b2c9016-d9a6-322b-1b5b-853ab83785e1@complete.org> Message-ID: "s3cmd sync" from s3tools.org On Sun, Dec 25, 2016 at 10:35 PM, John Goerzen wrote: > Out of curiosity, what tool are you using to send them to S3? > > > On 12/25/2016 04:27 PM, Dmitry Astapov wrote: > > I'm also shipping my backups off to S3, and I support this idea (I also > now know why my S3 bills are constantly higher than I expect them to be :) > > On Tue, Dec 20, 2016 at 11:09 PM, John Goerzen > wrote: > >> I've done some digging into this, and it seems the reason is >> compact_segments() in repository.py. >> >> It both deletes the segments that are completely unused, and also (if >> I'm understanding correctly), takes segments containing some objects >> that are unused and some objects that are still used and writes new >> segments containing only the used objects. >> >> The end result is some space savings, at the cost of a lot of I/O. I >> wonder how hard it would be to support deleting unused segments without >> bothering to rewrite segments that are partially used? >> >> thanks, >> >> John >> >> On 12/20/2016 08:28 AM, John Goerzen wrote: >> > Hi folks, >> > >> > So I'm doing some testing of Borg. My ultimate aim is to rsync the >> > backups to a dumb (WebDAV or S3-type) host. >> > >> > I made a run of borg over a real subset of my data, about 80GB worth. >> > I then cleaned up and deleted a good chunk of data throughout that >> > area, and made another archive with borg create. >> > >> > So far so good. Now I ran borg delete to remove the archive with all >> > the extra data. Sure enough, about 2GB freed up on the disk after. >> > >> > However, watching the process with strace and examining the >> > filesystem, I observed it wrote a considerable amount of new segments >> > to the data directory. A little analysis with ls and du shows it >> > wrote right around 2GB of new segments.
(It also, of course, unlinked >> > a considerable number of segments.) >> > >> > Having to rsync 2GB of new data every time I delete data is going to >> > be rather sub-optimal on my poor DSL. Any ideas why it's doing this? >> > FWIW the index file is only a few tens of MBs. >> > >> > I'm using encryption and lzma compression. I did double the >> > max_segment_size from 5MB to 10MB (a lot of experience with obnam >> > suggested this would improve the performance over the rsync situation) >> > >> > Thanks, >> > >> > John >> > >> > _______________________________________________ >> > Borgbackup mailing list >> > Borgbackup at python.org >> > https://mail.python.org/mailman/listinfo/borgbackup >> >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup >> > > > > -- > Dmitry Astapov > > > -- Dmitry Astapov -------------- next part -------------- An HTML attachment was scrubbed... URL: From borg at picturenow.co.uk Thu Dec 29 09:45:21 2016 From: borg at picturenow.co.uk (Iain Mac Donald) Date: Thu, 29 Dec 2016 14:45:21 +0000 Subject: [Borgbackup] Borg & Maildir Message-ID: <20161229144521.4efe9a85@flora.coachhouse> I have been running Borg for a week for our home server & family laptops and everything is working just great. The one problem I foresee is restoring deleted emails. Emails are stored in Maildir format, filenames aren't seen by users, many folders have high daily activity and some folders contain 10s of thousands of files. Identifying which file to restore would be very difficult. The only way I can think of doing it is by restoring a whole folder to a temporary location accessible to the IMAP server and then using an IMAP client to identify the deleted message and copying it back to the desired folder. Another approach might be doing text string searches on the Borg backups, within date ranges, but I'm not sure how to do that. I realise this is a bit off-topic but I was hoping someone on the list might have a better solution to this problem. Regards, Iain. From sitaramc at gmail.com Thu Dec 29 10:42:54 2016 From: sitaramc at gmail.com (Sitaram Chamarty) Date: Thu, 29 Dec 2016 21:12:54 +0530 Subject: [Borgbackup] Borg & Maildir In-Reply-To: <20161229144521.4efe9a85@flora.coachhouse> References: <20161229144521.4efe9a85@flora.coachhouse> Message-ID: <20161229154254.GB22090@sita-lt.atc.tcs.com> On Thu, Dec 29, 2016 at 02:45:21PM +0000, Iain Mac Donald wrote: > > I have been running Borg for a week for our home server & family > laptops and everything is working just great. > > The one problem I foresee is restoring deleted emails. Emails are > stored in Maildir format, filenames aren't seen by users, many > folders have high daily activity and some folders contain 10s of > thousands of files. Identifying which file to restore would be very > difficult. > > The only way I can think of doing it is by restoring a whole folder to > a temporary location accessible to the IMAP server and then using an > IMAP client to identify the deleted message and copying it back to the > desired folder. Another approach might be doing text string searches on > the Borg backups, within date ranges, but I'm not sure how to do that. > > I realise this is a bit off-topic but I was hoping someone on the list > might have a better solution to this problem. Just use a mail client that can detect (and delete) duplicate mails. 
Once you have that, do what you said above -- i.e., restore the whole folder to a temp location -- then copy all mails from there to the main one. When that is done, delete duplicates. Thunderbird has an extension for this. In mutt, just use the pattern '~=' and delete all mails with that pattern. I'm sure other mail clients have something. The big downside to this is when you really only need a few mails but the maildir has tons of them: you'd be processing all of those mails just to grab one. Playing around with 'formail' and message-IDs may also help. regards sitaram From borg at picturenow.co.uk Thu Dec 29 11:22:24 2016 From: borg at picturenow.co.uk (Iain Mac Donald) Date: Thu, 29 Dec 2016 16:22:24 +0000 Subject: [Borgbackup] Borg & Maildir In-Reply-To: <20161229154254.GB22090@sita-lt.atc.tcs.com> References: <20161229144521.4efe9a85@flora.coachhouse> <20161229154254.GB22090@sita-lt.atc.tcs.com> Message-ID: <20161229162224.527f73d8@flora.coachhouse> On Thu, 29 Dec 2016 21:12:54 +0530 Sitaram Chamarty wrote: > Just use a mail client that can detect (and delete) duplicate > mails. We all use Claws Mail, which does have a "delete duplicate emails" tool. I'll give it a try on a test folder and see how it goes. Not sure I'd like to try that on, for example, my Sent folder, which has more than 50,000 emails (and is a few gigs in size). Ironically, a catastrophic failure, disc failure for instance, is probably easier to deal with than restoring a few emails. The latter is the more common occurrence in my experience. Thanks for the suggestion. Regards, Iain. From third07 at gmail.com Thu Dec 29 11:49:59 2016 From: third07 at gmail.com (Ed F.) Date: Thu, 29 Dec 2016 10:49:59 -0600 Subject: [Borgbackup] Borg & Maildir In-Reply-To: <20161229162224.527f73d8@flora.coachhouse> References: <20161229144521.4efe9a85@flora.coachhouse> <20161229154254.GB22090@sita-lt.atc.tcs.com> <20161229162224.527f73d8@flora.coachhouse> Message-ID: On Thu, Dec 29, 2016 at 10:22 AM, Iain Mac Donald wrote: > Ironically, a catastrophic failure, disc failure for instance, is > probably easier to deal with than restoring a few emails. The latter > is the more common occurrence in my experience. Use borg to FUSE mount an archive, and then use rsync with suitable options to restore any updated files. Ed From mario at emmenlauer.de Thu Dec 29 13:08:53 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Thu, 29 Dec 2016 19:08:53 +0100 Subject: [Borgbackup] Borg & Maildir In-Reply-To: References: <20161229144521.4efe9a85@flora.coachhouse> <20161229154254.GB22090@sita-lt.atc.tcs.com> <20161229162224.527f73d8@flora.coachhouse> Message-ID: <72b44801-5507-75ae-fc19-2cacabb5b0ef@emmenlauer.de> Dear Iain, On 29.12.2016 17:49, Ed F. wrote: > On Thu, Dec 29, 2016 at 10:22 AM, Iain Mac Donald wrote: > >> Ironically, a catastrophic failure, disc failure for instance, is >> probably easier to deal with than restoring a few emails. The latter >> is the more common occurrence in my experience. > > Use borg to FUSE mount an archive, and then use rsync with suitable > options to restore any updated files. I think Ed's suggestion is the most suitable one. After you fuse-mount the backup, you can rsync it to the mail server (if it's not the same machine), while preserving all time stamps etc. Then it should be fairly easy to use rsync with --dry-run to perform a "dummy-sync" of the backup with the current folder. This will list all files that were deleted, and ignore all others. This way you cheaply get a list of the deleted emails.
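For illustration, a minimal sketch of that dry-run comparison (the repo path, archive name and Maildir locations are made-up examples):

# mount the backup archive read-only via FUSE
borg mount /path/to/repo::home-2016-12-28 /mnt/borg
# -n is a dry run: list files that exist in the backup but are missing
# from the live Maildir (i.e. the deleted mails); --ignore-existing makes
# rsync skip everything that is still present
rsync -anv --ignore-existing /mnt/borg/home/user/Maildir/ /home/user/Maildir/
# unmount when done
fusermount -u /mnt/borg

Dropping the -n would actually copy those missing files back into place.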
With some luck this list is a lot shorter than your 50,000 emails, and you can more easily nail it down to the ones you want to restore... Best regards, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/ From borg at picturenow.co.uk Thu Dec 29 13:48:29 2016 From: borg at picturenow.co.uk (Iain Mac Donald) Date: Thu, 29 Dec 2016 18:48:29 +0000 Subject: [Borgbackup] Borg & Maildir In-Reply-To: <72b44801-5507-75ae-fc19-2cacabb5b0ef@emmenlauer.de> References: <20161229144521.4efe9a85@flora.coachhouse> <20161229154254.GB22090@sita-lt.atc.tcs.com> <20161229162224.527f73d8@flora.coachhouse> <72b44801-5507-75ae-fc19-2cacabb5b0ef@emmenlauer.de> Message-ID: <20161229184829.1e890c65@flora.coachhouse> On Thu, 29 Dec 2016 19:08:53 +0100 Mario Emmenlauer wrote: > I think Ed's suggestion is the most suitable one. After you fuse-mount > the backup, you can rsync it to the mail server (if it's not the same > machine), while preserving all time stamps etc. Ed & Mario, Thanks, guys! I think that sounds like a plan for dealing with bigger folders. The mail server and the backup server are the same machine, which probably makes things much easier. I'll do a test run in the next few days. Regards, Iain. From tve at voneicken.com Sat Dec 31 15:41:59 2016 From: tve at voneicken.com (Thorsten von Eicken) Date: Sat, 31 Dec 2016 20:41:59 +0000 Subject: [Borgbackup] extract pattern never matched Message-ID: <01000159569e1cad-aef532ce-1ab3-46ed-9b8e-b2255a3252b8-000000@email.amazonses.com> How are the extract patterns supposed to work? I can get the sh patterns to work, but not the fm ones. Example: # borg list backup at backup:/big/h/home::home-2016-08-31T03:18-0700 | egrep big/home/weather/home/tve/relay drwxrwxr-x tve tve 0 Sun, 2016-07-24 22:22:53 big/home/weather/home/tve/relay -rw-rw-r-- tve tve 284 Mon, 2013-12-02 23:44:04 big/home/weather/home/tve/relay/aprs2-servers -rw-r--r-- root root 5256288 Mon, 2016-02-01 00:22:27 big/home/weather/home/tve/relay/cwop.log.1.xz -rwxrwxr-x tve tve 34 Sun, 2014-02-23 15:48:26 big/home/weather/home/tve/relay/doit -rw-rw-r-- tve tve 154513244 Sun, 2015-04-26 16:40:49 big/home/weather/home/tve/relay/nohup.out.1.xz -rw------- tve tve 17961084 Sun, 2015-06-14 10:34:13 big/home/weather/home/tve/relay/nohup.out.2.xz -rw------- tve tve 13316988 Sat, 2015-07-18 17:24:47 big/home/weather/home/tve/relay/nohup.out.3.xz -rw------- tve tve 15975004 Sat, 2015-10-10 22:15:35 big/home/weather/home/tve/relay/nohup.out.4.xz -rw------- tve tve 15338236 Sat, 2015-10-10 22:15:37 big/home/weather/home/tve/relay/nohup.out.5.xz -rw------- tve tve 16072328 Tue, 2015-11-24 10:57:43 big/home/weather/home/tve/relay/nohup.out.6.xz -rw------- tve tve 11614156 Sun, 2015-12-27 15:31:29 big/home/weather/home/tve/relay/nohup.out.7.xz -rw------- tve tve 17320288 Thu, 2016-02-18 21:20:05 big/home/weather/home/tve/relay/nohup.out.8.xz -rw-rw-r-- tve tve 26109380 Sun, 2016-07-24 22:18:42 big/home/weather/home/tve/relay/nohup.out.9.xz -rwxrwxr-x tve tve 3227 Mon, 2014-11-17 21:57:51 big/home/weather/home/tve/relay/relay.rb Now a specific extract of one file, which works: # borg extract --strip-components 6 backup at backup:/big/h/home::home-2016-08-31T03:18-0700 big/home/weather/home/tve/relay/nohup.out.9.xz # ls -ls nohup.out.9.xz 25500 -rw-rw-r-- 1 tve tve 26109380 Jul 24 22:18 nohup.out.9.xz But a pattern extract fails -- why?
# borg extract --strip-components 6 backup at backup:/big/h/home::home-2016-08-31T03:18-0700 'big/home/weather/home/tve/relay/nohup*' Include pattern 'big/home/weather/home/tve/relay/nohup*' never matched. The same pattern, given as a shell pattern, works: # borg extract --strip-components 6 backup at backup:/big/h/home::home-2016-08-31T03:18-0700 'sh:big/home/weather/home/tve/relay/nohup*' # ls -ls nohup.out.9.xz 25500 -rw-rw-r-- 1 tve tve 26109380 Jul 24 22:18 nohup.out.9.xz Borg client: # borg debug-info Platform: Linux h 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 Linux: debian stretch/sid Borg: 1.0.8 Python: CPython 3.5.2 PID: 18886 CWD: /tmp/relay sys.argv: ['borg', 'debug-info'] SSH_ORIGINAL_COMMAND: None Borg server: # borg debug-info Platform: Linux backup 3.10.104-3-ARCH #1 SMP PREEMPT Mon Nov 14 18:37:24 MST 2016 armv7l Linux: arch Borg: 1.0.8 Python: CPython 3.5.2 PID: 15160 CWD: /home/tve sys.argv: ['/usr/bin/borg', 'debug-info'] SSH_ORIGINAL_COMMAND: None
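One plausible explanation - this is an assumption about borg 1.0's PATH handling, not something verified against the 1.0.8 source: without a style selector, a PATH argument to borg extract may be taken as a literal path, so the '*' is never treated as a wildcard; only an explicit selector such as sh: (and presumably fm: via the same parser) makes borg interpret the argument as a pattern. Under that assumption, naming the fnmatch style explicitly should behave like the working sh: example above (the fm: spelling for extract is hypothetical here):

# borg extract --strip-components 6 backup at backup:/big/h/home::home-2016-08-31T03:18-0700 'fm:big/home/weather/home/tve/relay/nohup*'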