From melkor.lord at gmail.com  Sat Oct  1 14:25:46 2016
From: melkor.lord at gmail.com (Melkor Lord)
Date: Sat, 1 Oct 2016 20:25:46 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
Message-ID:

Hi,

Playing a bit with Borg made me think of something that could be very useful. It's just a rough idea, and it involves extending "stdin" via "borg serve".

We can already do something like:

    someapp | borg create /path/to/repo::name -

It would be really useful to extend that to "borg serve": take the output of an arbitrary command as the data to back up (stdout) and display any errors (stderr) should they occur.

Typically, this would be useful for remote backups of application data such as SQL dumps, LDAP dumps, whatever... Something like:

    borg create --remote-cmd user@host: -- mysqldump [options] db.sql

or

    borg create --remote-cmd user@host: -- somebigscriptdoingstuff

This would connect to "user@host", and "--remote-cmd" would instruct the remote "borg serve" process to execute whatever comes after "--" and retrieve stdout, stderr and also the returncode of the executed app.

If "--remote-cmd" is used, no path would be allowed after the ":" in the "user@host:" part, and "--" would be required. This would allow complex commands to be used without tricky shell escaping.

What do you think?

A nice addition to that would be an option like "--skip-cmd-fail" that would NOT create the backup (or create it and delete it afterwards, if I understand the way Borg currently works) if the returncode of the remote command is non-zero. Even better, for strange and unusual commands, "--skip-cmd-fail=1,2,3" would list the returncodes that trigger the non-creation of the backup.

This kind of feature would be a huge life saver for admins.

PS: I know we can currently do something like:

    ssh user@host "do-stuff" | borg create /path/to/repo::name -

but this lacks the more thorough controls borg could achieve by wrapping the whole process.

-- 
Unix _IS_ user friendly, it's just selective about who its friends are.

From tw at waldmann-edv.de  Sat Oct  1 14:43:54 2016
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Sat, 1 Oct 2016 20:43:54 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To:
References:
Message-ID: <771d30a8-4a03-dc55-2909-233c3d843428@waldmann-edv.de>

> PS: I know we can currently do something like:
>     ssh user@host "do-stuff" | borg create /path/to/repo::name -
> but this lacks the more thorough controls borg could achieve by wrapping
> the whole process.

I just wanted to suggest you do exactly that.

In general, borg does not try to be a shell / scripting thing / scheduler / etc.

Just using a shell, a scripting language, cron, etc. is usually better for this than reinventing it all inside borg.

-- 
GPG ID: 9F88FB52FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393

From wtraylor at areyouthinking.org  Sat Oct  1 14:45:39 2016
From: wtraylor at areyouthinking.org (Walker Traylor)
Date: Sun, 2 Oct 2016 01:45:39 +0700
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To:
References:
Message-ID: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>

"PS: I know we can currently do something like:
ssh user@host "do-stuff" | borg create /path/to/repo::name -
but this lacks the more thorough controls borg could achieve by wrapping
the whole process."

What more thorough controls do you require?
Walker Traylor
walker at walkertraylor.com

> On Oct 2, 2016, at 1:25 AM, Melkor Lord wrote:
> [...]

From tw at waldmann-edv.de  Sat Oct  1 17:30:43 2016
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Sat, 1 Oct 2016 23:30:43 +0200
Subject: [Borgbackup] borgbackup beta 1.1.0b2 released
Message-ID: <90fe9cd3-3fa6-4a1d-c36f-a36201ccf9b3@waldmann-edv.de>

https://github.com/borgbackup/borg/releases/tag/1.1.0b2

More details: see the URL above.

Cheers, Thomas

-- 
GPG ID: 9F88FB52FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393

From melkor.lord at gmail.com  Sat Oct  1 19:38:28 2016
From: melkor.lord at gmail.com (Melkor Lord)
Date: Sun, 2 Oct 2016 01:38:28 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
Message-ID:

On Sat, Oct 1, 2016 at 8:45 PM, Walker Traylor wrote:

> What more thorough controls do you require?

Quite simple: if "do-stuff" fails for some reason, there's a lot of work to determine how to suppress the "failed" backup that Borg is going to make anyway.

    ssh user@host "echo '' ; /bin/false" | borg create /path/to/repo::test -
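Checking afterwards shows nothing wrong, because the pipeline's exit status is borg's (illustrative session, not a real run):

    $ echo $?
    0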
See? Now there's a "test" backup which is empty and serves no purpose at all. Because of the pipe, I don't know that the ssh command failed. Borg didn't fail, but I have no way to distinguish a "legit but no new data" kind of "test" backup from a "failed, empty, useless data" one.

As I see it coming: NO, I will NOT use Bash for scripting just to get the $PIPESTATUS variable, because Bash isn't available everywhere. I love Bash for my everyday shell usage, but I never use it in scripts; I require /bin/sh (dash on Debian and derivatives) because POSIX scripts work the same way everywhere, without side effects (dash, busybox ash, whatever POSIX-compliant shell).
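For the record, the failure CAN be caught in plain POSIX sh with a status file - a rough, untested sketch ("do-stuff" and the paths are placeholders; mktemp is assumed to be available):

    st=$(mktemp) || exit 1
    { ssh user@host "do-stuff"; echo $? > "$st"; } | borg create /path/to/repo::name -
    if [ "$(cat "$st")" -ne 0 ]; then
        borg delete /path/to/repo::name    # drop the useless archive
    fi
    rm -f "$st"

But note the archive still gets created first and has to be deleted afterwards, which is exactly the dance a "--skip-cmd-fail" option would avoid.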
Borg (serve) wrapping the command could detect the failure (and all the cases, with --skip-cmd-fail=x,y,z) and notify the calling Borg to not create the backup.

Hope it's clear.

-- 
Unix _IS_ user friendly, it's just selective about who its friends are.

From melkor.lord at gmail.com  Sat Oct  1 19:42:39 2016
From: melkor.lord at gmail.com (Melkor Lord)
Date: Sun, 2 Oct 2016 01:42:39 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To: <771d30a8-4a03-dc55-2909-233c3d843428@waldmann-edv.de>
References: <771d30a8-4a03-dc55-2909-233c3d843428@waldmann-edv.de>
Message-ID:

On Sat, Oct 1, 2016 at 8:43 PM, Thomas Waldmann wrote:

> In general, borg does not try to be a shell / scripting thing /
> scheduler / etc.
>
> Just using a shell, a scripting language, cron, etc. is usually better
> for this than reinventing it all inside borg.

I know, and I like Borg for that. In this case it is more about "wrapping a process and monitoring its correct execution" than about providing scripting abilities or anything else out of Borg's scope. That's why I don't think this is reinventing the wheel - just trying to have "snow tires" for the winter :-)

-- 
Unix _IS_ user friendly, it's just selective about who its friends are.

From sitaramc at gmail.com  Sat Oct  1 22:17:11 2016
From: sitaramc at gmail.com (Sitaram Chamarty)
Date: Sun, 2 Oct 2016 07:47:11 +0530
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To:
References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
Message-ID: <02b02db8-6469-fb85-78c6-5408a9defef4@gmail.com>

On 10/02/2016 05:08 AM, Melkor Lord wrote:
> Quite simple: if "do-stuff" fails for some reason, there's a lot of work
> to determine how to suppress the "failed" backup that Borg is going to
> make anyway. [...]
> Borg (serve) wrapping the command could detect the failure (and all the
> cases, with --skip-cmd-fail=x,y,z) and notify the calling Borg to not
> create the backup.

Speaking as a borg user (i.e., not a borg developer), it seems to me that the output of the command, which could be arbitrarily large, needs to be preserved somewhere, somehow, until the exit happens, and then -- if the exit is good -- create the archive. I'm not a borg dev, and I can't speak for them, but it seems to me that is not something borg should be doing.

If I needed something like this, I'd make a wrapper that calls the command needed and uses a TMPDIR to buffer its output, *then* calls borg depending on the exit.

Another way is to write a wrapper that creates the archive anyway, but deletes it if the pipe fails. Here's something I cooked up in a few minutes. Sure it's kludgey, but it gets the job done.

(As for the bash rant; to each his own. I would suggest that any environment that *requires* such complex features should be able to install the right tools, but that's just my opinion. In any case, it sounds weird that you can install borg but not bash.)

    #!/bin/bash

    repo="$1"
    archive="$2"    # cannot be '{now}' or similar; sorry!

    if [ "$3" = "abort" ]
    then
        # second invocation: the pipe failed, delete the bad archive
        borg delete "$repo::$archive"
    else
        cmd="$3"
        # if the command fails, schedule a delayed self-call to delete
        # the archive that "borg create" is going to make anyway
        ( eval "$cmd" || ( ( sleep 1; "$0" "$repo" "$archive" abort ) >&2 & ) ) \
            | borg create "$repo::$archive" -
    fi
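Usage would be something like this (sketch; assumes the script above was saved as pipe-backup.sh and made executable):

    ./pipe-backup.sh /path/to/repo test 'ssh user@host "do-stuff"'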
"borg serve" running on "host" would then notify "borg create" if the returncode is non-zero (or another returncode if supported via the adequate option) to NOT create the backup and optionally return the contents of stderr of the failed process to display or mail to the admin. I hope it's clearer now. > If I needed something like this, I'd make a wrapper that calls the > command needed and uses a TMPDIR to buffer it's output, *then* call borg > depending on the exit. > This approach is impossible in the case of "borg serve". > Another way is to write a wrapper that creates the archive anyway, but > if the pipe fails, deletes it. That would be necessary in some cases yes. I have some in mind like making backups of MSSQL servers. There's no way to create the SQL dump and output to stdout so, a script would first create the SQL backup to a file and if everything is fine, "cat" the file to stdout for "borg create" to capture. > (As for the bash rant; to each his own. I would suggest that any > environment that *requires* such complex features should be able to > install the right tools, but that's just my opinion. In any case, it > sounds weird that you can install borg but not bash). > Where did you see any rant? I use and love Bash everyday on interactive shells but ALL scripts I write are in pure POSIX shell and I have yet to find a case where Bash is required over a pure POSIX shell. Believe me, I wrote highly complex apps in pure Shell script for the past 2 decades. As of the weirdness you imply in your last sentence, please don't draw conclusions when you don't have all the details. There are numerous situations where you can't install software on servers. They can be "certified" servers where no installation is allowed. They can be proprietary or limited systems/appliances where you have minimal shell support via SSH. Ever tried to install Bash on a VMware ESXi host? In all those cases, we could not install Borg either (well, using some cxfreeze magic could make it possible though) but these are special cases. -- Unix _IS_ user friendly, it's just selective about who its friends are. -------------- next part -------------- An HTML attachment was scrubbed... URL: From public at enkore.de Sun Oct 2 04:03:15 2016 From: public at enkore.de (Marian Beermann) Date: Sun, 2 Oct 2016 10:03:15 +0200 Subject: [Borgbackup] [IDEA] Add support for external commands In-Reply-To: References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org> Message-ID: On 02.10.2016 01:38, Melkor Lord wrote: > On Sat, Oct 1, 2016 at 8:45 PM, Walker Traylor > > wrote: > > "PS: I know we currently can do something like : > ssh user at host "do-stuff" | borg create /path/to/repo::name > but this lacks the more thorough controls borg could achieve by > wrapping all the process." > > What more thorough controls do you require? > > > Quite simple : if "do-stuff" fails for some reason, there's a lot of > work to determine how to suppress the "failed" backup that Borg's going > to make anyway. > > ssh user at host "echo '' ; /bin/false" | borg create /path/to/repo::test - > > see? Now there's a "test" backup which is empty and serves no purpose at > all. Because of the pipe, I don't know that the ssh command failed. Borg > didn't fail but I have no way to distinguish "test" between "legit but > no new data" from "failed empty useless data" type of backup. > That's a good point. I could imagine something like borg create /path/to/repo::test --from-command "ssh user at host ..." 
> (As for the bash rant; to each his own. I would suggest that any
> environment that *requires* such complex features should be able to
> install the right tools, but that's just my opinion. In any case, it
> sounds weird that you can install borg but not bash.)

Where did you see any rant? I use and love Bash every day in interactive shells, but ALL the scripts I write are in pure POSIX shell, and I have yet to find a case where Bash is required over a pure POSIX shell. Believe me, I have written highly complex apps in pure shell script over the past two decades.

As for the weirdness you imply in your last sentence, please don't draw conclusions when you don't have all the details. There are numerous situations where you can't install software on servers. They can be "certified" servers where no installation is allowed. They can be proprietary or limited systems/appliances where you have only minimal shell support via SSH. Ever tried to install Bash on a VMware ESXi host? In all those cases we could not install Borg either (well, some cxfreeze magic could make it possible, though), but these are special cases.

-- 
Unix _IS_ user friendly, it's just selective about who its friends are.

From public at enkore.de  Sun Oct  2 04:03:15 2016
From: public at enkore.de (Marian Beermann)
Date: Sun, 2 Oct 2016 10:03:15 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To:
References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
Message-ID:

On 02.10.2016 01:38, Melkor Lord wrote:
>     ssh user@host "echo '' ; /bin/false" | borg create /path/to/repo::test -
>
> See? Now there's a "test" backup which is empty and serves no purpose at
> all. Because of the pipe, I don't know that the ssh command failed. [...]

That's a good point.

I could imagine something like

    borg create /path/to/repo::test --from-command "ssh user@host ..."

which would execute that command (in a local shell) and put stdout into an archive, while leaving stderr connected to stderr. If the command/shell exits with a nonzero status, a rollback is made and no backup is created.

Cheers, Marian

From public at enkore.de  Sun Oct  2 04:07:08 2016
From: public at enkore.de (Marian Beermann)
Date: Sun, 2 Oct 2016 10:07:08 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To:
References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
Message-ID: <5b275fc8-15fb-ae6e-2c38-5076169b5a37@enkore.de>

On 02.10.2016 10:03, Marian Beermann wrote:
> I could imagine something like
>
>     borg create /path/to/repo::test --from-command "ssh user@host ..."
> [...]

What I don't quite understand yet is where borg serve comes into play. If I do, e.g., "ssh someone@somewhere false", then the error code is propagated to the host and ssh itself exits with it.
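For example (illustrative session):

    $ ssh someone@somewhere /bin/false; echo $?
    1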
So no specific handling for SSH would be needed, unless I'm overlooking something here.

Cheers, Marian

From melkor.lord at gmail.com  Sun Oct  2 11:17:22 2016
From: melkor.lord at gmail.com (Melkor Lord)
Date: Sun, 2 Oct 2016 17:17:22 +0200
Subject: [Borgbackup] [IDEA] Add support for external commands
In-Reply-To: <5b275fc8-15fb-ae6e-2c38-5076169b5a37@enkore.de>
References: <1B5610A0-5306-41F0-A3AC-B118774A48CF@areyouthinking.org>
	<5b275fc8-15fb-ae6e-2c38-5076169b5a37@enkore.de>
Message-ID:

On Sun, Oct 2, 2016 at 10:07 AM, Marian Beermann wrote:

> What I don't quite understand yet is where borg serve comes into play.
> If I do, e.g., "ssh someone@somewhere false", then the error code is
> propagated to the host and ssh itself exits with it. So no specific
> handling for SSH would be needed, unless I'm overlooking something here.

Isn't it the job of "borg serve" to perform the backup itself on the remote host and just feed "borg create" (the caller) back with backed-up data ready for storage? Or do I not understand the purpose of "borg serve" clearly? This way the strain is put on the hosts executing "borg serve", thus allowing the backup server running "borg create" to perform parallel backups (on different repos, of course) without hogging the CPU. That's why I was thinking this is a job for "borg serve".

Another benefit: when "borg serve" executes the command but it "fails" (non-zero returncode, or --skip-cmd-fail=1,2,n...), it can "notify" the caller (borg create) to not create the backup in the first place, instead of creating one and then rolling back. I bet the rollback is quite I/O-costly on already big repos.

-- 
Unix _IS_ user friendly, it's just selective about who its friends are.

From sitaramc at gmail.com  Sat Oct  8 07:06:39 2016
From: sitaramc at gmail.com (Sitaram Chamarty)
Date: Sat, 8 Oct 2016 16:36:39 +0530
Subject: [Borgbackup] bypassing "Cache newer than repo" error
Message-ID:

Hi

Some of my directories are backed up to two (in one case three) different external (USB) hard disks. When I finish with the first USB drive, unmount it, mount the next one, and try the backup, borg tells me the cache is newer.

I could not find anything about how to bypass this in the docs. I have now created a complicated system of separately maintaining the cache directories for each external disk (labelled in some way that correlates with the physical disk in question) and manually shuffling them around.

Any pointers would be appreciated.

regards
sitaram

PS: Yes, removing ~/.cache/borg works and is generally harmless. But that makes ALL the files show up as "A ...", whereas I like to eyeball the list to see if something got updated which I did not expect to be updated, based on what I have been working on since the last backup.

From public at enkore.de  Sat Oct  8 07:26:46 2016
From: public at enkore.de (Marian Beermann)
Date: Sat, 8 Oct 2016 13:26:46 +0200
Subject: [Borgbackup] bypassing "Cache newer than repo" error
In-Reply-To:
References:
Message-ID:

Hi Sitaram

This sounds like you created one repository and copied it to multiple drives/locations? In that case this is to be expected - Borg distinguishes different repositories by their ID, which is independent of the location and would be the same for these copied repositories.

You can change the repository ID in the "config" file of the repository (it's hex; keep it the same length), which separates the repositories.
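For example (sketch; the id value is made up, GNU coreutils/sed are assumed, and the exact config layout may differ between borg versions):

    $ cat /path/to/repo/config
    [repository]
    version = 1
    id = 3c4090f4...          <- 64 hex digits; change some of them
    ...

    # or generate a fresh random ID of the same length:
    $ new=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')
    $ sed -i "s/^id = .*/id = $new/" /path/to/repo/config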
Note: for encrypted repositories it's a very unsafe thing to have multiple independently updated copies of a repository; if they diverge (minutely different contents) and an attacker gains access to more than one copy, the privacy of the repository contents may be compromised.

Cheers, Marian

FAQ: http://borgbackup.readthedocs.io/en/stable/faq.html#can-i-copy-or-synchronize-my-repo-to-another-location

From sitaramc at gmail.com  Sat Oct  8 08:17:22 2016
From: sitaramc at gmail.com (Sitaram Chamarty)
Date: Sat, 8 Oct 2016 17:47:22 +0530
Subject: [Borgbackup] bypassing "Cache newer than repo" error
In-Reply-To:
References:
Message-ID: <777b7ccd-cb62-45d6-1c99-1bb9efb12730@gmail.com>

Hi Marian

Thanks for your quick reply; much appreciated.

On 10/08/2016 04:56 PM, Marian Beermann wrote:
> This sounds like you created one repository and copied it to multiple
> drives/locations?

I was going to say "no way", but it appears that is what I did. Now I also understand why it's happening only with one specific repo (a rather large one I got lazy about creating the first time, and simply did a copy of). I'd clean forgotten!

> You can change the repository ID in the "config" file of the repository
> (it's hex; keep it the same length), which separates the repositories.

I assume it has no other semantics, so I can just randomly change some hex digits into others?

> Note: for encrypted repositories it's a very unsafe thing to have
> multiple independently updated copies of a repository; [...]

I do have multiple independently updated copies of a repo, but -- other than this one, where #2 was created by a filesystem-level copy of #1 -- all the others were "borg init"-ed independently on each disk.

The passphrase I use is the same, but I assume the internal key was randomly generated each time and would not be the same, so the attack you speak of should not happen for those repos even if someone got hold of them.

Is that understanding correct?

Thanks again and best regards
sitaram
From public at enkore.de  Sat Oct  8 08:43:39 2016
From: public at enkore.de (Marian Beermann)
Date: Sat, 8 Oct 2016 14:43:39 +0200
Subject: [Borgbackup] bypassing "Cache newer than repo" error
In-Reply-To: <777b7ccd-cb62-45d6-1c99-1bb9efb12730@gmail.com>
References: <777b7ccd-cb62-45d6-1c99-1bb9efb12730@gmail.com>
Message-ID:

Hi Sitaram

On 08.10.2016 14:17, Sitaram Chamarty wrote:
> I assume it has no other semantics, so I can just randomly change some
> hex digits into others?

Yes.

Caveat: if you use key-file mode, you'll have to make the same change in the key files you use. The default location is ~/.config/borg/keys/. Every key file starts with "BORG_KEY <repository-id>"; that's where you need to make the change.

In repokey or unencrypted mode this doesn't matter.
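For example (sketch; the key file name and id are made up, GNU sed is assumed, $new is the value generated above):

    $ head -n 1 ~/.config/borg/keys/myhost_myrepo
    BORG_KEY 3c4090f4...      <- same 64 hex digits as in the repo config

    # keep the key file in sync with the id written into the repo config:
    $ sed -i "1s/^BORG_KEY .*/BORG_KEY $new/" ~/.config/borg/keys/myhost_myrepo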
> I do have multiple independently updated copies of a repo, but -- other
> than this one, where #2 was created by a filesystem-level copy of #1 --
> all the others were "borg init"-ed independently on each disk.
> [...]
> Is that understanding correct?

Repositories independently "borg init"-ed are really independent; no problems there.

But in the case where you "borg init" and then copy (cp -r / rsync / etc.) the repo, the keys will also be the same. This possibly leads to the situation where different data is encrypted with the same key, which is highly problematic.

Cheers, Marian

From sitaramc at gmail.com  Sat Oct  8 09:58:47 2016
From: sitaramc at gmail.com (Sitaram Chamarty)
Date: Sat, 8 Oct 2016 19:28:47 +0530
Subject: [Borgbackup] bypassing "Cache newer than repo" error
In-Reply-To:
References: <777b7ccd-cb62-45d6-1c99-1bb9efb12730@gmail.com>
Message-ID: <3693aa79-c2ed-edf7-303b-2f84662b8039@gmail.com>

Hi Marian,

On 10/08/2016 06:13 PM, Marian Beermann wrote:
> Caveat: if you use key-file mode, you'll have to make the same change in
> the key files you use. [...]

Thanks. I don't use keyfile mode, but it's good to know.

> But in the case where you "borg init" and then copy (cp -r / rsync /
> etc.) the repo, the keys will also be the same. This possibly leads to
> the situation where different data is encrypted with the same key, which
> is highly problematic.

Thanks for confirming my understanding. (Yeah, that one repo is at risk; I'll delete the two copies and start new, separate repos on each disk.)

Thanks again!
regards
sitaram

From tw at waldmann-edv.de  Mon Oct 17 12:03:28 2016
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Mon, 17 Oct 2016 18:03:28 +0200
Subject: [Borgbackup] borgbackup 1.0.8rc1
Message-ID: <7dc7552e-879d-a119-8b08-a0fad9730bac@waldmann-edv.de>

Released borgbackup 1.0.8rc1 just now.

https://github.com/borgbackup/borg/releases/tag/1.0.8rc1
https://github.com/borgbackup/borg/blob/1.0.8rc1/docs/changes.rst#version-108rc1-2016-10-17

It would be helpful if you tested this in practice, so anything not discovered by the unit tests can be fixed. The final 1.0.8 release is scheduled for 2016-10-29, so be quick.

-- 
GPG ID: 9F88FB52FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393

From public at enkore.de  Thu Oct 20 06:15:09 2016
From: public at enkore.de (Marian Beermann)
Date: Thu, 20 Oct 2016 12:15:09 +0200
Subject: [Borgbackup] What don't you like about Borg?
Message-ID:

It's easy to lose track of what's important and what annoys people, or doesn't work for them.

We all know about the good stuff Borg does, where it shines. I want to know about the bad stuff: where it's annoying, where it doesn't work like one wanted it to...

Cheers, Marian

-

Things that annoy me:

- No good desktop GUI (I'm not a good designer; my own attempt kinda failed).
- Sometimes it's slow, and it's hard to tell why without knowing a lot about the internals.
- Error messages are often kinda obscure.
- When used on the command line, progress output is often missing (in the current beta), even with --progress.

From pschiffe at redhat.com  Thu Oct 20 06:40:00 2016
From: pschiffe at redhat.com (Peter Schiffer)
Date: Thu, 20 Oct 2016 12:40:00 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References:
Message-ID:

I'm currently moving from borg to burp because I'm missing centralized management - managing clients from the server (what to back up, when, etc.).
Burp enables this, and with burp-ui it's possible to manage multiple burp servers and all of their clients from a single web UI: https://git.ziirish.me/ziirish/burp-ui

In borg I also miss native support for various remote locations, like S3, samba, FTP...

And a desktop GUI as well. Even the simple Deja Dup is just fine for regular users.

peter

From heiko.helmle at horiba.com  Thu Oct 20 07:40:43 2016
From: heiko.helmle at horiba.com (heiko.helmle at horiba.com)
Date: Thu, 20 Oct 2016 13:40:43 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References:
Message-ID:

> Things that annoy me:
>
> - No good desktop GUI [...]
> - When used on the command line, progress output is often missing (in
>   the current beta), even with --progress

Well, I guess borg is best run by a cron job, so a progress indicator and/or GUI doesn't matter much to me.

On the basic feature side, I'm missing an indicator that checks whether a file was changed during the backup (like tar has).

And different storage backends would be very nice - ssh/borg serve is good, but sometimes all you have is FTP (and ftpfs is ugly...).

Best Regards
Heiko

From public at enkore.de  Thu Oct 20 07:57:10 2016
From: public at enkore.de (Marian Beermann)
Date: Thu, 20 Oct 2016 13:57:10 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References:
Message-ID: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de>

There is something for changed files; IIRC it has been improved since 1.0.x. It looks like this in 1.0.x:

    $ mkdir files
    $ touch files/1
    $ touch files/2
    $ borg create repo::a1 files --list --filter AME -v
    A files/1
    A files/2
    $ touch files/2
    $ echo "change" > files/2
    $ touch files/3
    $ borg create repo::a2 files --list --filter AME -v
    A files/2
    A files/3

(A, M => added/modified, E => error, U => unchanged; see http://borgbackup.readthedocs.io/en/stable/usage.html#item-flags for all flags, and http://borgbackup.readthedocs.io/en/stable/faq.html#a-status-oddity in the FAQ for why files/2 is listed as "A" rather than "M".)

Cheers, Marian
From heiko.helmle at horiba.com  Thu Oct 20 07:59:57 2016
From: heiko.helmle at horiba.com (heiko.helmle at horiba.com)
Date: Thu, 20 Oct 2016 13:59:57 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de>
References: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de>
Message-ID:

Marian Beermann wrote on 20.10.2016 13:57:10:
> There is something for changed files; IIRC it has been improved since
> 1.0.x. [...]

That's not what I meant. tar prints a warning if the file changed _during_ the backup.

If another process writes to a file during the backup, the backed-up file is probably corrupt. That's why tar gives a warning in that case.

From public at enkore.de  Thu Oct 20 08:04:47 2016
From: public at enkore.de (Marian Beermann)
Date: Thu, 20 Oct 2016 14:04:47 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de>
Message-ID: <2650d102-fc67-e6ca-553f-5c52f24a13a9@enkore.de>

On 20.10.2016 13:59, heiko.helmle at horiba.com wrote:
> That's not what I meant. tar prints a warning if the file changed
> _during_ the backup.
>
> If another process writes to a file during the backup, the backed-up
> file is probably corrupt. That's why tar gives a warning in that case.

That would be very useful indeed. I looked up how tar does it:

    if ((timespec_cmp (get_stat_ctime (&final_stat), original_ctime) != 0
         /* Original ctime will change if the file is a directory and
            --remove-files is given */
         && !(remove_files_option && is_dir))
        || original_size < final_stat.st_size)
      {
        WARNOPT (WARN_FILE_CHANGED, (0, 0,
                 _("%s: file changed as we read it"),
                 quotearg_colon (p)));
        set_exit_status (TAREXIT_DIFFERS);
      }

It stores the ctime before reading the file and compares it again after it is done with the file. This works on *nix, since the c(hange)time is always updated by the kernel and can't be set by user-space processes.

Cheers, Marian

From wtraylor at areyouthinking.org  Thu Oct 20 08:22:33 2016
From: wtraylor at areyouthinking.org (Walker Traylor)
Date: Thu, 20 Oct 2016 19:22:33 +0700
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References:
Message-ID: <4CC6B42F-B0CD-440A-8C0F-FE47B8BA2B80@areyouthinking.org>

- Documentation: the getting-started guide could use improvement. I wanted to learn how to set borg up as fast as possible and had to learn a lot more about internals and command switches than should be necessary on a first try, for a 1.0 release and a normal use case.

- Repaired-files notice: I would like "borg list" to show which archives have been "repaired" (zero'd out) by borg check --repair.

- Improved ability to cancel and resume. I have to repair files a lot for some reason, probably on archives which were interrupted, which is troubling.

Walker Traylor
walker at walkertraylor.com
m: +1.703.389.4507  skype: wtraylor  linkedin.com/in/walkertraylor

From gait at ATComputing.nl  Thu Oct 20 08:02:49 2016
From: gait at ATComputing.nl (Gerrit A. Smit)
Date: Thu, 20 Oct 2016 14:02:49 +0200
Subject: [Borgbackup] What don't you like about Borg?
In-Reply-To:
References: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de>
Message-ID: <11fabffc-7a72-9c63-a6e7-04d98b21f875@ATComputing.nl>

On 20-10-16 at 13:59, heiko.helmle at horiba.com wrote:
> If another process writes to a file during the backup, the backed-up
> file is probably corrupt. That's why tar gives a warning in that case.

That's why I like to dump snapshots (ZFS, BtrFS).

Gerrit

From nl at nachtgeist.net  Thu Oct 20 08:52:52 2016
From: nl at nachtgeist.net (Daniel Reichelt)
Date: Thu, 20 Oct 2016 14:52:52 +0200
Subject: [Borgbackup] Fwd: Re: What don't you like about Borg?
In-Reply-To:
References:
Message-ID: <27f1c625-d6c1-3640-e498-e586c14cfae6@nachtgeist.net>

(manual forward, since I previously replied only to Marian...)

-------- Forwarded Message --------
Subject: Re: [Borgbackup] What don't you like about Borg?
Date: Thu, 20 Oct 2016 12:55:19 +0200
From: Marian Beermann
To: Daniel Reichelt

That annoyed me as well, but I couldn't come up with a good solution. Your idea looks like a very good match to me.

Cheers, Marian

On 20.10.2016 12:38, Daniel Reichelt wrote:
>> We all know about the good stuff Borg does, where it shines. I want to
>> know about the bad stuff. [...]
>
> Nice thinking :-)
>
> My $0.02 about annoyances:
>
> I think the handling of restores is pretty cumbersome.
Cheers, Marian From wtraylor at areyouthinking.org Thu Oct 20 08:22:33 2016 From: wtraylor at areyouthinking.org (Walker Traylor) Date: Thu, 20 Oct 2016 19:22:33 +0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: <4CC6B42F-B0CD-440A-8C0F-FE47B8BA2B80@areyouthinking.org> -Documentation, getting started guide could use improvement. I wanted to learn how to set it up as fast as possible and had to learn a lot more about internals and command switches than should be necessary on a first try for a 1.0 release with normal use case. -Repair files notice. I would like to see in "borg list" the archives which have been ?repaired? by borg repair (zero?d out.) -Improved ability to cancel and resume. I have to repair files a lot for some reason, probably on archives which were interrupted, which is troubling. Walker Traylor walker at walkertraylor.com m: +1.703.389.4507 skype: wtraylor linkedin.com/in/walkertraylor > On Oct 20, 2016, at 5:15 PM, Marian Beermann wrote: > > It's easy to lose track of what's important and what annoys people, or > doesn't work for them. > > We all know about the good stuff Borg does, where it shines. I want to > know about the bad stuff. Where it's annoying, doesn't work like one > wanted to... > > > Cheers, Marian > > - > > Things that annoy me: > > - No good desktop GUI (I'm not a good designer, my own attempt kinda > failed). > - Sometimes it's slow and it's hard to tell why without knowing a lot > about internals > - Error messages are often kinda obscure > - When used on the command line progress output is often missing (in > current beta) even with --progress > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup -------------- next part -------------- An HTML attachment was scrubbed... URL: From gait at ATComputing.nl Thu Oct 20 08:02:49 2016 From: gait at ATComputing.nl (Gerrit A. Smit) Date: Thu, 20 Oct 2016 14:02:49 +0200 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: <772df8ae-f788-048e-8c1a-b6a278f5f27d@enkore.de> Message-ID: <11fabffc-7a72-9c63-a6e7-04d98b21f875@ATComputing.nl> Op 20-10-16 om 13:59 schreef heiko.helmle at horiba.com: > If another process writes to a file during backup the backupped file is probably corrupt. That's why tar gives a warning in that case. That's why I like to dump snapshots (ZFS, BtrFS). Gerrit From nl at nachtgeist.net Thu Oct 20 08:52:52 2016 From: nl at nachtgeist.net (Daniel Reichelt) Date: Thu, 20 Oct 2016 14:52:52 +0200 Subject: [Borgbackup] Fwd: Re: What don't you like about Borg? In-Reply-To: References: Message-ID: <27f1c625-d6c1-3640-e498-e586c14cfae6@nachtgeist.net> (manual fwrd since I previously replied only to Marian...) -------- Forwarded Message -------- Subject: Re: [Borgbackup] What don't you like about Borg? Date: Thu, 20 Oct 2016 12:55:19 +0200 From: Marian Beermann To: Daniel Reichelt That annoyed me as well, but I couldn't come up with a good solution. Your idea looks like a very good match to me. Cheers, Marian On 20.10.2016 12:38, Daniel Reichelt wrote: >> We all know about the good stuff Borg does, where it shines. I want to >> know about the bad stuff. Where it's annoying, doesn't work like one >> wanted to... > > Nice thinking :-) > > > My $0.02 about annoyances: > > I think the handling of restores is a pretty cumbersome. 
Most of the > time I know how old a version of a file/sub-tree I need to restore, but > most certainly I do not know the exact name of the archive that stuff is > stored in. > > Before I switched to borg, I used to restore from rdiff-backup's > "repository" (to stay with borg's terminology) with s.th. like > > rdiff-back -r 5d /path/to/repo/path/to/file > > which restored the subtree /path/to/file to CWD in the state is was > known to rdiff-backup 5 days ago. > > > Now with borg, I have to do a borg list /path/to/repo | grep > $someYear-$someMonth followed by mouse-selection, borg restore > /path/to/repo/$middleMouseClick and so on. > > Of course rdiff-backup and borg differ profoundly in the sense that > rdiff-backup sees a repository logically anchored to a fixed path which > is backed up whereas borg stores whatever was specified on the cmdline > to an archive. > > With that in mind it would really be nice to have something like > > borg restore --arch-prefix user-homes-host-0815 --as-of 5d /path/to/repo > /path/to/file-to-restore-1 /path2/to2/file-to-restore-2 > > which then would restore from repo the files file-to-restore-1 [and so > on] from the archives prefixed with user-homes-host-0815 and doing the > final selection of the archive to use as source by a time match, in this > case the latest archive that precedes the point in time [now - 5 days]. > > > What do you think? > > > Cheers > Daniel > > > > > >> >> >> Cheers, Marian >> >> - >> >> Things that annoy me: >> >> - No good desktop GUI (I'm not a good designer, my own attempt kinda >> failed). >> - Sometimes it's slow and it's hard to tell why without knowing a lot >> about internals >> - Error messages are often kinda obscure >> - When used on the command line progress output is often missing (in >> current beta) even with --progress >> >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup >> > From dsjstc at gmail.com Thu Oct 20 10:24:16 2016 From: dsjstc at gmail.com (DS Jstc) Date: Thu, 20 Oct 2016 07:24:16 -0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: <1ee06473-f530-1a6d-6245-f85cbc05097d@gmail.com> I'm a recent convert to borg from Crashplan. I'm really impressed with the simplicity and usability, particularly when paired with borgmatic (something like it should be made part of the core distribution). The only thing I'm really missing is a restore GUI. When all goes well, I don't interact with my backup system for months at a time. That means I'll have to re-learn the command syntax when the time comes to find and restore a file. Which is unfortunate -- when I'm restoring a file, I'm usually stressed and under time pressure. My ideal restore gui would have the following features: - very fast and simple pathname string filtering - filter with an OR list of several directories - filter with a date range - easy to show change history for file - easy to find other files sharing chunks with all prior versions of this file - drag to restore From dac at conceptual-analytics.com Thu Oct 20 10:39:33 2016 From: dac at conceptual-analytics.com (Dave Cottingham) Date: Thu, 20 Oct 2016 10:39:33 -0400 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: My use of borg has been limited to testing and playing with it, so my comments may be ill informed. But the reason I haven't used it for my backups is the lack of multiple backends. 
ssh is great in principle, but when it comes to buying online space for the backups, none of the more affordable options do ssh. Perhaps a documented backend API would enable the user community to help out with this. Thanks, Dave Cottingham On Thu, Oct 20, 2016 at 6:15 AM, Marian Beermann wrote: > It's easy to lose track of what's important and what annoys people, or > doesn't work for them. > > We all know about the good stuff Borg does, where it shines. I want to > know about the bad stuff. Where it's annoying, doesn't work like one > wanted to... > > > Cheers, Marian > > - > > Things that annoy me: > > - No good desktop GUI (I'm not a good designer, my own attempt kinda > failed). > - Sometimes it's slow and it's hard to tell why without knowing a lot > about internals > - Error messages are often kinda obscure > - When used on the command line progress output is often missing (in > current beta) even with --progress > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anarcat at debian.org Thu Oct 20 11:06:58 2016 From: anarcat at debian.org (Antoine =?utf-8?Q?Beaupr=C3=A9?=) Date: Thu, 20 Oct 2016 11:06:58 -0400 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: <87insnrre5.fsf@angela.anarc.at> On 2016-10-20 10:39:33, Dave Cottingham wrote: > My use of borg has been limited to testing and playing with it, so my > comments may be ill informed. > > But the reason I haven't used it for my backups is the lack of multiple > backends. ssh is great in principle, but when it comes to buying online > space for the backups, none of the more affordable options do ssh. > > Perhaps a documented backend API would enable the user community to help > out with this. A while back, I have tried to document the inner workings of Attic (later borg) in: http://borgbackup.readthedocs.io/en/latest/internals.html Arbitrary backend support is a hard problem, because the client-server architecture of borg is deeply coupled with the borg internals. I have looked at the RPC interface here: https://github.com/borgbackup/borg/issues/102#issuecomment-145749103 And it's obvious to me there is a lot of "intelligence" on the server-side, in fact, SSH is not merely a transport as much as a RPC conduit to allow borg to call itself on the remote end. See also: https://github.com/borgbackup/borg/issues/1070 This is also a frustrating blocker for me: there are very cheap backups providers out there that could be leveraged to provide virtually unlimited, secure backups to borg. Backblaze, for example, has ridiculous prices (50$/machine/year for unlimited backups, business use). In comparison, rsync.net, which supports borg, is 2.40$/GB/year, so you would get a measly 20GB a year with 50$... 
On top of this one, the things missing from borg are, for me: * a complete GUI (no restore, no desktop automation): https://github.com/borgbackup/borg/issues/314 * API stability commitment (there was a huge discussion about this, but it's still unclear if things will change under our feet, which makes it hard to commit to borg for larger deployments): https://github.com/borgbackup/borg/issues/26 * internationalization: https://github.com/borgbackup/borg/pull/305 * extensible snapshot support: https://github.com/borgbackup/borg/issues/983#issuecomment-222513148 * config files: https://github.com/borgbackup/borg/issues/315 See also this issue for a broader usability review: https://github.com/borgbackup/borg/issues/326 I recently did consulting for a community group here and couldn't honestly recommend using Borg because they would not be autonomous in restoring their backups, because they are not familiar with the command line. I am also worried about long-term stability for them and they needed low-cost offsite backups (that means not having to manage a server). Another example: in my previous job, config files, snapshot support and API stability would have been the issues. I still use borg for my personal use, but it would be great to push it forward to a greater public. I know this is a huge commitment and that brings a lot of support requests and further issues, but I believe the benefits are worth it. I wish I would have the feeling I could contribute to this within the borg project, but my efforts, so far, have been mostly met with refusal. A. -- Les plus beaux chants sont les chants de revendications Le vers doit faire l'amour dans la t?te des populations. ? l'?cole de la po?sie, on n'apprend pas: on se bat! - L?o Ferr?, "Pr?face" From public at enkore.de Thu Oct 20 11:51:59 2016 From: public at enkore.de (Marian Beermann) Date: Thu, 20 Oct 2016 17:51:59 +0200 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <87insnrre5.fsf@angela.anarc.at> References: <87insnrre5.fsf@angela.anarc.at> Message-ID: <9593548c-05c9-4337-f222-38a1bf91434a@enkore.de> On 20.10.2016 17:06, Antoine Beaupr? wrote: > On 2016-10-20 10:39:33, Dave Cottingham wrote: >> My use of borg has been limited to testing and playing with it, so my >> comments may be ill informed. >> >> But the reason I haven't used it for my backups is the lack of multiple >> backends. ssh is great in principle, but when it comes to buying online >> space for the backups, none of the more affordable options do ssh. >> >> Perhaps a documented backend API would enable the user community to help >> out with this. > > A while back, I have tried to document the inner workings of Attic > (later borg) in: > > http://borgbackup.readthedocs.io/en/latest/internals.html > > Arbitrary backend support is a hard problem, because the client-server > architecture of borg is deeply coupled with the borg internals. I have > looked at the RPC interface here: > > https://github.com/borgbackup/borg/issues/102#issuecomment-145749103 > > And it's obvious to me there is a lot of "intelligence" on the > server-side, in fact, SSH is not merely a transport as much as a RPC > conduit to allow borg to call itself on the remote end. I'm not sure if I ever wrote it on GitHub; personally I think the Repository API is a relatively good API to implement other backends, when an "named object API" is added, which would be used for special cases like repokey and manifest storage, since these would usually require special handling by backends. 
Then something like pluggy could be used to provide a simple plug-and-play mechanism for repository backends with named (versioned) APIs. These should still be very stable. Cheers, Marian > See also: > > https://github.com/borgbackup/borg/issues/1070 > > This is also a frustrating blocker for me: there are very cheap backup > providers out there that could be leveraged to provide virtually > unlimited, secure backups to borg. Backblaze, for example, has > ridiculously low prices ($50/machine/year for unlimited backups, business > use). In comparison, rsync.net, which supports borg, is $2.40/GB/year, > so you would get a measly 20GB a year with $50... rsync.net's price is $0.36/GB/year: http://www.rsync.net/products/attic.html It's still not the cheapest option, but personally I'm satisfied with their service. TN: I'm not paid / I don't receive any benefits for talking about rsync.net Cheers, Marian From anarcat at debian.org Thu Oct 20 12:14:53 2016 From: anarcat at debian.org (Antoine Beaupré) Date: Thu, 20 Oct 2016 12:14:53 -0400 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <9593548c-05c9-4337-f222-38a1bf91434a@enkore.de> References: <87insnrre5.fsf@angela.anarc.at> <9593548c-05c9-4337-f222-38a1bf91434a@enkore.de> Message-ID: <87funrro8y.fsf@angela.anarc.at> On 2016-10-20 11:51:59, Marian Beermann wrote: >> And it's obvious to me there is a lot of "intelligence" on the >> server side; in fact, SSH is not so much a transport as an RPC >> conduit that allows borg to call itself on the remote end. > > I'm not sure if I ever wrote it on GitHub; personally I think the > Repository API is a relatively good API for implementing other backends, > once a "named object API" is added, which would be used for special > cases like repokey and manifest storage, since these usually > require special handling by backends. Yeah, I was wondering if there was a way to separate the storage API from the RemoteRepository API... but I am not sure it would work. > Then something like pluggy could be used to provide a simple > plug-and-play mechanism for repository backends with named (versioned) APIs. > > These should still be very stable. I am not sure I would bother with a third-party plugin module... Just class derivation and module discovery should be sufficient. >> See also: >> >> https://github.com/borgbackup/borg/issues/1070 >> >> This is also a frustrating blocker for me: there are very cheap backup >> providers out there that could be leveraged to provide virtually >> unlimited, secure backups to borg. Backblaze, for example, has >> ridiculously low prices ($50/machine/year for unlimited backups, business >> use). In comparison, rsync.net, which supports borg, is $2.40/GB/year, >> so you would get a measly 20GB a year with $50... > > rsync.net's price is $0.36/GB/year: > http://www.rsync.net/products/attic.html > > It's still not the cheapest option, but personally I'm satisfied with > their service. That's pretty cool, I didn't know that! I was just looking at: http://rsync.net/pricing.html > TN: I'm not paid / I don't receive any benefits for talking about rsync.net The same goes for me and other providers. A. -- Premature optimization is the root of all evil - Donald Knuth From aperucchi at jahia.com Thu Oct 20 12:33:41 2016 From: aperucchi at jahia.com (Alessandro Perucchi) Date: Thu, 20 Oct 2016 18:33:41 +0200 Subject: [Borgbackup] What don't you like about Borg? Message-ID: Hello, for me that's simple: handling of locks.
When I am doing a backup, I should still be able to do read-only operations. At least on other backups, not on the one currently being made. Or simply get the list of backups already done. Or the list of files in a backup... even if a backup/restore is currently running. What would be great is a finer-grained lock, so we could do a restore at the same time as a backup is running. At the moment, every time I do a backup, I check it afterwards... and sometimes the check takes a long, long time... and if I need to do something, then I need to kill the process. :-/ Kind regards, Alessandro From lists.borg at pjw.xsmail.com Thu Oct 20 19:25:20 2016 From: lists.borg at pjw.xsmail.com (lists.borg at pjw.xsmail.com) Date: Thu, 20 Oct 2016 17:25:20 -0600 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <9593548c-05c9-4337-f222-38a1bf91434a@enkore.de> References: <87insnrre5.fsf@angela.anarc.at> <9593548c-05c9-4337-f222-38a1bf91434a@enkore.de> Message-ID: <1477005920.1826288.762551913.7F26BAFC@webmail.messagingengine.com> On Thu, Oct 20, 2016, at 09:51 AM, Marian Beermann wrote: > On 2016-10-20 10:39:33, Dave Cottingham wrote: > > This is also a frustrating blocker for me: there are very cheap backup > > providers out there that could be leveraged to provide virtually > > unlimited, secure backups to borg. Backblaze, for example, has > > ridiculously low prices ($50/machine/year for unlimited backups, business > > use). In comparison, rsync.net, which supports borg, is $2.40/GB/year, > > so you would get a measly 20GB a year with $50... > > rsync.net's price is $0.36/GB/year: > http://www.rsync.net/products/attic.html That's 3 cents per GB/mo. 50GB costs me $18/yr. Do be aware rsync.net defaults to borg v0.29. For current borg (1.0.7) use --remote-path /usr/local/bin/borg1/borg1 or set BORG_REMOTE_PATH to the same. From tmhikaru at gmail.com Fri Oct 21 00:49:18 2016 From: tmhikaru at gmail.com (tmhikaru at gmail.com) Date: Thu, 20 Oct 2016 21:49:18 -0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: <20161021044918.GA5752@raspberrypi> The lack of centralized management of borg was a big problem for me as well, though I tried working around it in two different ways. The first was to have the server that ran the backup script log in to the remote client as root with an ssh key and run borg there to connect back to the server. While this allowed me to centrally script the backup of all machines remotely, it still relied on the remote client's CPU and memory to do the majority of the work, and required that the client machine have borg installed. Of my machines, my server is many times stronger than the clients it is backing up, and this is unfortunately exactly backwards from the way borg seems to be intended to be used. The other way I attempted to work around it was to use sshfs mounts on the server to access all of the filesystems on the remote machines. This worked nearly perfectly, except that I had to manually blacklist mountpoints on the remote machines such as /proc/. One major disadvantage is that sshfs did not, and may still not, support xattrs, so for instance this method cannot be used to back up an SELinux system properly.
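A sketch of this pull-over-sshfs approach (hostnames and paths hypothetical; as noted, xattrs will not survive the sshfs hop):

    # mount the client read-only, back it up with the server's borg, unmount
    mkdir -p /mnt/client
    sshfs -o ro root@client:/ /mnt/client
    borg create --exclude /mnt/client/proc --exclude /mnt/client/sys \
        /srv/backups/client::$(date +%Y-%m-%d) /mnt/client
    fusermount -u /mnt/client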
This had the huge advantage of using the CPU and RAM of the server that stored the backup to compress the data being read from the client side, and it also didn't require the client to have borg installed - both of the machines I used had no distro support for borg, and one of them didn't even have Python 3. Although sshfs adds a bit of overhead, the result was a 2-3x faster backup of the same data from one machine, and another machine that was having trouble running the borg client at all was able to use this method to back up its data flawlessly. Yet another plus in its favor is that it did not require the client to be able to log in to the server's storage; it only required that the client allow a keyed ssh login as root, which let the server mount its root via sshfs. Unfortunately, although this second approach worked MUCH better, the showstopper for me was realizing that the machine that was having trouble running the client at all would likely be unable to restore files from a backup made to the server. Even though I was able to do the backups using sshfs on the server to get past the problem of the borg client hanging forever, this would not help in the case where I'd want to connect the backup drive directly to the weak machine and restore files. Its hardware is simply incapable of running borg reliably. And that's the other elephant in the room - borg requires too much memory and CPU on the client end compared to the other backup solutions I have used. In the past I used rsync to do incremental backups of my multiple machines. Since my time with borg, I've returned to rsync. I really liked borg's deduplication, as it saved quite a lot of space with multiple machines sharing a single repository. Given this experience, I investigated and have since been using a hardlink program that handles things rsync does not in incremental backups, such as file renames, and that also finds duplicates across backups from separate machines. Rsync is an imperfect solution - for example, it deliberately does not support system-level SELinux xattrs, and fairly annoying xattr bugs crop up from time to time - but aside from that it works across all of my machines and is supported in each distro. Compared to borg, rsync requires nearly no CPU and memory, and can be used from a central server to orchestrate backups with just an ssh key login to the remote client. Restoring files from an rsynced backup is often literally as simple as mounting the backup drive and running cp -a... All in all, I simply do not meet the requirements of using borg and gave up trying to find a way. Tim McGrath On Thu, Oct 20, 2016 at 12:40:00PM +0200, Peter Schiffer wrote: > I'm currently moving from borg to burp because I'm missing centralized > management - managing clients from the server (what to backup, when, etc). Burp > enables this, and with burp-ui it's possible to manage multiple burp servers > and all of their clients from a single web ui: > https://git.ziirish.me/ziirish/burp-ui > > In borg I also miss native support of various remote locations, like S3, > samba, ftp.. > > And a desktop GUI as well. Even simple deja dup is just fine for regular > users.. > > peter > -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 465 bytes Desc: Digital signature URL: From jungleboogie0 at gmail.com Fri Oct 21 11:44:54 2016 From: jungleboogie0 at gmail.com (jungle Boogie) Date: Fri, 21 Oct 2016 08:44:54 -0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <87insnrre5.fsf@angela.anarc.at> References: <87insnrre5.fsf@angela.anarc.at> Message-ID: On 20 October 2016 at 08:06, Antoine Beaupré wrote: > I recently did consulting for a community group here and couldn't > honestly recommend Borg: they are not familiar with the command line, > so they would not be autonomous in restoring their backups. I am also > worried about long-term stability for them, and they needed low-cost > offsite backups (which means not having to manage a server). Another > example: in my previous job, config files, snapshot support and API > stability would have been the issues. > > I still use borg personally, but it would be great to push it > forward to a greater public. I know this is a huge commitment that > brings a lot of support requests and further issues, but I believe the > benefits are worth it. I wish I felt I could contribute to this within > the borg project, but my efforts, so far, have been mostly met with > refusal. As far as I'm concerned, the greater public now means people who do their computing on tablets and smartphones. Borg would never work on those devices. I don't think borg is to blame for the lack of integration with services like dropbox, jungle disk, amazon s3, backblaze, of which the greater public is likely only familiar with dropbox. Why would those companies want someone else's code running on their infrastructure? It's nice that rsync.net offers a service with borg, but what version is it? How fast will they update it to get the latest features and important bug fixes? I had jungledisk for about three years and the client was never once updated; I think towards the end of my service with them they were updating their website to support stronger TLS ciphers and then were going to roll out a new client. The greater public doesn't care about that. If we're assuming the greater public are folks who use computers and laptops, you're right, they probably won't want command-line stuff and setting things up with cronjobs. But again, they don't care what the app is as long as it helps them feel their backups are safe. They likely won't care about encrypted backups and strong ciphers. They'll likely choose a backup plan that has the best prices on Black Friday, and they may not even renew the following year. That said, I look forward to the improvements that come to borg! -- ------- inum: 883510009027723 sip: jungleboogie at sip2sip.info From jungleboogie0 at gmail.com Fri Oct 21 11:57:22 2016 From: jungleboogie0 at gmail.com (jungle Boogie) Date: Fri, 21 Oct 2016 08:57:22 -0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: On 20 October 2016 at 03:15, Marian Beermann wrote: > Things that annoy me: > > - No good desktop GUI (I'm not a good designer, my own attempt kinda > failed). > - Sometimes it's slow and it's hard to tell why without knowing a lot > about internals > - Error messages are often kinda obscure > - When used on the command line progress output is often missing (in > current beta) even with --progress I think a desktop UI is more practical than a web UI.
If borg were to have a web UI and some kind of centralized management, I wouldn't want it without TLS, especially for public internet usage. So that means a self-signed cert or bundling with letsencrypt. This means the web UI would be running as root to access those lower ports. Some people may be fine with borg running as root. Can borg output its updates to files? I have it set up as a cron job, and the output sent to me via email is a little hard to read. What does this mean: U /var/unbound/unbound.conf I haven't updated that file in a while, but I have it backed up all the time. -- ------- inum: 883510009027723 sip: jungleboogie at sip2sip.info From public at enkore.de Fri Oct 21 12:10:47 2016 From: public at enkore.de (Marian Beermann) Date: Fri, 21 Oct 2016 18:10:47 +0200 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: Message-ID: On 21.10.2016 17:57, jungle Boogie wrote: > On 20 October 2016 at 03:15, Marian Beermann wrote: >> Things that annoy me: >> >> - No good desktop GUI (I'm not a good designer, my own attempt kinda >> failed). >> - Sometimes it's slow and it's hard to tell why without knowing a lot >> about internals >> - Error messages are often kinda obscure >> - When used on the command line progress output is often missing (in >> current beta) even with --progress > > I think a desktop UI is more practical than a web UI. If borg were to > have a web UI and some kind of centralized management, I wouldn't want > it without TLS, especially for public internet usage. So that means a > self-signed cert or bundling with letsencrypt. This means the web UI > would be running as root to access those lower ports. Some people may > be fine with borg running as root. > > Can borg output its updates to files? I have it set up as a cron job, > and the output sent to me via email is a little hard to read. What > does this mean: > U /var/unbound/unbound.conf > > I haven't updated that file in a while, but I have it backed up all the time. > You can use --filter to filter out Unchanged files. Since Borg accepts external logging configurations (BORG_LOGGING_CONF, https://docs.python.org/3/library/logging.config.html#configuration-file-format ) it's relatively easy to separate its output. In 1.1.x, things like file listings go to a different, named logger than normal output does, for example. This can be used to largely avoid monkeypatching for a GUI. Cheers, Marian
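For reference, a minimal sketch of such a logging configuration (the path and log file name are hypothetical; the format is Python's standard fileConfig, per the link above):

    # contents of ~/.config/borg/logging.conf
    [loggers]
    keys=root
    [handlers]
    keys=file
    [formatters]
    keys=plain
    [logger_root]
    level=INFO
    handlers=file
    [handler_file]
    class=FileHandler
    args=('/var/log/borg.log',)
    formatter=plain
    [formatter_plain]
    format=%(asctime)s %(levelname)s %(name)s %(message)s

and then, in the cron job's environment:

    export BORG_LOGGING_CONF=~/.config/borg/logging.conf

With that in place, borg's output lands in the log file instead of the mailed cron output.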
From public at enkore.de Fri Oct 21 12:12:55 2016 From: public at enkore.de (Marian Beermann) Date: Fri, 21 Oct 2016 18:12:55 +0200 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: <87insnrre5.fsf@angela.anarc.at> Message-ID: <25ad8422-919a-f5f4-bf42-1f012f5a1c35@enkore.de> rsync.net: "borg" still refers to 0.xx, but "borg1" is the latest 1.0.x release. I'm not sure how fast exactly they updated it to 1.0.7, but I think less than a week. AFAIK they don't use the binaries Thomas makes, but compile their own (the file size differs from what's released). BORG_REMOTE_PATH=borg1 and it works. > As far as I'm concerned, the greater public now means people who do > their computing on tablets and smartphones. Borg would never work on > those devices. I think textshell managed to run it on Android and even back up the root FS, but I somewhat doubt the viability of a working restore. It might work for documents and stuff like that, but probably not for a system backup. In either case it's cumbersome to use compared to an Android app (which I feel are also cumbersome to use, but that's another matter). Cheers, Marian From jungleboogie0 at gmail.com Wed Oct 26 19:51:26 2016 From: jungleboogie0 at gmail.com (jungle Boogie) Date: Wed, 26 Oct 2016 16:51:26 -0700 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <25ad8422-919a-f5f4-bf42-1f012f5a1c35@enkore.de> References: <87insnrre5.fsf@angela.anarc.at> <25ad8422-919a-f5f4-bf42-1f012f5a1c35@enkore.de> Message-ID: On 21 October 2016 at 09:12, Marian Beermann wrote: > rsync.net: "borg" still refers to 0.xx, but "borg1" is the latest 1.0.x > release. I'm not sure how fast exactly they updated it to 1.0.7, but I > think less than a week. AFAIK they don't use the binaries Thomas makes, > but compile their own (the file size differs from what's released). > > BORG_REMOTE_PATH=borg1 and it works. > That's good to hear about rsync. That's worth a shot with them! What does that variable do? > >> As far as I'm concerned, the greater public now means people who do >> their computing on tablets and smartphones. Borg would never work on >> those devices. > > I think textshell managed to run it on Android and even back up the root > FS, but I somewhat doubt the viability of a working restore. > It might work for documents and stuff like that, but probably not for a > system backup. In either case it's cumbersome to use compared to an > Android app (which I feel are also cumbersome to use, but that's another > matter). > So borg can work on those devices. I shouldn't have said never (sorry for the double negative ;)) > Cheers, Marian -- ------- inum: 883510009027723 sip: jungleboogie at sip2sip.info From public at enkore.de Thu Oct 27 12:38:53 2016 From: public at enkore.de (Marian Beermann) Date: Thu, 27 Oct 2016 18:38:53 +0200 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: References: <87insnrre5.fsf@angela.anarc.at> <25ad8422-919a-f5f4-bf42-1f012f5a1c35@enkore.de> Message-ID: <26bb7618-6e2e-16ab-50aa-22fcae66ede7@enkore.de> On 27.10.2016 01:57, jungle Boogie wrote: > On 26 October 2016 at 16:39, jungle Boogie > wrote: >>> You can use --filter to filter out Unchanged files. >>> >> >> Can you give an example of that? I don't know what STATUSCHARS I need >> to list files that have recently changed. > > > I think I got this working with --filter=AME > > What does AME represent and where can I see that? > Sorry, forgot to add a link: http://borgbackup.readthedocs.io/en/stable/usage.html#item-flags > That's good to hear about rsync. That's worth a shot with them! > What does that variable do? It has Borg use the 1.x release installed on rsync.net's servers; the command "borg" refers to 0.xx on their servers. Cheers, Marian
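Putting those pieces together, a cron-friendly invocation (repo path hypothetical) that lists only the interesting items and drops the U lines:

    borg create --list --filter=AME /path/to/repo::$(date +%Y-%m-%d) /home /etc

Per the item-flags page linked above, A marks an added file, M a modified one, E an error, and U an unchanged file.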
From gmatht at gmail.com Fri Oct 28 04:30:17 2016 From: gmatht at gmail.com (John McCabe-Dansted) Date: Fri, 28 Oct 2016 16:30:17 +0800 Subject: [Borgbackup] What don't you like about Borg? In-Reply-To: <26bb7618-6e2e-16ab-50aa-22fcae66ede7@enkore.de> References: <87insnrre5.fsf@angela.anarc.at> <25ad8422-919a-f5f4-bf42-1f012f5a1c35@enkore.de> <26bb7618-6e2e-16ab-50aa-22fcae66ede7@enkore.de> Message-ID: Personal Peeves: 1. I often end up with stale lock files. 2. Not well suited to multiple clients backing up to a single deduplicated repository. 3. No Partclone integration for storing sparse disk images. I think (3) is more the responsibility of Partclone, since if Partclone had a way of dd'ing images or mounting partclone backups, it would be trivial for Borg to support it. On 28 October 2016 at 00:38, Marian Beermann wrote: > On 27.10.2016 01:57, jungle Boogie wrote: > > On 26 October 2016 at 16:39, jungle Boogie > wrote: > >>> You can use --filter to filter out Unchanged files. > >>> > >> > >> Can you give an example of that? I don't know what STATUSCHARS I need > >> to list files that have recently changed. > > > > > > I think I got this working with --filter=AME > > > > What does AME represent and where can I see that? > > > > Sorry, forgot to add a link: > http://borgbackup.readthedocs.io/en/stable/usage.html#item-flags > > > That's good to hear about rsync. That's worth a shot with them! > > What does that variable do? > > It has Borg use the 1.x release installed on rsync.net's servers; the > command "borg" refers to 0.xx on their servers. > > Cheers, Marian > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > -- John C. McCabe-Dansted -------------- next part -------------- An HTML attachment was scrubbed... URL: From tw at waldmann-edv.de Sat Oct 29 07:37:02 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sat, 29 Oct 2016 13:37:02 +0200 Subject: [Borgbackup] borgbackup 1.0.8 released Message-ID: https://github.com/borgbackup/borg/releases/tag/1.0.8 Bug fixes, please upgrade. More details: see URL above. Cheers, Thomas -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From giovanni at panozzo.it Sun Oct 30 14:27:17 2016 From: giovanni at panozzo.it (Giovanni Panozzo) Date: Sun, 30 Oct 2016 19:27:17 +0100 Subject: [Borgbackup] Chunk size experiments Message-ID: Hello to all, I'm new to this ML and to borg. I started using it on small servers; I liked it, and now I would like to use it on bigger fileservers too. The problem: I'm running a FreeNAS (FreeBSD 10) 5.2 TB fileserver, and I'm testing borg on it. I started with only a subdirectory of 2.1 TB of 500868 files, with --chunker-params 19,23,21,4095 and here are the results: - Time taken: 20h 44mins 52sec - Data usage after dedup: 1.76 TB - Chunks: 1193475 Time is OK, and deduplication is also very good. But chunks... a million chunks is twice the number of files. Is that OK? It took half an hour to delete a repository with 119347 files. I know, deleting a repo is not a common operation, so it may not be very important. I tried to raise some chunker-params values, but it seems that 23 is the maximum value I can have as CHUNK_MAX_EXP... Any suggestions? Thank you. Borg backup is a great tool! From public at enkore.de Sun Oct 30 15:55:27 2016 From: public at enkore.de (Marian Beermann) Date: Sun, 30 Oct 2016 20:55:27 +0100 Subject: [Borgbackup] Chunk size experiments In-Reply-To: References: Message-ID: <340e5620-bfea-5df8-932c-6ca10cdeef3e@enkore.de> On 30.10.2016 19:27, Giovanni Panozzo wrote: > Hello to all, I'm new to this ML and to borg. Hi :) > I started using it on small servers; I liked it, and now I would like to > use it on bigger fileservers too. > > The problem: > I'm running a FreeNAS (FreeBSD 10) 5.2 TB fileserver, and I'm testing > borg on it.
> > I started with only a subdirectory of 2.1 TB of 500868 files, > with --chunker-params 19,23,21,4095 > and here are the results: > > - Time taken: 20h 44mins 52sec > - Data usage after dedup: 1.76 TB > - Chunks: 1193475 > > Time is OK, and deduplication is also very good. > > But chunks... a million chunks is twice the number of files. > Is that OK? With 19,23,21 the target average chunk size is 2^21 bytes = 2 MB. 2.1 TB / 500k files is ~4.2 MB, so that's totally fine. > It took half an hour to delete a repository with 119347 > files. I know, deleting a repo is not a common operation, so it may > not be very important. Thanks for that information. I looked into it and it's a defect in the Python standard library. For now I created a Borg ticket: https://github.com/borgbackup/borg/issues/1776 > I tried to raise some chunker-params values, but it seems that 23 is the > maximum value I can have as CHUNK_MAX_EXP... Correct. This is a conscious limitation, because it makes it easier to know a hard upper bound on the chunk sizes that can appear. Larger chunk sizes would make very little difference, if any. > Any suggestions? A change we're making for 1.1, which also mostly works OK with 1.0, is to increase the segment size in the repository config from 5 MB to 500 MB. Things like prune/delete will go slower in 1.0, though. But writing should be a fair amount faster, if it's not CPU-limited (but your numbers above look like it is). Cheers, Marian From giovanni at panozzo.it Sun Oct 30 16:07:40 2016 From: giovanni at panozzo.it (Giovanni Panozzo) Date: Sun, 30 Oct 2016 21:07:40 +0100 Subject: [Borgbackup] Chunk size experiments In-Reply-To: <340e5620-bfea-5df8-932c-6ca10cdeef3e@enkore.de> References: <340e5620-bfea-5df8-932c-6ca10cdeef3e@enkore.de> Message-ID: > > With 19,23,21 the target average chunk size is 2^21 bytes = 2 MB. 2.1 TB > / 500k files is ~4.2 MB, so that's totally fine. Thank you. > >> It took half an hour to delete a repository with 119347 >> files. I know, deleting a repo is not a common operation, so it may >> not be very important. My fault: I forgot one digit during copy&paste :( they are 1193475 I think the slowness could also be a filesystem issue; I'm using ZFS, and a million files to delete can be a big deal. But feel free to improve it on the Python side, and, as we agree, deleting a repository is not a common operation. > A change we're making for 1.1, which also mostly works OK with 1.0, is > to increase the segment size in the repository config from 5 MB to 500 > MB. Things like prune/delete will go slower in 1.0, though. But writing > should be a fair amount faster, if it's not CPU-limited (but your > numbers above look like it is). Thank you. I will experiment with 1.1 when available :) From archi.laurent at gmail.com Fri Nov 11 05:18:28 2016 From: archi.laurent at gmail.com (Laurent Archi) Date: Fri, 11 Nov 2016 11:18:28 +0100 Subject: [Borgbackup] Borg return status (RC in mode full) Message-ID: Hi, In the documentation (official site), borg with the option "--show-rc" returns 0, 1 or 2... and also the PID. OK so far. But when I exec the command from a Perl script, the execution returns 512, which seems to correspond to rc = 2. To me, 512 means "file already exists", so my question is: is there a full list of return codes for more precision? (sorry for my english) Best regards and thanks -- Laurent Archambault Under Linux
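A note on the 512: in Perl, system() and $? give the raw wait status, with the child's exit code in the high byte, so borg's rc 2 arrives as 2 << 8 = 512. It is not a separate "file already exists" code. A quick check in shell arithmetic:

    echo $(( 512 >> 8 ))   # prints 2, borg's documented error rc

Shift right by 8 (or divide by 256) to recover the rc values the documentation lists.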
-------------- next part -------------- An HTML attachment was scrubbed... URL: From nospam at kota.moe Wed Nov 16 07:57:45 2016 From: nospam at kota.moe (小太) Date: Wed, 16 Nov 2016 23:57:45 +1100 Subject: [Borgbackup] Backup over sneakernet? Message-ID: Hello Suppose I have the following situation: - 500 GB of (incompressible, unique) data to back up to an untrusted offsite location - Slow (~1 Mb/s) and data-capped upload speed Clearly uploading the backup over the internet will be too slow. The typical solution to this problem is to copy everything to a hard drive, mail it to the offsite location and have them plug it in to the server, where you can then copy off it - AKA the sneakernet. So far, this works fine with Borg - create a new repo on the hard drive, back up everything there and just mail that. The offsite location never gets to see the unencrypted data either. But now say I've generated another 100 GB of incompressible and unique data, and now want to update the remote repo with this backup - but it's still too large to upload over the internet. One possible solution would be to just update the repo on the hard drive, and once it's offsite, copy it across. But there's a potential point of failure here - if the repo on the hard drive gets corrupted in the meantime, copying it will also corrupt the offsite copy. Are there any solutions to this problem that Borg supports? -------------- next part -------------- An HTML attachment was scrubbed... URL: From public at enkore.de Wed Nov 16 09:08:42 2016 From: public at enkore.de (Marian Beermann) Date: Wed, 16 Nov 2016 15:08:42 +0100 Subject: [Borgbackup] Backup over sneakernet? In-Reply-To: References: Message-ID: <3050935a-5166-caf8-ca96-0241646343d0@enkore.de> Hi 小太, a viable stopgap would be to enable append-only mode and use some synchronization tool on both ends that only copies new segments to / from the disk. Then a "borg check" can be used after the process to verify that the repository arrived intact. Note that, generally speaking, even in non-append-only mode Borg only creates or deletes files in data/, never modifies them. There is still a corruption vector in that if a file in data/ is deleted, some of its data can be copied to a new file. If that new file were to become corrupted, it could damage an existing archive. Replicating archives across multiple repositories with full cryptographic integrity has been requested a few times but is not yet implemented. In your case it wouldn't really work, though (perhaps if the disk always carries a full repository?). Cheers, Marian
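A sketch of that stopgap (paths hypothetical; append-only can be enabled by adding append_only = 1 to the repository's config file, or by running the server side with "borg serve --append-only"):

    # copy only segment files that don't exist yet on the target, then verify
    rsync -av --ignore-existing /mnt/courier-disk/repo/ /srv/offsite/repo/
    borg check /srv/offsite/repo

Since an append-only repository never rewrites existing segment files, a copy restricted to new files should not drag corruption of old data along with it, and the final check catches anything damaged in transit.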
On 16.11.2016 13:57, 小太 wrote: > Hello > > Suppose I have the following situation: > - 500 GB of (incompressible, unique) data to back up to an untrusted offsite > location > - Slow (~1 Mb/s) and data-capped upload speed > > Clearly uploading the backup over the internet will be too slow. > The typical solution to this problem is to copy everything to a hard > drive, mail it to the offsite location and have them plug it in to the > server, where you can then copy off it - AKA the sneakernet. > > So far, this works fine with Borg - create a new repo on the hard drive, > back up everything there and just mail that. The offsite location never > gets to see the unencrypted data either. > > But now say I've generated another 100 GB of incompressible and unique > data, and now want to update the remote repo with this backup - but it's > still too large to upload over the internet. > > One possible solution would be to just update the repo on the hard > drive, and once it's offsite, copy it across. But there's a potential > point of failure here - if the repo on the hard drive gets corrupted in > the meantime, copying it will also corrupt the offsite copy. > > Are there any solutions to this problem that Borg supports? From tw at waldmann-edv.de Wed Nov 23 17:49:21 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 23 Nov 2016 23:49:21 +0100 Subject: [Borgbackup] do not run borg check --repair on old attic archives Message-ID: <560c54e6-0b1c-d29d-8cd3-ae2e306e9f7d@waldmann-edv.de> PSA: do not run borg check --repair on repos that have archives made with attic <= 0.13. See: https://github.com/borgbackup/borg/issues/1837 for more details. The issue will be fixed in borg 1.0.9. -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From tw at waldmann-edv.de Sun Nov 27 00:27:17 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sun, 27 Nov 2016 06:27:17 +0100 Subject: [Borgbackup] borgbackup 1.0.9rc1 Message-ID: Released borgbackup 1.0.9rc1 right now. https://github.com/borgbackup/borg/releases/tag/1.0.9rc1 https://github.com/borgbackup/borg/blob/1.0.9rc1/docs/changes.rst It would be helpful if you test this in practice, so anything not discovered by unit tests can be fixed. The final 1.0.9 release is scheduled for December, so be quick. -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From dpierceprice at gmail.com Sun Nov 27 07:27:57 2016 From: dpierceprice at gmail.com (Douglas Pierce-Price) Date: Sun, 27 Nov 2016 13:27:57 +0100 Subject: [Borgbackup] BorgBackup Mac compatibility, e.g. for Photos library Message-ID: Hello I'm thinking of using BorgBackup on a Mac. Is there anything I need to be careful of in terms of compatibility with the Mac file system, or will it just work OK? (e.g. metadata, resource forks, whatever) Probably the most important thing I want to back up is the Photos library. Will BorgBackup handle this correctly? Does the fact that this is a Package make any difference? Or, would I run into problems if the Photos application is running and the Photos library is therefore open when I run BorgBackup? Would I need to close the library before running? Many thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From public at enkore.de Sun Nov 27 08:29:30 2016 From: public at enkore.de (Marian Beermann) Date: Sun, 27 Nov 2016 14:29:30 +0100 Subject: [Borgbackup] BorgBackup Mac compatibility, e.g. for Photos library In-Reply-To: References: Message-ID: <4d1ceffd-6f32-a66c-86b4-f9d031155cc3@enkore.de> Hi Douglas, (reply inline) On 27.11.2016 13:27, Douglas Pierce-Price wrote: > Hello > > I'm thinking of using BorgBackup on a Mac. Is there anything I need to > be careful of in terms of compatibility with the Mac file system, or > will it just work OK? (e.g. metadata, resource forks, whatever) ACLs, resource forks and flags are supported on OSX (cf. http://borgbackup.readthedocs.io/en/latest/installation.html#features-platforms ). AFAIK these are all the filesystem specialities OSX has. > Probably the most important thing I want to back up is the Photos > library. Will BorgBackup handle this correctly? Probably. I don't know much about OSX; if all data the application needs are in the backed up repositories, then there shouldn't be a problem. > Does the fact that this is a Package make any difference? > > Or, would I run into problems if the Photos application is running and > the Photos library is therefore open when I run BorgBackup?
Would I need > to close the library before running? It probably has some kind of embedded database (perhaps sqlite or something similar?), so it likely needs to be closed / not actively used during the backup. As with any backup, it is a good idea to try a *real* restore and see if everything works out as expected. > Many thanks! > Cheers, Marian From public at enkore.de Sun Nov 27 08:30:20 2016 From: public at enkore.de (Marian Beermann) Date: Sun, 27 Nov 2016 14:30:20 +0100 Subject: [Borgbackup] BorgBackup Mac compatibility, e.g. for Photos library In-Reply-To: <4d1ceffd-6f32-a66c-86b4-f9d031155cc3@enkore.de> References: <4d1ceffd-6f32-a66c-86b4-f9d031155cc3@enkore.de> Message-ID: <38c233da-3ea0-d9bf-25ff-33a39d6d70c6@enkore.de> On 27.11.2016 14:29, Marian Beermann wrote: > Probably. I don't know much about OSX; if all data the application needs > are in the backed up repositories, then there shouldn't be a problem. ... are in the backed up directories (not repositories) From tkpapp at gmail.com Sun Nov 27 09:41:19 2016 From: tkpapp at gmail.com (Tamas Papp) Date: Sun, 27 Nov 2016 15:41:19 +0100 Subject: [Borgbackup] script for desktop notification Message-ID: <87mvglasts.fsf@gmail.com> Hi, I am a new borgbackup user. First, thanks for this fantastic tool! I have a question: on a laptop, I would like to initiate backups manually, so that I can decide whether I am on a fast connection that has no data charges. However, I don't want to forget my daily backup either. So I would like to write a small script that checks for today's backup and keeps nagging me (I put it in crontab). Here is my first attempt:

--8<---------------cut here---------------start------------->8---
#!/bin/bash
set -e # exit when no internet

REPOSITORY=[...my repo...]

LIST=$(borgbackup list "$REPOSITORY")
DATE=$(date +%Y-%m-%d)

case $LIST in
    *"$DATE"* )
        # found today's backup, OK
        ;;
    * )
        notify-send "no backup today - run borg-backup"
        ;;
esac
--8<---------------cut here---------------end--------------->8---

I am wondering if there is anything more idiomatic I could use though. Best, Tamas
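One possible tightening, keeping the same logic and the repo placeholder; like the original, this assumes the archive names contain the ISO date:

    #!/bin/bash
    set -e # exit when no internet
    REPOSITORY=[...my repo...]
    # set -e still aborts here if the repo is unreachable
    LIST=$(borgbackup list "$REPOSITORY")
    echo "$LIST" | grep -q "$(date +%Y-%m-%d)" \
        || notify-send "no backup today - run borg-backup"

grep -q replaces the case statement and exits quietly either way.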
From alainm at bonseletrons.com.br Wed Nov 30 10:54:50 2016 From: alainm at bonseletrons.com.br (Alain Mouette) Date: Wed, 30 Nov 2016 13:54:50 -0200 Subject: [Borgbackup] File history Message-ID: <583EF64A.4080504@bonseletrons.com.br> Hi, I am searching for a backup system and Borg is currently the most attractive, but... I read many docs, but I didn't find this: Is there a way to view all previous versions of a specific file, or any equivalent method of finding past versions of that file? Thanks, -- Alain Mouette === Projetos especiais: === From adrian.klaver at aklaver.com Wed Nov 30 11:05:29 2016 From: adrian.klaver at aklaver.com (Adrian Klaver) Date: Wed, 30 Nov 2016 08:05:29 -0800 Subject: [Borgbackup] File history In-Reply-To: <583EF64A.4080504@bonseletrons.com.br> References: <583EF64A.4080504@bonseletrons.com.br> Message-ID: <1126a865-4d37-5e5b-d9c6-103a0fcc1376@aklaver.com> On 11/30/2016 07:54 AM, Alain Mouette wrote: > Hi, I am searching for a backup system and Borg is currently the most > attractive, but... > I read many docs, but I didn't find this: > > Is there a way to view all previous versions of a specific file, or any > equivalent method of finding past versions of that file? The only thing I can think of is coming in version 1.1.0: http://borgbackup.readthedocs.io/en/1.1.0b2/usage.html#borg-diff Not sure if that meets your needs or not. > > Thanks, > -- Adrian Klaver adrian.klaver at aklaver.com From public at enkore.de Wed Nov 30 11:24:59 2016 From: public at enkore.de (Marian Beermann) Date: Wed, 30 Nov 2016 17:24:59 +0100 Subject: [Borgbackup] File history In-Reply-To: <1126a865-4d37-5e5b-d9c6-103a0fcc1376@aklaver.com> References: <583EF64A.4080504@bonseletrons.com.br> <1126a865-4d37-5e5b-d9c6-103a0fcc1376@aklaver.com> Message-ID: <455b90ec-3107-e62c-ffa8-8ae7cf1720da@enkore.de> mount in 1.1.x (currently beta) has a versions view, where you have a directory for every file with every version of that file Borg knows. Cheers, Marian On 30.11.2016 17:05, Adrian Klaver wrote: > On 11/30/2016 07:54 AM, Alain Mouette wrote: >> Hi, I am searching for a backup system and Borg is currently the most >> attractive, but... >> I read many docs, but I didn't find this: >> >> Is there a way to view all previous versions of a specific file, or any >> equivalent method of finding past versions of that file? > > The only thing I can think of is coming in version 1.1.0: > > http://borgbackup.readthedocs.io/en/1.1.0b2/usage.html#borg-diff > > Not sure if that meets your needs or not. > >> >> Thanks, >> > > From alainm at bonseletrons.com.br Wed Nov 30 11:30:45 2016 From: alainm at bonseletrons.com.br (Alain Mouette) Date: Wed, 30 Nov 2016 14:30:45 -0200 Subject: [Borgbackup] File history In-Reply-To: <455b90ec-3107-e62c-ffa8-8ae7cf1720da@enkore.de> References: <583EF64A.4080504@bonseletrons.com.br> <1126a865-4d37-5e5b-d9c6-103a0fcc1376@aklaver.com> <455b90ec-3107-e62c-ffa8-8ae7cf1720da@enkore.de> Message-ID: <583EFEB5.7070300@bonseletrons.com.br> Yes, Marian Beermann, that seems to fit perfectly with what I want/need. Is there already any doc for it? If so, could you point me to it, please? thanks, Alain Mouette === Projetos especiais: === On 30-11-2016 14:24, Marian Beermann wrote: > mount in 1.1.x (currently beta) has a versions view, where you have a > directory for every file with every version of that file Borg knows. > > Cheers, Marian > > On 30.11.2016 17:05, Adrian Klaver wrote: >> On 11/30/2016 07:54 AM, Alain Mouette wrote: >>> Hi, I am searching for a backup system and Borg is currently the most >>> attractive, but... >>> I read many docs, but I didn't find this: >>> >>> Is there a way to view all previous versions of a specific file, or any >>> equivalent method of finding past versions of that file? >> The only thing I can think of is coming in version 1.1.0: >> >> http://borgbackup.readthedocs.io/en/1.1.0b2/usage.html#borg-diff >> >> Not sure if that meets your needs or not. >> >>> Thanks, >>> >> > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From tkpapp at gmail.com Wed Nov 30 11:37:01 2016 From: tkpapp at gmail.com (Tamas Papp) Date: Wed, 30 Nov 2016 17:37:01 +0100 Subject: [Borgbackup] exclusion patterns question Message-ID: <87r35shqky.fsf@gmail.com> Hi, I need a bit of help with patterns. 1. How do I escape spaces? pp:/home/tamas/VirtualBox\ VMs did not work. 2. Is there a way I can simplify the specification of paths in my home directory? E.g. instead of pp:/home/tamas/.cache something like pp:~/.cache or pp:$HOME/.cache This would also make the files more portable across users. 3. Finally, could someone share an exclusion patterns file for a Linux desktop that one can use to get started?
I found https://github.com/rubo77/rsync-homedir-excludes which I could modify, but if there is already one with the syntax of borgbackup, that would be nice. Best, Tamas From public at enkore.de Wed Nov 30 11:37:25 2016 From: public at enkore.de (Marian Beermann) Date: Wed, 30 Nov 2016 17:37:25 +0100 Subject: [Borgbackup] File history In-Reply-To: <583EFEB5.7070300@bonseletrons.com.br> References: <583EF64A.4080504@bonseletrons.com.br> <1126a865-4d37-5e5b-d9c6-103a0fcc1376@aklaver.com> <455b90ec-3107-e62c-ffa8-8ae7cf1720da@enkore.de> <583EFEB5.7070300@bonseletrons.com.br> Message-ID: Only a short synopsis in the docs: > Additional mount options supported by borg: > > versions: when used with a repository mount, this gives a merged, > versioned view of the files in the archives. > EXPERIMENTAL, layout may change in future. http://borgbackup.readthedocs.io/en/latest/usage.html#id31 On 30.11.2016 17:30, Alain Mouette wrote: > Yes, Marian Beermann, that seems to fit perfectly with what I want/need. > Is there already any doc for it? If so, could you point me to it, please? > > thanks, > > Alain Mouette > === Projetos especiais: === > > On 30-11-2016 14:24, Marian Beermann wrote: >> mount in 1.1.x (currently beta) has a versions view, where you have a >> directory for every file with every version of that file Borg knows. >> >> Cheers, Marian >> >> On 30.11.2016 17:05, Adrian Klaver wrote: >>> On 11/30/2016 07:54 AM, Alain Mouette wrote: >>>> Hi, I am searching for a backup system and Borg is currently the most >>>> attractive, but... >>>> I read many docs, but I didn't find this: >>>> >>>> Is there a way to view all previous versions of a specific file, or any >>>> equivalent method of finding past versions of that file? >>> The only thing I can think of is coming in version 1.1.0: >>> >>> http://borgbackup.readthedocs.io/en/1.1.0b2/usage.html#borg-diff >>> >>> Not sure if that meets your needs or not. >>> >>>> Thanks, >>>> >>> >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup > > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From tw at waldmann-edv.de Fri Dec 2 12:13:48 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Fri, 2 Dec 2016 18:13:48 +0100 Subject: [Borgbackup] exclusion patterns question In-Reply-To: <87r35shqky.fsf@gmail.com> References: <87r35shqky.fsf@gmail.com> Message-ID: Moin Tamas, > 1. How do I escape spaces? > > pp:/home/tamas/VirtualBox\ VMs If you have this inside an exclude file: do not escape them at all. Escaping or quoting is only needed on the shell / in shell scripts. > 2. Is there a way I can simplify the specification of paths in my home > directory? E.g. instead of > > pp:/home/tamas/.cache > > something like > > pp:~/.cache > > or > > pp:$HOME/.cache The patterns in an exclude file are just taken "as is": no env vars expanded, no shell expansion. Whether a pattern is expanded when used from the shell command line depends on the expansion rules of your shell; try it. > 3. Finally, could someone share an exclusion patterns file for a > Linux desktop that one can use to get started?
I have these (among others, for a full system backup):

--exclude-caches \
--exclude "$SRC_MOUNT/home/*/.thunderbird/*/ImapMail/*" \
--exclude "$SRC_MOUNT/home/*/.cache/*" \
--exclude "$SRC_MOUNT/home/*/.local/share/zeitgeist/*" \
--exclude "$SRC_MOUNT/home/*/.local/share/Trash/*" \
--exclude "$SRC_MOUNT/var/cache/*" \
--exclude "$SRC_MOUNT/var/lib/apt/lists/*" \
--exclude "$SRC_MOUNT/var/tmp/*"

Below SRC_MOUNT, I only have on-disk filesystems mounted (no /sys /proc /tmp etc.). Cheers, Thomas -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393
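The same patterns also work from an exclude file, where, as noted above, no shell escaping is needed even for spaces. A hypothetical ~/.config/borg/excludes:

    /home/*/VirtualBox VMs
    /home/*/.cache/*
    /var/cache/*
    /var/tmp/*

used as: borg create --exclude-from ~/.config/borg/excludes ...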
From dpierceprice at gmail.com Thu Dec 8 11:29:25 2016 From: dpierceprice at gmail.com (Douglas Pierce-Price) Date: Thu, 8 Dec 2016 17:29:25 +0100 Subject: [Borgbackup] BorgBackup Mac compatibility, e.g. for Photos library In-Reply-To: <38c233da-3ea0-d9bf-25ff-33a39d6d70c6@enkore.de> References: <4d1ceffd-6f32-a66c-86b4-f9d031155cc3@enkore.de> <38c233da-3ea0-d9bf-25ff-33a39d6d70c6@enkore.de> Message-ID: Thank you! I'll give it a try... On Sun, Nov 27, 2016 at 2:30 PM, Marian Beermann wrote: > On 27.11.2016 14:29, Marian Beermann wrote: > > Probably. I don't know much about OSX; if all data the application needs > > are in the backed up repositories, then there shouldn't be a problem. > > ... are in the backed up directories > (not repositories) > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > -------------- next part -------------- An HTML attachment was scrubbed... URL: From billk at iinet.net.au Fri Dec 9 23:10:50 2016 From: billk at iinet.net.au (Bill Kenworthy) Date: Sat, 10 Dec 2016 12:10:50 +0800 Subject: [Borgbackup] turn off warning? Message-ID: I have borgbackup in use in a number of scripts. The stable version (1.09 on gentoo) and earlier spit out a warning: Please upgrade to borg version 1.1+ on the server for safer AES-CTR nonce handling. Is there a way to turn it off? I did try a git version which works fine, except that it can't be used to back up a system with a .gvfs (all my GUI systems). BillK From billk at iinet.net.au Sat Dec 10 00:25:08 2016 From: billk at iinet.net.au (Bill Kenworthy) Date: Sat, 10 Dec 2016 13:25:08 +0800 Subject: [Borgbackup] turn off warning? In-Reply-To: References: Message-ID: <8cad9735-d3f1-f5b6-9f41-203c7efefe88@iinet.net.au> On 10/12/16 12:10, Bill Kenworthy wrote: > I have borgbackup in use in a number of scripts. The stable version > (1.09 on gentoo) and earlier spit out a warning: > > Please upgrade to borg version 1.1+ on the server for safer AES-CTR > nonce handling. > > Is there a way to turn it off? > > I did try a git version which works fine, except that it can't be used to > back up a system with a .gvfs (all my GUI systems). > > BillK > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > Please ignore - I realised my git instances were a couple of weeks old, and the latest just gives a brief unreadable-file message without the messy crash/exit. BillK From public at enkore.de Sat Dec 10 03:51:42 2016 From: public at enkore.de (Marian Beermann) Date: Sat, 10 Dec 2016 09:51:42 +0100 Subject: [Borgbackup] turn off warning? In-Reply-To: References: Message-ID: That warning isn't included in any 1.0.x client/server. The client that gives you this message must be from git / from 1.1.x (beta). It'd be great if you have details on that .gvfs crash / error; it sounds like a regression of some sort. Cheers, Marian On 10.12.2016 05:10, Bill Kenworthy wrote: > I have borgbackup in use in a number of scripts. The stable version > (1.09 on gentoo) and earlier spit out a warning: > > Please upgrade to borg version 1.1+ on the server for safer AES-CTR > nonce handling. > > Is there a way to turn it off? > > I did try a git version which works fine, except that it can't be used to > back up a system with a .gvfs (all my GUI systems). > > BillK From billk at iinet.net.au Sat Dec 10 22:00:39 2016 From: billk at iinet.net.au (Bill Kenworthy) Date: Sun, 11 Dec 2016 11:00:39 +0800 Subject: [Borgbackup] turn off warning? In-Reply-To: References: Message-ID: <7a177377-090d-dedb-73b6-5719614d882c@iinet.net.au> On 10/12/16 16:51, Marian Beermann wrote: > That warning isn't included in any 1.0.x client/server. The client that > gives you this message must be from git / from 1.1.x (beta). > > It'd be great if you have details on that .gvfs crash / error; it sounds > like a regression of some sort. > > Cheers, Marian > > On 10.12.2016 05:10, Bill Kenworthy wrote: >> I have borgbackup in use in a number of scripts. The stable version >> (1.09 on gentoo) and earlier spit out a warning: >> >> Please upgrade to borg version 1.1+ on the server for safer AES-CTR >> nonce handling. >> >> Is there a way to turn it off? >> >> I did try a git version which works fine, except that it can't be used to >> back up a system with a .gvfs (all my GUI systems). >> >> BillK > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > It's gone in current git - I just get: /home/config/home/wdk: [Errno 13] Permission denied: '/home/config/home/wdk/.gvfs' On gentoo, .gvfs is readable by the user only - root does not get access. This is what caused the crash - it would hang for a few seconds then exit, sometimes requiring a "borg check --repair" on the backup. The current version just prints the above and continues merrily on, as it should. BillK From jgoerzen at complete.org Thu Dec 15 13:38:21 2016 From: jgoerzen at complete.org (John Goerzen) Date: Thu, 15 Dec 2016 12:38:21 -0600 Subject: [Borgbackup] "borg check" without reading every byte Message-ID: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> Hi folks, I have a question about borg check. I'm anticipating storing a backup on a remote host to which I do not have the ability to install borg (think sshfs or webdav or so). From the manpage description, it looks as if borg check will read every bit of data in the repo at least twice. Is there a way for it to check the consistency of the metadata trees without checking the CRC of every segment or reading every bit of file data? (I am pondering a move from obnam, which does have this feature) Thanks, John From public at enkore.de Thu Dec 15 14:06:20 2016 From: public at enkore.de (Marian Beermann) Date: Thu, 15 Dec 2016 20:06:20 +0100 Subject: [Borgbackup] "borg check" without reading every byte In-Reply-To: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> References: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> Message-ID: <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> Hi John from the check help page: --repository-only only perform repository checks --archives-only only perform archives checks Cheers, Marian On 15.12.2016 19:38, John Goerzen wrote: > Hi folks, > > I have a question about borg check.
I'm anticipating storing a backup > on a remote host to which I do not have the ability to install borg > (think sshfs or webdav or so). From the manpage description, it looks > as if borg check will read every bit of data in the repo at least > twice. Is there a way for it to check the consistency of the metadata > trees without checking the CRC of every segment or reading every bit of > file data? > > (I am pondering a move from obnam, which does have this feature) > > Thanks, > > John > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From jgoerzen at complete.org Thu Dec 15 14:35:58 2016 From: jgoerzen at complete.org (John Goerzen) Date: Thu, 15 Dec 2016 13:35:58 -0600 Subject: [Borgbackup] "borg check" without reading every byte In-Reply-To: <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> References: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> Message-ID: Yes, I read that, but the documentation also says this: For the repository check, "For all objects stored in the segments, all metadata... and all data is read." Although now that I reread the description of the archive check, perhaps there is a case where it does not have to reread all data. I'll see if I can validate this. Thanks, John On 12/15/2016 01:06 PM, Marian Beermann wrote: > Hi John > > from the check help page: > > --repository-only only perform repository checks > --archives-only only perform archives checks > > Cheers, Marian > > On 15.12.2016 19:38, John Goerzen wrote: >> Hi folks, >> >> I have a question about borg check. I'm anticipating storing a backup >> on a remote host to which I do not have the ability to install borg >> (think sshfs or webdav or so). From the manpage description, it looks >> as if borg check will read every bit of data in the repo at least >> twice. Is there a way for it to check the consistency of the metadata >> trees without checking the CRC of every segment or reading every bit of >> file data? >> >> (I am pondering a move from obnam, which does have this feature) >> >> Thanks, >> >> John >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From public at enkore.de Thu Dec 15 14:54:13 2016 From: public at enkore.de (Marian Beermann) Date: Thu, 15 Dec 2016 20:54:13 +0100 Subject: [Borgbackup] "borg check" without reading every byte In-Reply-To: References: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> Message-ID: <7a5bad05-da0d-7a17-1147-8a06784cabd7@enkore.de> On 15.12.2016 20:35, John Goerzen wrote: > Yes, I read that, but the documentation also says this: > > For the repository check, "For all objects stored in the segments, all > metadata... and all data is read." > > Although now that I reread the description of the archive check, perhaps > there is a case where it does not have to reread all data. I'll see if > I can validate this. Archives check only works with metadata. 
> Second, the consistency and correctness of the > archive metadata is verified: Cheers, Marian > Thanks, > > John > > > On 12/15/2016 01:06 PM, Marian Beermann wrote: >> Hi John >> >> from the check help page: >> >> --repository-only only perform repository checks >> --archives-only only perform archives checks >> >> Cheers, Marian >> >> On 15.12.2016 19:38, John Goerzen wrote: >>> Hi folks, >>> >>> I have a question about borg check. I'm anticipating storing a backup >>> on a remote host to which I do not have the ability to install borg >>> (think sshfs or webdav or so). From the manpage description, it looks >>> as if borg check will read every bit of data in the repo at least >>> twice. Is there a way for it to check the consistency of the metadata >>> trees without checking the CRC of every segment or reading every bit of >>> file data? >>> >>> (I am pondering a move from obnam, which does have this feature) >>> >>> Thanks, >>> >>> John >>> _______________________________________________ >>> Borgbackup mailing list >>> Borgbackup at python.org >>> https://mail.python.org/mailman/listinfo/borgbackup >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup > From jgoerzen at complete.org Sat Dec 17 15:08:22 2016 From: jgoerzen at complete.org (John Goerzen) Date: Sat, 17 Dec 2016 14:08:22 -0600 Subject: [Borgbackup] "borg check" without reading every byte In-Reply-To: <7a5bad05-da0d-7a17-1147-8a06784cabd7@enkore.de> References: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> <7a5bad05-da0d-7a17-1147-8a06784cabd7@enkore.de> Message-ID: <32941c3e-4972-2bd6-7756-44b0c1342699@complete.org> So I did an experiment: rm data/21/219999 borg check -v --archives-only This did not flag any errors. Now it's possible that the segment I deleted was not holding metadata, and therefore never visited. However, I find it odd that it didn't at least notice it was missing. It would be nice to check that without reading every single bit of data to recheck CRCs, I guess. John On 12/15/2016 01:54 PM, Marian Beermann wrote: > On 15.12.2016 20:35, John Goerzen wrote: >> Yes, I read that, but the documentation also says this: >> >> For the repository check, "For all objects stored in the segments, all >> metadata... and all data is read." >> >> Although now that I reread the description of the archive check, perhaps >> there is a case where it does not have to reread all data. I'll see if >> I can validate this. > Archives check only works with metadata. > >> Second, the consistency and correctness of the >> archive metadata is verified: > Cheers, Marian > >> Thanks, >> >> John >> >> >> On 12/15/2016 01:06 PM, Marian Beermann wrote: >>> Hi John >>> >>> from the check help page: >>> >>> --repository-only only perform repository checks >>> --archives-only only perform archives checks >>> >>> Cheers, Marian >>> >>> On 15.12.2016 19:38, John Goerzen wrote: >>>> Hi folks, >>>> >>>> I have a question about borg check. I'm anticipating storing a backup >>>> on a remote host to which I do not have the ability to install borg >>>> (think sshfs or webdav or so). From the manpage description, it looks >>>> as if borg check will read every bit of data in the repo at least >>>> twice. Is there a way for it to check the consistency of the metadata >>>> trees without checking the CRC of every segment or reading every bit of >>>> file data? 
>>>> >>>> (I am pondering a move from obnam, which does have this feature) >>>> >>>> Thanks, >>>> >>>> John >>>> _______________________________________________ >>>> Borgbackup mailing list >>>> Borgbackup at python.org >>>> https://mail.python.org/mailman/listinfo/borgbackup >>> _______________________________________________ >>> Borgbackup mailing list >>> Borgbackup at python.org >>> https://mail.python.org/mailman/listinfo/borgbackup From public at enkore.de Sun Dec 18 10:18:05 2016 From: public at enkore.de (Marian Beermann) Date: Sun, 18 Dec 2016 16:18:05 +0100 Subject: [Borgbackup] "borg check" without reading every byte In-Reply-To: <32941c3e-4972-2bd6-7756-44b0c1342699@complete.org> References: <031139db-f1fd-3a27-c95e-38624da27072@complete.org> <8951a531-1296-64ea-96a8-e151f403414b@enkore.de> <7a5bad05-da0d-7a17-1147-8a06784cabd7@enkore.de> <32941c3e-4972-2bd6-7756-44b0c1342699@complete.org> Message-ID: On 17.12.2016 21:08, John Goerzen wrote: > So I did an experiment: > > rm data/21/219999 > borg check -v --archives-only > > This did not flag any errors. Now it's possible that the segment I > deleted was not holding metadata, and therefore never visited. However, > I find it odd that it didn't at least notice it was missing. It would > be nice to check that without reading every single bit of data to > recheck CRCs, I guess. That's kinda expected, since the repository check and the archives check work on different layers... one could implement a 'shallow' repository check like you say, but it begs the question whether that's really a useful thing to have; corruption of the data itself is more likely and the transaction logic ensures (on typical hardware, anyway) that the trailing end of the repository is intact (this is checked on every access of the repository). Cheers, Marian > John > > On 12/15/2016 01:54 PM, Marian Beermann wrote: >> On 15.12.2016 20:35, John Goerzen wrote: >>> Yes, I read that, but the documentation also says this: >>> >>> For the repository check, "For all objects stored in the segments, all >>> metadata... and all data is read." >>> >>> Although now that I reread the description of the archive check, perhaps >>> there is a case where it does not have to reread all data. I'll see if >>> I can validate this. >> Archives check only works with metadata. >> >>> Second, the consistency and correctness of the >>> archive metadata is verified: >> Cheers, Marian >> >>> Thanks, >>> >>> John >>> >>> >>> On 12/15/2016 01:06 PM, Marian Beermann wrote: >>>> Hi John >>>> >>>> from the check help page: >>>> >>>> --repository-only only perform repository checks >>>> --archives-only only perform archives checks >>>> >>>> Cheers, Marian >>>> >>>> On 15.12.2016 19:38, John Goerzen wrote: >>>>> Hi folks, >>>>> >>>>> I have a question about borg check. I'm anticipating storing a backup >>>>> on a remote host to which I do not have the ability to install borg >>>>> (think sshfs or webdav or so). From the manpage description, it looks >>>>> as if borg check will read every bit of data in the repo at least >>>>> twice. Is there a way for it to check the consistency of the metadata >>>>> trees without checking the CRC of every segment or reading every bit of >>>>> file data? 
>>>>> >>>>> (I am pondering a move from obnam, which does have this feature) >>>>> >>>>> Thanks, >>>>> >>>>> John >>>>> _______________________________________________ >>>>> Borgbackup mailing list >>>>> Borgbackup at python.org >>>>> https://mail.python.org/mailman/listinfo/borgbackup >>>> _______________________________________________ >>>> Borgbackup mailing list >>>> Borgbackup at python.org >>>> https://mail.python.org/mailman/listinfo/borgbackup > From tw at waldmann-edv.de Mon Dec 19 23:10:02 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Tue, 20 Dec 2016 05:10:02 +0100 Subject: [Borgbackup] borgbackup 1.0.9 released Message-ID: https://github.com/borgbackup/borg/releases/tag/1.0.9 Security and Bug fixes, please upgrade. More details: see URL above. Cheers, Thomas -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From sitaramc at gmail.com Tue Dec 20 07:29:22 2016 From: sitaramc at gmail.com (Sitaram Chamarty) Date: Tue, 20 Dec 2016 17:59:22 +0530 Subject: [Borgbackup] borgbackup 1.0.9 released In-Reply-To: References: Message-ID: <20161220122922.GA16294@sita-lt.atc.tcs.com> On Tue, Dec 20, 2016 at 05:10:02AM +0100, Thomas Waldmann wrote: > https://github.com/borgbackup/borg/releases/tag/1.0.9 > > Security and Bug fixes, please upgrade. > > More details: see URL above. Hi Quick question about this instruction: borg upgrade --tam do I have to upgrade the version of borg on the server also before I run this? Or is it just the *client* that matters? (I know the instructions on that page only say "upgrade all clients" but I just want to be sure.) regards sitaram From public at enkore.de Tue Dec 20 07:48:01 2016 From: public at enkore.de (Marian Beermann) Date: Tue, 20 Dec 2016 13:48:01 +0100 Subject: [Borgbackup] borgbackup 1.0.9 released In-Reply-To: <20161220122922.GA16294@sita-lt.atc.tcs.com> References: <20161220122922.GA16294@sita-lt.atc.tcs.com> Message-ID: <78cdbd8d-7f01-18bd-74b6-89af02f1b2e6@enkore.de> Hi Sitaram, this only affects the clients, not the server. All 1.0.x releases are compatible with any 1.0.x server. Cheers, Marian On 20.12.2016 13:29, Sitaram Chamarty wrote: > On Tue, Dec 20, 2016 at 05:10:02AM +0100, Thomas Waldmann wrote: >> https://github.com/borgbackup/borg/releases/tag/1.0.9 >> >> Security and Bug fixes, please upgrade. >> >> More details: see URL above. > > Hi > > Quick question about this instruction: > > borg upgrade --tam > > do I have to upgrade the version of borg on the server also > before I run this? Or is it just the *client* that matters? > > (I know the instructions on that page only say "upgrade all > clients" but I just want to be sure.) > > regards > sitaram > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > From jgoerzen at complete.org Tue Dec 20 09:28:37 2016 From: jgoerzen at complete.org (John Goerzen) Date: Tue, 20 Dec 2016 08:28:37 -0600 Subject: [Borgbackup] Why does borg delete/prune write a bunch of new data? Message-ID: Hi folks, So I'm doing some testing of Borg. My ultimate aim is to rsync the backups to a dumb (WebDAV or S3-type) host. I made a run of borg over a real subset of my data, about 80GB worth. I then cleaned up and deleted a good chunk of data throughout that area, and made another archive with borg create. So far so good. Now I ran borg delete to remove the archive with all the extra data. Sure enough, about 2GB freed up on the disk after. 
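One rough way to watch what a delete does to the segment files (repo path and archive name are placeholders):

    du -s /path/to/repo                              # total size before
    ls /path/to/repo/data/*/* | sort > before.txt
    borg delete /path/to/repo::some-big-archive
    du -s /path/to/repo                              # total size after
    ls /path/to/repo/data/*/* | sort > after.txt
    comm -13 before.txt after.txt                    # segment files the delete wrote
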
However, watching the process with strace and examining the filesystem, I observed it wrote a considerable amount of new segments to the data directory. A little analysis with ls and du shows it wrote right around 2GB of new segments. (It also, of course, unlinked a considerable number of segments.) Having to rsync 2GB of new data every time I delete data is going to be rather sub-optimal on my poor DSL. Any ideas why it's doing this? FWIW the index file is only a few tens of MBs. I'm using encryption and lzma compression. I did double the max_segment_size from 5MB to 10MB (a lot of experience with obnam suggested this would improve the performance over the rsync situation) Thanks, John From sitaramc at gmail.com Tue Dec 20 13:30:35 2016 From: sitaramc at gmail.com (Sitaram Chamarty) Date: Wed, 21 Dec 2016 00:00:35 +0530 Subject: [Borgbackup] borgbackup 1.0.9 released In-Reply-To: <78cdbd8d-7f01-18bd-74b6-89af02f1b2e6@enkore.de> References: <20161220122922.GA16294@sita-lt.atc.tcs.com> <78cdbd8d-7f01-18bd-74b6-89af02f1b2e6@enkore.de> Message-ID: <20161220183035.GA19822@sita-lt.atc.tcs.com> On Tue, Dec 20, 2016 at 01:48:01PM +0100, Marian Beermann wrote: > Hi Sitaram, > > this only affects the clients, not the server. > > All 1.0.x releases are compatible with any 1.0.x server. thanks! One of my servers is only accessible via borg until next month so I needed to be clear. regards sitaram > > Cheers, Marian > > On 20.12.2016 13:29, Sitaram Chamarty wrote: > > On Tue, Dec 20, 2016 at 05:10:02AM +0100, Thomas Waldmann wrote: > >> https://github.com/borgbackup/borg/releases/tag/1.0.9 > >> > >> Security and Bug fixes, please upgrade. > >> > >> More details: see URL above. > > > > Hi > > > > Quick question about this instruction: > > > > borg upgrade --tam > > > > do I have to upgrade the version of borg on the server also > > before I run this? Or is it just the *client* that matters? > > > > (I know the instructions on that page only say "upgrade all > > clients" but I just want to be sure.) > > > > regards > > sitaram > > _______________________________________________ > > Borgbackup mailing list > > Borgbackup at python.org > > https://mail.python.org/mailman/listinfo/borgbackup > > > From jgoerzen at complete.org Tue Dec 20 18:09:48 2016 From: jgoerzen at complete.org (John Goerzen) Date: Tue, 20 Dec 2016 17:09:48 -0600 Subject: [Borgbackup] Why does borg delete/prune write a bunch of new data? In-Reply-To: References: Message-ID: I've done some digging into this, and it seems the reason is compact_segments() in repository.py. It both deletes the segments that are completely unused, and also (if I'm understanding correctly), takes segments containing some objects that are unused and some objects that are still used and writes new segments containing only the used objects. The end result is some space savings, at the cost of a lot of I/O. I wonder how hard it would be to support deleting unused segments without bothering to rewrite segments that are partially used? thanks, John On 12/20/2016 08:28 AM, John Goerzen wrote: > Hi folks, > > So I'm doing some testing of Borg. My ultimate aim is to rsync the > backups to a dumb (WebDAV or S3-type) host. > > I made a run of borg over a real subset of my data, about 80GB worth. > I then cleaned up and deleted a good chunk of data throughout that > area, and made another archive with borg create. > > So far so good. Now I ran borg delete to remove the archive with all > the extra data. Sure enough, about 2GB freed up on the disk after. 
> > However, watching the process with strace and examining the > filesystem, I observed it wrote a considerable amount of new segments > to the data directory. A little analysis with ls and du shows it > wrote right around 2GB of new segments. (It also, of course, unlinked > a considerable number of segments.) > > Having to rsync 2GB of new data every time I delete data is going to > be rather sub-optimal on my poor DSL. Any ideas why it's doing this? > FWIW the index file is only a few tens of MBs. > > I'm using encryption and lzma compression. I did double the > max_segment_size from 5MB to 10MB (a lot of experience with obnam > suggested this would improve the performance over the rsync situation) > > Thanks, > > John > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From mario at emmenlauer.de Wed Dec 21 06:55:05 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Wed, 21 Dec 2016 12:55:05 +0100 Subject: [Borgbackup] faster / better deletion, for a bounty? Message-ID: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> Dear Borg developers, thanks for the awesome tool! I've been using borgbackup now for more than 6 months and created 533 backups successfully! But now the disk is ~98% full and suddenly I have some troubles :-) I've received great feedback already in IRC but some questions still trouble me: (1) My archive is now 3.4 TB (reported with 'du'), but borg list says the deduplicated archive size is 1.82 TB. Why are the two numbers off by 50%? Below the full output of my borg list. (2) In the last months, my backup size went up quite a lot, even though I did not change anything in borg. So I'd like to reverse engineer which archives (or which files) contribute to the sudden increase in size. I tried "borg list" on all archives, but only 7 have ~3 GB of deduplicated space, and all others have less than 1 GB of dedup space! I assumed 533 archives of ~1 GB dedup size = 533 GB total, but my math must be quite wrong? I saw the documentation of "borg list" but it does not help me understand :-( How would I find the archives that free most space when deleted? (3) borg delete was incredibly slow for me. I killed it after two hours, and it had read 500GB of the archive by then (reported with iotop). I understood from IRC discussion that both prune and delete would require reading the full 3.4 TB once per run, to sanitize some index? That would break borg usage for me, since this will very much wear the disk and also takes ~8hrs on my encrypted drive! Am I doing something wrong? Are there tricks or workarounds, for example when deleting only from localhost? I'd like to offer a bounty of ~€20-€25 for a better solution, or a generally much faster delete and/or much faster prune. If possible I'd rather not have borg read the full 3.4TB archive! PS: My preferred deletion pattern would keep an increasing number of archives over time, like monthly backups from the past 10 years, weekly from the past year, and daily from the past month. I can build this list of deletions with bash easily!
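For instance, something along these lines (assuming archives are named like host-YYYY-MM-DD; all names and paths are made up):

    borg list --short /path/to/repo > all.txt
    cutoff=$(date -d '1 year ago' +%Y-%m-%d)
    # past the cutoff, mark every other weekly backup for deletion
    awk -F- -v c="$cutoff" '{ d = $2 "-" $3 "-" $4;
        if (d < c && ++n % 2 == 0) print }' all.txt > remove.txt
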
But borg delete or prune are currently *way* too slow to be used this way :-(

#> borg list archive::somebackup
Number of files: 1796064
                 Original size   Compressed size   Deduplicated size
This archive:         95.27 GB          70.53 GB           178.00 MB
All archives:         78.26 TB          65.13 TB             1.82 TB
                 Unique chunks      Total chunks
Chunk index:           9733154         414693364

Thanks a lot and all the best, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/ From tw at waldmann-edv.de Wed Dec 21 08:31:06 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 21 Dec 2016 14:31:06 +0100 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> Message-ID: <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> Hi Mario, > But now the disk is ~98% full Avoid that it fills up completely, borg needs free space, even for delete. > (1) My archive is now 3.4 TB (reported with 'du'), but borg list says > the deduplicated archive size is 1.82 TB. Why are the two numbers > off by 50%? Below the full output of my borg list. Did you activate append-only mode for the repo? While append-only is set, borg prune/delete will not be able to really remove data. > (2) In the last months, my backup size went up quite a lot, even though > I did not change anything in borg. So I'd like to reverse engineer > which archives (or which files) contribute to the sudden increase in > size. I tried "borg list" on all archives, but only 7 have ~3 GB of > deduplicated space, and all others have less than 1 GB of dedup space! > I assumed 533 archives of ~1 GB dedup size = 533 GB total, No, that is only the sum of the space ONLY used by a single archive. As soon as the same chunks are used by more than 1 archive, it does not show up as "unique chunks" any more. > How would I find the archives that free most space when deleted? For a single archive deletion, that is the unique chunks space ("deduplicated size") of that archive. For multiple archive deletion there is no easy way to see beforehand. > (3) borg delete was incredibly slow for me. I killed it after two hours, > and it had read 500GB of the archive by then (reported with iotop). > I understood from IRC discussion that both prune and delete would > require reading the full 3.4 TB once per run, to sanitize some index? No, they usually do not need to read all your data. The worst case might be that, though. > Are there tricks or workarounds, for example when > deleting only from localhost? If you use borg with encryption (default), you'd need to use the encryption key on the repo machine. It depends on how much you trust that machine whether you want to do that or not. > I'd like to offer a bounty of ~€20-€25 for a better solution, or a > generally much faster delete and/or much faster prune. Some improvements will come with borg 1.1 (which is currently still in beta, so be very careful). > PS: My preferred deletion pattern would keep an increasing number of > archives over time, like monthly backups from the past 10 years, > weekly from the past year, and daily from past month. That's how borg prune works. > I can build > this list of deletions with bash easily! But borg delete or prune > are currently *way* to slow to be used this way :-( borg prune (when it deletes more than 1 archive per run) is faster than borg delete. It uses delete internally, but doing multiple deletes at once is a bit more efficient.
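For a schedule close to the one described above, a single run could look like this (repo path is a placeholder):

    borg prune -v --keep-daily 30 --keep-weekly 52 --keep-monthly 120 /path/to/repo
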
Cheers, Thomas -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From tw at waldmann-edv.de Wed Dec 21 08:57:45 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 21 Dec 2016 14:57:45 +0100 Subject: [Borgbackup] bounty advice In-Reply-To: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> Message-ID: <9340d4ab-98b0-324f-19f6-969b8121c53c@waldmann-edv.de> > I'd like to offer a bounty of ~€20-€25 for a better solution, ... Some words about bounties: They are very welcome and give additional motivation for a task. Especially useful if the task itself isn't that interesting, but somehow needs to be done. Or to push some issue to better visibility (bountysource label) and attract more attention to it. Bounties for clear and small tasks work better than for complex or unclear goals. If the task is very complex or very unclear or even impossible, it will take a long time or will never get done. Bounties for work that totally lacks the fundamentals in the current codebase are also going to take rather long, possibly forever. -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From jgoerzen at complete.org Wed Dec 21 08:58:51 2016 From: jgoerzen at complete.org (John Goerzen) Date: Wed, 21 Dec 2016 07:58:51 -0600 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> Message-ID: <6f48fa2a-706b-1fd3-ff28-84abebe964cc@complete.org> On 12/21/2016 07:31 AM, Thomas Waldmann wrote: > >> (3) borg delete was incredibly slow for me. I killed it after two hours, >> and it had read 500GB of the archive by then (reported with iotop). >> I understood from IRC discussion that both prune and delete would >> require reading the full 3.4 TB once per run, to sanitize some index? > No, they usually do not need to read all your data. > > The worst case might be that, though. Hi Thomas, Can you elaborate on this? It may potentially be a pretty big problem for me storing backups on remote dumb storage. BTW, thanks to you and everyone for all the work you've done on Borgbackup. It has made incredible progress since I last looked at Attic (which was right around the time of the Borg fork.) I'm seriously evaluating a switch from Obnam. John From tw at waldmann-edv.de Wed Dec 21 09:16:27 2016 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 21 Dec 2016 15:16:27 +0100 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: <6f48fa2a-706b-1fd3-ff28-84abebe964cc@complete.org> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> <6f48fa2a-706b-1fd3-ff28-84abebe964cc@complete.org> Message-ID: >>> (3) borg delete was incredibly slow for me. I killed it after two hours, >>> and it had read 500GB of the archive by then (reported with iotop). >>> I understood from IRC discussion that both prune and delete would >>> require reading the full 3.4 TB once per run, to sanitize some index? >> No, they usually do not need to read all your data. >> >> The worst case might be that, though. > > Can you elaborate on this? Well, I think it is very rare / synthetic, but one could imagine an archive referencing one chunk in every segment file.
If at time of deletion of that archive these references are the last references to these chunks, the chunks get unused. borg 1.0 behaviour is to compact segments to free up space (unconditionally, iirc). So, if there is a unused chunk in every segment (unused, but allocated disk space), it would read all the still used chunks from these segments and create new, compact segments from them. borg 1.1 introduces a threshold, so it won't compact segments if there is only little gain. > BTW, thanks to you and everyone for all the work you've done on > Borgbackup. It has made incredible progress since I last looked at > Attic (which was right around the time of the Borg fork.) I'm seriously > evaluating a switch from Obnam. Great. :) Your blog post from 2015 always shows in top search results when searching for a comparison of obnam, attic (borg), ... Would be great to have a refresh of that at some time. -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From jgoerzen at complete.org Wed Dec 21 09:36:40 2016 From: jgoerzen at complete.org (John Goerzen) Date: Wed, 21 Dec 2016 08:36:40 -0600 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> <6f48fa2a-706b-1fd3-ff28-84abebe964cc@complete.org> Message-ID: On 12/21/2016 08:16 AM, Thomas Waldmann wrote: > > Well, I think it is very rare / synthetic, but one could imagine an > archive referencing one chunk in every segment file. > > If at time of deletion of that archive these references are the last > references to these chunks, the chunks get unused. > > borg 1.0 behaviour is to compact segments to free up space > (unconditionally, iirc). > > So, if there is a unused chunk in every segment (unused, but allocated > disk space), it would read all the still used chunks from these segments > and create new, compact segments from them. > > borg 1.1 introduces a threshold, so it won't compact segments if there > is only little gain. Ah ha. So that would address the issue I raised in my other email as well. Great! I think I found this code in repository.py:494, hard-coded at rewriting the segment if it frees at least 15% of it. Could that threshold be made configurable? For my use case, I would probably set it to 80% or even 90%. Also, a question on decrementing the segment/chunk reference counts. Is that information kept solely in the cache, or is it also in the repo somewhere? If the latter, where does it get written? >> BTW, thanks to you and everyone for all the work you've done on >> Borgbackup. It has made incredible progress since I last looked at >> Attic (which was right around the time of the Borg fork.) I'm seriously >> evaluating a switch from Obnam. > Great. :) Your blog post from 2015 always shows in top search results > when searching for a comparison of obnam, attic (borg), ... > > Would be great to have a refresh of that at some time. Already planning on it, yes! The thing that prompted the re-eval was the discovery that Obnam somehow wound up with a lot of missing chunks in my repo, and that obnam fsck detects but does not correct this issue. There was apparently a bug in obnam forget that may have led to this awhile back (since fixed). Due to some annoying interactions between davfs2 and the webdav server, it occasionally returned EPERM on operations that would have been permitted with a retry. 
This caused it to crash in the middle of a forget, with annoying consequences. The new default chunker params are the big thing that addresses the issue I had with Attic. Storing multiple chunks in segments should also lead to a big performance benefit compared to Obnam. John From mario at emmenlauer.de Wed Dec 21 17:25:30 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Wed, 21 Dec 2016 23:25:30 +0100 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> Message-ID: <4397537a-b0eb-09d1-a154-76cff2ccf792@emmenlauer.de> Hi Thomas, On 21.12.2016 14:31, Thomas Waldmann wrote: >> (1) My archive is now 3.4 TB (reported with 'du'), but borg list says >> the deduplicated archive size is 1.82 TB. Why are the two numbers >> off by 50%? Below the full output of my borg list. > > Did you activate append-only mode for the repo? > > While append-only is set, borg prune/delete will not be able to really > remove data. This is actually before I performed any deletions. The disk usage is reported as 3.4 TB by du and df, whereas borg reports the total dedup size as "only" 1.8TB (so approx. 50% of the actual usage). Is this a typical overhead, or is something fishy in my setup? >> (2) In the last months, my backup size went up quite a lot, even though >> I did not change anything in borg. So I'd like to reverse engineer >> which archives (or which files) contribute to the sudden increase in >> size. I tried "borg list" on all archives, but only 7 have ~3 GB of >> deduplicated space, and all others have less than 1 GB of dedup space! >> I assumed 533 archives of ~1 GB dedup size = 533 GB total, > > No, that is only the sum of the space ONLY used by a single archive. > > As soon as the same chunks are used by more than 1 archive, it does not > show up as "unique chunks" any more. > >> How would I find the archives that free most space when deleted? > > For a single archive deletion, that is the unique chunks space > ("deduplicated size") of that archive. > > For multiple archive deletion there is no easy way to see beforehands. Would it be possible to somehow change this reporting in borg? I think I (possibly accidentally) backed up a few huge files for a few days, that now use up 50% of my archive space. Since the chunks are shared, I have no way of knowing which archives are the "bad guys". My only option seems to prune with a shotgun-approach until eventually I get lucky and free significant disk space. If I'm unlucky I can prune a lot before freeing any significant space... I think for example 'du' when used on hard links reports the shared disk usage on the first directory it encounters, and does not duplicate the size of hard links on subsequent directories. Would this be a sane behaviour for borg too? Or add a new field for "shared chunks size"? Thanks a lot for the help, and all the best, Mario Emmenlauer From jgoerzen at complete.org Wed Dec 21 19:37:22 2016 From: jgoerzen at complete.org (John Goerzen) Date: Wed, 21 Dec 2016 18:37:22 -0600 Subject: [Borgbackup] faster / better deletion, for a bounty?
In-Reply-To: <4397537a-b0eb-09d1-a154-76cff2ccf792@emmenlauer.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> <4397537a-b0eb-09d1-a154-76cff2ccf792@emmenlauer.de> Message-ID: On 12/21/2016 04:25 PM, Mario Emmenlauer wrote: > Hi Thomas, > > On 21.12.2016 14:31, Thomas Waldmann wrote: >>> (1) My archive is now 3.4 TB (reported with 'du'), but borg list says >>> the deduplicated archive size is 1.82 TB. Why are the two numbers >>> off by 50%? Below the full output of my borg list. >> Did you activate append-only mode for the repo? >> >> While append-only is set, borg prune/delete will not be able to really >> remove data. > This is actually before I performed any deletions. The disk usage is > reported as 3.4 TB by du and df, whereas borg reports the total dedup > size as "only" 1.8TB (so approx. 50% of the actual usage). Is this a > typical overhead, or is something fishy in my setup? Hi Mario, On my system (which is zfs-backed for the moment), zfs list and df actually show *less* space used than borg does. I'm still trying to figure that one out ;-) If I understand the dedup size correctly -- and that's an *if* since I have not been using borg for more than a few days -- its meaning is /how much space will be freed if you delete just this one archive/. This makes a lot of sense to me, because it is exactly the same way zfs gives me the size of snapshots. If you have very little change in your datasets but a high number of archives, it would be possible for you to have terabytes of data under management and a sum of the dedup size of almost zero. This would not be an error, given the meaning listed. It is also, therefore, expected that if you remove an archive, the dedup size listed in other archives may increase, since if there was a chunk in common between the deleted archive and the other one, it wouldn't have shown up in the dedup size of either (since deleting /just that one archive/ would not free its space), but once one of the two archives is gone, it would be counted to the other. Does that make sense? How you count up space is a funny business when you have deduplication going on. Same when you have hard links in your filesystem. (du can say you've got 50GB in a directory, but you might find that rm -r on it only frees up 50K if there's a lot of hardlinks to other areas.) I think zfs might have a little clearer terminology on this: "referenced" is how much data is pointed to by a given snapshot, and "used" is how much space would be freed if only that one snapshot were deleted right now. That's like borg's archive size and dedup size. John > > >>> (2) In the last months, my backup size went up quite a lot, even though >>> I did not change anything in borg. So I'd like to reverse engineer >>> which archives (or which files) contribute to the sudden increase in >>> size. I tried "borg list" on all archives, but only 7 have ~3 GB of >>> deduplicated space, and all others have less than 1 GB of dedup space! >>> I assumed 533 archives of ~1 GB dedup size = 533 GB total, >> No, that is only the sum of the space ONLY used by a single archive. >> >> As soon as the same chunks are used by more than 1 archive, it does not >> show up as "unique chunks" any more. >> >>> How would I find the archives that free most space when deleted? >> For a single archive deletion, that is the unique chunks space >> ("deduplicated size") of that archive. >> >> For multiple archive deletion there is no easy way to see beforehands. 
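A tiny made-up example: suppose archives A and B share 10 GB of chunks and each also has 1 GB of unique chunks. Both then show a deduplicated size of 1 GB, because deleting either one alone frees only its own unique 1 GB. Delete A, and B's deduplicated size jumps to 11 GB, even though nothing in B changed.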
> Would it be possible to somehow change this reporting in borg? I > think I (possibly accidentally) backed up a few huge files for a few > days, that now use up 50% of my archive space. Since the chunks are > shared, I have no way of knowing which archives are the "bad guys". > My only option seems to prune with a shotgun-approach until eventually > I get lucky and free significant disk space. If I'm unlucky I can > prune a lot before freeing any significant space... > > I think for example 'du' when used on hard links reports the shared > disk usage on the first directory it encounters, and does not duplicate > the size of hard links on subsequent directories. Would this be a sane > behaviour for borg too? Or add a new field for "shared chunks size"? > > > Thanks a lot for the help, and all the best, > > Mario Emmenlauer > > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup -------------- next part -------------- An HTML attachment was scrubbed... URL: From mario at emmenlauer.de Thu Dec 22 04:14:09 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Thu, 22 Dec 2016 10:14:09 +0100 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> <4397537a-b0eb-09d1-a154-76cff2ccf792@emmenlauer.de> Message-ID: <87916a73-0200-780b-ee92-299f3264fed3@emmenlauer.de> Hi John, On 22.12.2016 01:37, John Goerzen wrote: > On 12/21/2016 04:25 PM, Mario Emmenlauer wrote: >> Hi Thomas, >> >> On 21.12.2016 14:31, Thomas Waldmann wrote: >>>> (1) My archive is now 3.4 TB (reported with 'du'), but borg list says >>>> the deduplicated archive size is 1.82 TB. Why are the two numbers >>>> off by 50%? Below the full output of my borg list. >>> Did you activate append-only mode for the repo? >>> >>> While append-only is set, borg prune/delete will not be able to really >>> remove data. >> This is actually before I performed any deletions. The disk usage is >> reported as 3.4 TB by du and df, whereas borg reports the total dedup >> size as "only" 1.8TB (so approx. 50% of the actual usage). Is this a >> typical overhead, or is something fishy in my setup? > > Hi Mario, > > On my system (which is zfs-backed for the moment), zfs list and df actually show > *less* space used than borg does. I'm still trying to figure that one out ;-) Haha that's interesting! Let me know what you find out :-) > If I understand the dedup size correctly -- and that's an *if* since I have not > been using borg for more than a few days -- its meaning is /how much space will > be freed if you delete just this one archive/. This makes a lot of sense to me, > because it is exactly the same way zfs gives me the size of snapshots. > > If you have very little change in your datasets but a high number of archives, > it would be possible for you to have terabytes of data under management and a > sum of the dedup size of almost zero. This would not be an error, given the > meaning listed. > > It is also, therefore, expected that if you remove an archive, the dedup size > listed in other archives may increase, since if there was a chunk in common > between the deleted archive and the other one, it wouldn't have shown up in the > dedup size of either (since deleting /just that one archive/ would not free its > space), but once one of the two archives is gone, it would be counted to the other. 
> Does that make sense? What you say makes perfect sense for a single archive. But borg also reports numbers for "all archives", which I understood to be the numbers for the full repository. Am I on the wrong track there? Because "all archives" is not the sum of the individual archives, so I assumed it's the repo. For the repo, however, I think the dedup size should be equal to the disk size (except for overheads like metadata, index, etc). Therefore I was surprised to see that for me, it's approx. 50% of disk usage. See here the output of borg list on one of my archives:

Number of files: 1796064
                 Original size   Compressed size   Deduplicated size
This archive:         95.27 GB          70.53 GB           178.00 MB
All archives:         78.26 TB          65.13 TB             1.82 TB
                 Unique chunks      Total chunks
Chunk index:           9733154         414693364

Cheers, Mario > How you count up space is a funny business when you have deduplication going > on. Same when you have hard links in your filesystem. (du can say you've got > 50GB in a directory, but you might find that rm -r on it only frees up 50K if > there's a lot of hardlinks to other areas.) > > I think zfs might have a little clearer terminology on this: "referenced" is how > much data is pointed to by a given snapshot, and "used" is how much space would > be freed if only that one snapshot were deleted right now. That's like borg's > archive size and dedup size. > > John > > >> >> >>>> (2) In the last months, my backup size went up quite a lot, even though >>>> I did not change anything in borg. So I'd like to reverse engineer >>>> which archives (or which files) contribute to the sudden increase in >>>> size. I tried "borg list" on all archives, but only 7 have ~3 GB of >>>> deduplicated space, and all others have less than 1 GB of dedup space! >>>> I assumed 533 archives of ~1 GB dedup size = 533 GB total, >>> No, that is only the sum of the space ONLY used by a single archive. >>> >>> As soon as the same chunks are used by more than 1 archive, it does not >>> show up as "unique chunks" any more. >>> >>>> How would I find the archives that free most space when deleted? >>> For a single archive deletion, that is the unique chunks space >>> ("deduplicated size") of that archive. >>> >>> For multiple archive deletion there is no easy way to see beforehands. >> Would it be possible to somehow change this reporting in borg? I >> think I (possibly accidentally) backed up a few huge files for a few >> days, that now use up 50% of my archive space. Since the chunks are >> shared, I have no way of knowing which archives are the "bad guys". >> My only option seems to prune with a shotgun-approach until eventually >> I get lucky and free significant disk space. If I'm unlucky I can >> prune a lot before freeing any significant space... >> >> I think for example 'du' when used on hard links reports the shared >> disk usage on the first directory it encounters, and does not duplicate >> the size of hard links on subsequent directories. Would this be a sane >> behaviour for borg too? Or add a new field for "shared chunks size"? >> >> >> Thanks a lot for the help, and all the best, >> >> Mario Emmenlauer >> >> >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup > Viele Gruesse, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr.
43 mailto: memmenlauer * biodataanalysis.de D-81669 M?nchen http://www.biodataanalysis.de/ From jgoerzen at complete.org Thu Dec 22 09:44:57 2016 From: jgoerzen at complete.org (John Goerzen) Date: Thu, 22 Dec 2016 08:44:57 -0600 Subject: [Borgbackup] faster / better deletion, for a bounty? In-Reply-To: <87916a73-0200-780b-ee92-299f3264fed3@emmenlauer.de> References: <378ddfde-5e1a-9924-4256-2b9287c91eca@emmenlauer.de> <2e4ac1e3-6919-b2bf-b6de-9e381308d752@waldmann-edv.de> <4397537a-b0eb-09d1-a154-76cff2ccf792@emmenlauer.de> <87916a73-0200-780b-ee92-299f3264fed3@emmenlauer.de> Message-ID: <38a54745-2747-709d-ef2b-f2c0d58a4da6@complete.org> On 12/22/2016 03:14 AM, Mario Emmenlauer wrote: > What you say makes perfect sense for a single archive. But borg > reports also > numbers for "all archives", which I understood to be the numbers for the full > repository. Am I on the wrong track there? Because "all archives" is not the > sum of the individual archives, so I assumed its the repo. For the repo, > however, I think the dedup size should be equal to the disk size (except for > overheads like meta data, index, etc). Therefore I was surprised to see that > for me, its approx. 50% of disk usage. Ah, what you're saying there does seem to mesh with what's documented. You've got me then. I wonder, what does du -sh over your repo show? And is it any different if you add --apparent-size to du? John > > See here the output of borg list on one of my archives: > Number of files: 1796064 > Original size Compressed size Deduplicated size > This archive: 95.27 GB 70.53 GB 178.00 MB > All archives: 78.26 TB 65.13 TB 1.82 TB > Unique chunks Total chunks > Chunk index: 9733154 414693364 > > Cheers, > > Mario > > > >> How you count up space is a funny business when you have deduplication going >> on. Same when you have hard links in your filesystem. (du can say you've got >> 50GB in a directory, but you might find that rm -r on it only frees up 50K if >> there's a lot of hardlinks to other areas.) >> >> I think zfs might have a little clearer terminology on this: "referenced" is how >> much data is pointed to by a given snapshot, and "used" is how much space would >> be freed if only that one snapshot were deleted right now. That's like borg's >> archive size and dedup size. >> >> John >> >> >>> >>>>> (2) In the last months, my backup size went up quite a lot, even though >>>>> I did not change anything in borg. So I'd like to reverse engineer >>>>> which archives (or which files) contribute to the sudden increase in >>>>> size. I tried "borg list" on all archives, but only 7 have ~3 GB of >>>>> deduplicated space, and all others have less than 1 GB of dedup space! >>>>> I assumed 533 archives of ~1 GB dedup size = 533 GB total, >>>> No, that is only the sum of the space ONLY used by a single archive. >>>> >>>> As soon as the same chunks are used by more than 1 archive, it does not >>>> show up as "unique chunks" any more. >>>> >>>>> How would I find the archives that free most space when deleted? >>>> For a single archive deletion, that is the unique chunks space >>>> ("deduplicated size") of that archive. >>>> >>>> For multiple archive deletion there is no easy way to see beforehands. >>> Would it be possible to somehow change this reporting in borg? I >>> think I (possibly accidentally) backed up a few huge files for a few >>> days, that now use up 50% of my archive space. Since the chunks are >>> shared, I have no way of knowing which archives are the "bad guys". 
>>> My only option seems to prune with a shotgun-approach until eventually >>> I get lucky and free significant disk space. If I'm unlucky I can >>> prune a lot before freeing any significant space... >>> >>> I think for example 'du' when used on hard links reports the shared >>> disk usage on the first directory it encounters, and does not duplicate >>> the size of hard links on subsequent directories. Would this be a sane >>> behaviour for borg too? Or add a new field for "shared chunks size"? >>> >>> >>> Thanks a lot for the help, and all the best, >>> >>> Mario Emmenlauer >>> >>> >>> _______________________________________________ >>> Borgbackup mailing list >>> Borgbackup at python.org >>> https://mail.python.org/mailman/listinfo/borgbackup > > Viele Gruesse, > > Mario Emmenlauer > > > -- > BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 > Balanstr. 43 mailto: memmenlauer * biodataanalysis.de > D-81669 München http://www.biodataanalysis.de/ From mario at emmenlauer.de Fri Dec 23 17:53:24 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Fri, 23 Dec 2016 23:53:24 +0100 Subject: [Borgbackup] extended prune (was: faster / better deletion, for a bounty?) Message-ID: Dear All, it seems I am pretty lucky this year to have an early Christmas present. First, I found the large archives in my repo by pure chance and could free 50% of disk space with only a few deletions. Now the actual disk usage is at 1.8TB again, which matches borg's report of deduplicated size. Furthermore, it seems that those huge deletions were the only "slow" ones, because later I could prune another ~200 of ~500 archives in just little over 10 minutes, with borg 1.0.9. Finally, I seemed unable to get prune to do exactly what I hoped for. It might be me, but I did not find exactly the right combination of options. I take backups once per week, and if they are older than one year, I'd like to keep only every other week. In any case I was also curious to enable prune to handle a manual selection of archives, so I tried, and got it working pretty easily. I extended archive.py and helper.py with two new prune options --keep-list and --remove-list, where the former takes a list of archives to keep (all others are pruned) and the latter takes a list of archives to prune (all others are kept). My patch against borg 1.0.9 is available here https://github.com/emmenlau/borg/tree/emmenlau_better_prune and I'm happy to make a PR if anyone is interested (sorry for the bold name, it's really just a very minor extension to prune). Finally, thanks a lot again for the very nice borg! Your code was very easy to read, and I found very helpful compile instructions in the readme! This allowed me to get productive within a few minutes! Nice work! It would be awesome to add the pyinstaller instructions to the readme, but they were sufficiently easy to find in a GitHub issue report. Thanks, and happy holidays, Mario Emmenlauer From jgoerzen at complete.org Fri Dec 23 17:59:56 2016 From: jgoerzen at complete.org (John Goerzen) Date: Fri, 23 Dec 2016 16:59:56 -0600 Subject: [Borgbackup] extended prune (was: faster / better deletion, for a bounty?) In-Reply-To: References: Message-ID: Hi Mario, Just a couple quick comments: 1) Would those prune patches better be a 'borg delete' patch allowing specification of multiple archives to zap? 2) You might try a hostname-monthly- or hostname-weekly- pattern in your archive naming to let you achieve what you want with prune.
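For example (all names hypothetical):

    borg prune --prefix myhost-weekly- --keep-weekly 52 /path/to/repo
    borg prune --prefix myhost-monthly- --keep-monthly 24 /path/to/repo
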
John On 12/23/2016 04:53 PM, Mario Emmenlauer wrote: > Dear All, > > it seems I am pretty lucky this year to have an early Christmas > present. First, I found the large archives in my repo by pure > chance and could free 50% of disk space with only a few deletions. > Now the actual disk usage is at 1.8TB again, which matches borg's > report of deduplicated size. > > Furthermore, it seems that those huge deletions where the only > "slow" ones, because later I could prune another ~200 of ~500 > archives in just little over 10 minutes, with borg 1.0.9. > > Finally, I seemed unable to get prune do exactly what I hoped for. > It might be me, but I did not find exactly the right combination of > options. I take backups once per week, and if they are older than > one year, I'd like to keep only every other week. > In any case I was also curious to enable prune to handle a manual > selection of archives, so I tried, and got it working pretty easily. > I extended archive.py and helper.py with two new prune options > --keep-list and --remove-list, where the former takes a list of > archives to keep (all others are pruned) and the latter takes a > list of archives to prune (all others are kept). My patch against > borg 1.0.9 is available here > https://github.com/emmenlau/borg/tree/emmenlau_better_prune > and I'm happy to make a PR if anyone is interested (sorry for the > bold name, its really just a very minor extension to prune). > > > Finally, thanks a lot again for the very nice borg! Your code was > very easy to read, and I found very helpful compile instructions > in the readme! This allowed me to get productive within a few > minutes! Nice work! It would be awesome to add the pyinstaller > instructions to the readme, but they where sufficiently easy to > find in an github issue report. > > Thanks, and happy holidays, > > Mario Emmenlauer > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From mario at emmenlauer.de Fri Dec 23 18:21:00 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Sat, 24 Dec 2016 00:21:00 +0100 Subject: [Borgbackup] extended prune (was: faster / better deletion, for a bounty?) In-Reply-To: References: Message-ID: Hi, wow that was a quick reply! :-) Thanks! :-) On 23.12.2016 23:59, John Goerzen wrote: > Just a couple quick comments: > > 1) Would those prune patches better be a 'borg delete' patch allowing > specification of multiple archives to zap? I'm fully open to that. My logic was that prune and delete are separated by the fact that prune performs multiple deletions in one go, and so my patch would fit prune more than delete. But I'm really new to borg, and any advice is happily accepted! What do others think? Is my patch worthwhile at all? > 2) You might try a hostname-monthly- or hostname-weekly- pattern in your > archive naming to let you achieve what you want with prune. Yes, as long as I'd keep a similar pattern it's true. But I'm very fond of the new option because for my "huge" prune I was able to pick a crude mix of hand-picked archives together with different patterns for different times of different hosts. With some bash-foo that was less than ten minutes of work, and I could pass them all to prune in one go (hopefully making the best use of disk I/O). In fact it would be trivial to pick any other wild selection like the X largest archives or whatever, by combining borg's statistics with the new --remove-list option.
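For illustration, with the patched prune it could look about like this (the exact argument form may differ from my branch; a file of archive names is assumed here, and the grep pattern is invented):

    borg list --short /path/to/repo | grep '^oldhost-2014-' > remove.txt
    borg prune --remove-list remove.txt /path/to/repo
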
But really this is just me, and admittedly I did not invest too much time to try to understand prune's current behaviour :-) Oh and a related note: Thomas mentioned in IRC the idea that borg could use a garbage collection instead of immediate deletions. I very much cherish this idea because deletions could be instant, and disk space can be freed with "borg gc" whenever suitable (i.e. after a long repo re-organization with deletions, renames, new backups etc). Cheers, Mario > John > > On 12/23/2016 04:53 PM, Mario Emmenlauer wrote: >> Dear All, >> >> it seems I am pretty lucky this year to have an early Christmas >> present. First, I found the large archives in my repo by pure >> chance and could free 50% of disk space with only a few deletions. >> Now the actual disk usage is at 1.8TB again, which matches borg's >> report of deduplicated size. >> >> Furthermore, it seems that those huge deletions where the only >> "slow" ones, because later I could prune another ~200 of ~500 >> archives in just little over 10 minutes, with borg 1.0.9. >> >> Finally, I seemed unable to get prune do exactly what I hoped for. >> It might be me, but I did not find exactly the right combination of >> options. I take backups once per week, and if they are older than >> one year, I'd like to keep only every other week. >> In any case I was also curious to enable prune to handle a manual >> selection of archives, so I tried, and got it working pretty easily. >> I extended archive.py and helper.py with two new prune options >> --keep-list and --remove-list, where the former takes a list of >> archives to keep (all others are pruned) and the latter takes a >> list of archives to prune (all others are kept). My patch against >> borg 1.0.9 is available here >> https://github.com/emmenlau/borg/tree/emmenlau_better_prune >> and I'm happy to make a PR if anyone is interested (sorry for the >> bold name, its really just a very minor extension to prune). >> >> >> Finally, thanks a lot again for the very nice borg! Your code was >> very easy to read, and I found very helpful compile instructions >> in the readme! This allowed me to get productive within a few >> minutes! Nice work! It would be awesome to add the pyinstaller >> instructions to the readme, but they where sufficiently easy to >> find in an github issue report. >> >> Thanks, and happy holidays, >> >> Mario Emmenlauer Viele Gruesse, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/ From jdc at uwo.ca Fri Dec 23 19:33:23 2016 From: jdc at uwo.ca (Dan Christensen) Date: Fri, 23 Dec 2016 19:33:23 -0500 Subject: [Borgbackup] extended prune In-Reply-To: (Mario Emmenlauer's message of "Fri, 23 Dec 2016 23:53:24 +0100") References: Message-ID: <87y3z6kvxo.fsf@uwo.ca> On Dec 23, 2016, Mario Emmenlauer wrote: > Finally, I seemed unable to get prune do exactly what I hoped for. > It might be me, but I did not find exactly the right combination of > options. I take backups once per week, and if they are older than > one year, I'd like to keep only every other week. Would "--keep-within 1y --keep-monthly -1" do the trick? I think it would occasionally skip two weeklies, if three happened to fit into one month, but maybe that's close enough? You can test with "-n".
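That is, something like (repo path a placeholder):

    borg prune -n -v --keep-within 1y --keep-monthly -1 /path/to/repo
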
Dan From mario at emmenlauer.de Sat Dec 24 06:48:56 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Sat, 24 Dec 2016 12:48:56 +0100 Subject: [Borgbackup] extended prune In-Reply-To: <87y3z6kvxo.fsf@uwo.ca> References: <87y3z6kvxo.fsf@uwo.ca> Message-ID: Hi Dan, thanks for the reply! More below: On 24.12.2016 01:33, Dan Christensen wrote: > On Dec 23, 2016, Mario Emmenlauer wrote: > >> Finally, I seemed unable to get prune do exactly what I hoped for. >> It might be me, but I did not find exactly the right combination of >> options. I take backups once per week, and if they are older than >> one year, I'd like to keep only every other week. > > Would "--keep-within 1y --keep-monthly -1" do the trick? I think it > would occasionally skip two weeklies, if three happened to fit into one > month, but maybe that's close enough? You can test with "-n". It seems this will keep one monthly archive, is that possible? At least when I checked it seemed it would prune from 2015 everything except one monthly archive. I would prefer to keep two monthly archives. Even better, I would love to keep one archive per month for backups older than 24 months, two archives per month for backups 12 to 24 months old, and weekly archives for the past 12 months. And I have eight hosts in the same repository, so this pattern should apply per host. To make the best use of disk-IO, I'd like to combine all this into a single prune. With the new --remove-list option (and some bash-foo), this is really easy for me to achieve. Thanks and Cheers, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/ From public at enkore.de Sat Dec 24 06:53:01 2016 From: public at enkore.de (Marian Beermann) Date: Sat, 24 Dec 2016 12:53:01 +0100 Subject: [Borgbackup] extended prune (was: faster / better deletion, for a bounty?) In-Reply-To: References: Message-ID: <889a59c7-7fee-6b95-af84-3bd22d72ffc1@enkore.de> Hi Mario and John, Re. 1.) --remove-list feels a bit strange in prune -- usually it's run automatically, why would one need to always remove the same archive [name] over and over? I feel like this would be better suited to 'delete' with a syntax like borg delete repo::archive1 archive2 archive3 ... A bit awkward due to the whole :: thing (can't change that anymore), but ok I guess. What do you think? --keep-list on the other hand could make sense to preserve some important archives forever (without having to rename them outside the --prefix). I like it. Cheers, Marian On 23.12.2016 23:59, John Goerzen wrote: > Hi Mario, > > Just a couple quick comments: > > 1) Would those prune patches better be a 'borg delete' patch allowing > specification of multiple archives to zap? > > 2) You might try a hostname-monthly- or hostname-weekly- pattern in your > archive naming to let you achieve what you want with prune. > > John > > On 12/23/2016 04:53 PM, Mario Emmenlauer wrote: >> Dear All, >> >> it seems I am pretty lucky this year to have an early Christmas >> present. First, I found the large archives in my repo by pure >> chance and could free 50% of disk space with only a few deletions. >> Now the actual disk usage is at 1.8TB again, which matches borg's >> report of deduplicated size.
>> >> Furthermore, it seems that those huge deletions where the only >> "slow" ones, because later I could prune another ~200 of ~500 >> archives in just little over 10 minutes, with borg 1.0.9. >> >> Finally, I seemed unable to get prune do exactly what I hoped for. >> It might be me, but I did not find exactly the right combination of >> options. I take backups once per week, and if they are older than >> one year, I'd like to keep only every other week. >> In any case I was also curious to enable prune to handle a manual >> selection of archives, so I tried, and got it working pretty easily. >> I extended archive.py and helper.py with two new prune options >> --keep-list and --remove-list, where the former takes a list of >> archives to keep (all others are pruned) and the latter takes a >> list of archives to prune (all others are kept). My patch against >> borg 1.0.9 is available here >> https://github.com/emmenlau/borg/tree/emmenlau_better_prune >> and I'm happy to make a PR if anyone is interested (sorry for the >> bold name, its really just a very minor extension to prune). >> >> >> Finally, thanks a lot again for the very nice borg! Your code was >> very easy to read, and I found very helpful compile instructions >> in the readme! This allowed me to get productive within a few >> minutes! Nice work! It would be awesome to add the pyinstaller >> instructions to the readme, but they where sufficiently easy to >> find in an github issue report. >> >> Thanks, and happy holidays, >> >> Mario Emmenlauer >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > From jdc at uwo.ca Sat Dec 24 08:45:30 2016 From: jdc at uwo.ca (Dan Christensen) Date: Sat, 24 Dec 2016 08:45:30 -0500 Subject: [Borgbackup] extended prune In-Reply-To: (Mario Emmenlauer's message of "Sat, 24 Dec 2016 12:48:56 +0100") References: <87y3z6kvxo.fsf@uwo.ca> Message-ID: <87h95tl9tx.fsf@uwo.ca> On Dec 24, 2016, Mario Emmenlauer wrote: > On 24.12.2016 01:33, Dan Christensen wrote: > >> Would "--keep-within 1y --keep-monthly -1" do the trick? I think it >> would occasionally skip two weeklies, if three happened to fit into one >> month, but maybe that's close enough? You can test with "-n". > > It seems this will keep one monthly archive, is that possible? You are right. I don't think you can achieve what you want with the current pruning options. Dan From dastapov at gmail.com Sun Dec 25 17:27:59 2016 From: dastapov at gmail.com (Dmitry Astapov) Date: Sun, 25 Dec 2016 22:27:59 +0000 Subject: [Borgbackup] Why does borg delete/prune write a bunch of new data? In-Reply-To: References: Message-ID: I'm also shipping my backups off to S3, and I support this idea (I also now know why my S3 bills are constantly higher than I expect them to be :) On Tue, Dec 20, 2016 at 11:09 PM, John Goerzen wrote: > I've done some digging into this, and it seems the reason is > compact_segments() in repository.py. > > It both deletes the segments that are completely unused, and also (if > I'm understanding correctly), takes segments containing some objects > that are unused and some objects that are still used and writes new > segments containing only the used objects. > > The end result is some space savings, at the cost of a lot of I/O. 
I > wonder how hard it would be to support deleting unused segments without > bothering to rewrite segments that are partially used? > > thanks, > > John > > On 12/20/2016 08:28 AM, John Goerzen wrote: > > Hi folks, > > > > So I'm doing some testing of Borg. My ultimate aim is to rsync the > > backups to a dumb (WebDAV or S3-type) host. > > > > I made a run of borg over a real subset of my data, about 80GB worth. > > I then cleaned up and deleted a good chunk of data throughout that > > area, and made another archive with borg create. > > > > So far so good. Now I ran borg delete to remove the archive with all > > the extra data. Sure enough, about 2GB freed up on the disk after. > > > > However, watching the process with strace and examining the > > filesystem, I observed it wrote a considerable amount of new segments > > to the data directory. A little analysis with ls and du shows it > > wrote right around 2GB of new segments. (It also, of course, unlinked > > a considerable number of segments.) > > > > Having to rsync 2GB of new data every time I delete data is going to > > be rather sub-optimal on my poor DSL. Any ideas why it's doing this? > > FWIW the index file is only a few tens of MBs. > > > > I'm using encryption and lzma compression. I did double the > > max_segment_size from 5MB to 10MB (a lot of experience with obnam > > suggested this would improve the performance over the rsync situation) > > > > Thanks, > > > > John > > > > _______________________________________________ > > Borgbackup mailing list > > Borgbackup at python.org > > https://mail.python.org/mailman/listinfo/borgbackup > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > -- Dmitry Astapov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jgoerzen at complete.org Sun Dec 25 17:35:08 2016 From: jgoerzen at complete.org (John Goerzen) Date: Sun, 25 Dec 2016 16:35:08 -0600 Subject: [Borgbackup] Why does borg delete/prune write a bunch of new data? In-Reply-To: References: Message-ID: <7b2c9016-d9a6-322b-1b5b-853ab83785e1@complete.org> Out of curiosity, what tool are you using to send them to S3? On 12/25/2016 04:27 PM, Dmitry Astapov wrote: > I'm also shipping my backups off to S3, and I support this idea (I > also now know why my S3 bills are constantly higher than I expect them > to be :) > > On Tue, Dec 20, 2016 at 11:09 PM, John Goerzen > wrote: > > I've done some digging into this, and it seems the reason is > compact_segments() in repository.py. > > It both deletes the segments that are completely unused, and also (if > I'm understanding correctly), takes segments containing some objects > that are unused and some objects that are still used and writes new > segments containing only the used objects. > > The end result is some space savings, at the cost of a lot of I/O. I > wonder how hard it would be to support deleting unused segments > without > bothering to rewrite segments that are partially used? > > thanks, > > John > > On 12/20/2016 08:28 AM, John Goerzen wrote: > > Hi folks, > > > > So I'm doing some testing of Borg. My ultimate aim is to rsync the > > backups to a dumb (WebDAV or S3-type) host. > > > > I made a run of borg over a real subset of my data, about 80GB > worth. > > I then cleaned up and deleted a good chunk of data throughout that > > area, and made another archive with borg create. > > > > So far so good.
Now I ran borg delete to remove the archive > with all > > the extra data. Sure enough, about 2GB freed up on the disk after. > > > > However, watching the process with strace and examining the > > filesystem, I observed it wrote a considerable amount of new > segments > > to the data directory. A little analysis with ls and du shows it > > wrote right around 2GB of new segments. (It also, of course, > unlinked > > a considerable number of segments.) > > > > Having to rsync 2GB of new data every time I delete data is going to > > be rather sub-optimal on my poor DSL. Any ideas why it's doing > this? > > FWIW the index file is only a few tens of MBs. > > > > I'm using encryption and lzma compression. I did double the > > max_segment_size from 5MB to 10MB (a lot of experience with obnam > > suggested this would improve the performance over the rsync > situation) > > > > Thanks, > > > > John > > > > _______________________________________________ > > Borgbackup mailing list > > Borgbackup at python.org > > https://mail.python.org/mailman/listinfo/borgbackup > > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > > > > > > -- > Dmitry Astapov -------------- next part -------------- An HTML attachment was scrubbed... URL: From dastapov at gmail.com Sun Dec 25 17:36:08 2016 From: dastapov at gmail.com (Dmitry Astapov) Date: Sun, 25 Dec 2016 22:36:08 +0000 Subject: [Borgbackup] Why does borg delete/prune write a bunch of new data? In-Reply-To: <7b2c9016-d9a6-322b-1b5b-853ab83785e1@complete.org> References: <7b2c9016-d9a6-322b-1b5b-853ab83785e1@complete.org> Message-ID: "s3cmd sync" from s3tools.org On Sun, Dec 25, 2016 at 10:35 PM, John Goerzen wrote: > Out of curiosity, what tool are you using to send them to S3? > > > On 12/25/2016 04:27 PM, Dmitry Astapov wrote: > > I'm also shipping my backups off to S3, and I support this idea (I also > now know why my S3 bills are constantly higher than I expect them to be :) > > On Tue, Dec 20, 2016 at 11:09 PM, John Goerzen > wrote: > >> I've done some digging into this, and it seems the reason is >> compact_segments() in repository.py. >> >> It both deletes the segments that are completely unused, and also (if >> I'm understanding correctly), takes segments containing some objects >> that are unused and some objects that are still used and writes new >> segments containing only the used objects. >> >> The end result is some space savings, at the cost of a lot of I/O. I >> wonder how hard it would be to support deleting unused segments without >> bothering to rewrite segments that are partially used? >> >> thanks, >> >> John >> >> On 12/20/2016 08:28 AM, John Goerzen wrote: >> > Hi folks, >> > >> > So I'm doing some testing of Borg. My ultimate aim is to rsync the >> > backups to a dumb (WebDAV or S3-type) host. >> > >> > I made a run of borg over a real subset of my data, about 80GB worth. >> > I then cleaned up and deleted a good chunk of data throughout that >> > area, and made another archive with borg create. >> > >> > So far so good. Now I ran borg delete to remove the archive with all >> > the extra data. Sure enough, about 2GB freed up on the disk after. >> > >> > However, watching the process with strace and examining the >> > filesystem, I observed it wrote a considerable amount of new segments >> > to the data directory. A little analysis with ls and du shows it >> > wrote right around 2GB of new segments.
(It also, of course, unlinked >> > a considerable number of segments.) >> > >> > Having to rsync 2GB of new data every time I delete data is going to >> > be rather sub-optimal on my poor DSL. Any ideas why it's doing this? >> > FWIW the index file is only a few tens of MBs. >> > >> > I'm using encryption and lzma compression. I did double the >> > max_segment_size from 5MB to 10MB (a lot of experience with obnam >> > suggested this would improve the performance over the rsync situation) >> > >> > Thanks, >> > >> > John >> > >> > _______________________________________________ >> > Borgbackup mailing list >> > Borgbackup at python.org >> > https://mail.python.org/mailman/listinfo/borgbackup >> >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup >> > > > > -- > Dmitry Astapov > > > -- Dmitry Astapov -------------- next part -------------- An HTML attachment was scrubbed... URL: From borg at picturenow.co.uk Thu Dec 29 09:45:21 2016 From: borg at picturenow.co.uk (Iain Mac Donald) Date: Thu, 29 Dec 2016 14:45:21 +0000 Subject: [Borgbackup] Borg & Maildir Message-ID: <20161229144521.4efe9a85@flora.coachhouse> I have been running Borg for a week for our home server & family laptops and everything is working just great. The one problem I foresee is restoring deleted emails. Emails are stored in Maildir format, filenames aren't seen by users, many folders have high daily activity and some folders contain 10s of thousands of files. Identifying which file to restore would be very difficult. The only way I can think of doing it is by restoring a whole folder to a temporary location accessible to the IMAP server and then using an IMAP client to identify the deleted message and copying it back to the desired folder. Another approach might be doing text string searches on the Borg backups, within date ranges, but I'm not sure how to do that. I realise this is a bit off-topic but I was hoping someone on the list might have a better solution to this problem. Regards, Iain. From sitaramc at gmail.com Thu Dec 29 10:42:54 2016 From: sitaramc at gmail.com (Sitaram Chamarty) Date: Thu, 29 Dec 2016 21:12:54 +0530 Subject: [Borgbackup] Borg & Maildir In-Reply-To: <20161229144521.4efe9a85@flora.coachhouse> References: <20161229144521.4efe9a85@flora.coachhouse> Message-ID: <20161229154254.GB22090@sita-lt.atc.tcs.com> On Thu, Dec 29, 2016 at 02:45:21PM +0000, Iain Mac Donald wrote: > > I have been running Borg for a week for our home server & family > laptops and everything is working just great. > > The one problem I foresee is restoring deleted emails. Emails are > stored in Maildir format, filenames aren't seen by users, many > folders have high daily activity and some folders contain 10s of > thousands of files. Identifying which file to restore would be very > difficult. > > The only way I can think of doing it is by restoring a whole folder to > a temporary location accessible to the IMAP server and then using an > IMAP client to identify the deleted message and copying it back to the > desired folder. Another approach might be doing text string searches on > the Borg backups, within date ranges, but I'm not sure how to do that. > > I realise this is a bit off-topic but I was hoping someone on the list > might have a better solution to this problem. Just use a mail client that can detect (and delete) duplicate mails. 
Once you have that, do what you said above -- i.e., restore the whole folder to a temp location -- then copy all mails from there to the main one. When that is done, delete duplicates. Thunderbird has an extension for this. In mutt, just use the pattern '~=' and delete all mails with that pattern. I'm sure other mail clients have something. The big downside to this is when you really only need a few mails but the maildir has tons of them: you'd be processing all of those mails just to grab one. Playing around with 'formail' and message-IDs may also help. regards sitaram From borg at picturenow.co.uk Thu Dec 29 11:22:24 2016 From: borg at picturenow.co.uk (Iain Mac Donald) Date: Thu, 29 Dec 2016 16:22:24 +0000 Subject: [Borgbackup] Borg & Maildir In-Reply-To: <20161229154254.GB22090@sita-lt.atc.tcs.com> References: <20161229144521.4efe9a85@flora.coachhouse> <20161229154254.GB22090@sita-lt.atc.tcs.com> Message-ID: <20161229162224.527f73d8@flora.coachhouse> On Thu, 29 Dec 2016 21:12:54 +0530 Sitaram Chamarty wrote: > Just use a mail client that can detect (and delete) duplicate > mails. We all use Claws Mail, which does have a "delete duplicate emails" tool. I'll give it a try on a test folder and see how it goes. Not sure I'd like to try that on, for example, my Sent folder, which has more than 50,000 emails (and is a few gigs in size). Ironically, a catastrophic failure, disc failure for instance, is probably easier to deal with than restoring a few emails. The latter is the more common occurrence in my experience. Thanks for the suggestion. Regards, Iain. From third07 at gmail.com Thu Dec 29 11:49:59 2016 From: third07 at gmail.com (Ed F.) Date: Thu, 29 Dec 2016 10:49:59 -0600 Subject: [Borgbackup] Borg & Maildir In-Reply-To: <20161229162224.527f73d8@flora.coachhouse> References: <20161229144521.4efe9a85@flora.coachhouse> <20161229154254.GB22090@sita-lt.atc.tcs.com> <20161229162224.527f73d8@flora.coachhouse> Message-ID: On Thu, Dec 29, 2016 at 10:22 AM, Iain Mac Donald wrote: > Ironically, a catastrophic failure, disc failure for instance, is > probably easier to deal with than restoring a few emails. The latter > is the more common occurrence in my experience. Use borg to FUSE mount an archive, and then use rsync with suitable options to restore any updated files. Ed From mario at emmenlauer.de Thu Dec 29 13:08:53 2016 From: mario at emmenlauer.de (Mario Emmenlauer) Date: Thu, 29 Dec 2016 19:08:53 +0100 Subject: [Borgbackup] Borg & Maildir In-Reply-To: References: <20161229144521.4efe9a85@flora.coachhouse> <20161229154254.GB22090@sita-lt.atc.tcs.com> <20161229162224.527f73d8@flora.coachhouse> Message-ID: <72b44801-5507-75ae-fc19-2cacabb5b0ef@emmenlauer.de> Dear Iain, On 29.12.2016 17:49, Ed F. wrote: > On Thu, Dec 29, 2016 at 10:22 AM, Iain Mac Donald wrote: > >> Ironically, a catastrophic failure, disc failure for instance, is >> probably easier to deal with than restoring a few emails. The latter >> is the more common occurrence in my experience. > > Use borg to FUSE mount an archive, and then use rsync with suitable > options to restore any updated files. I think Ed's suggestion is the most suitable one. After you fuse-mount the backup, you can rsync it to the mail server (if it's not the same machine), while preserving all time stamps etc. Then it should be fairly easy to use rsync with --dry-run to perform a "dummy-sync" of the backup with the current folder. This will list all files that were deleted, and ignore all others. This way you cheaply get a list of the deleted emails.
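For illustration, a minimal sketch of that dry-run comparison (the repo path, archive name and Maildir locations are made-up examples):

# mount the backup archive read-only via FUSE
borg mount /path/to/repo::home-2016-12-28 /mnt/borg
# -n is a dry run: list files that exist in the backup but are missing
# from the live Maildir (i.e. the deleted mails); --ignore-existing makes
# rsync skip everything that is still present
rsync -anv --ignore-existing /mnt/borg/home/user/Maildir/ /home/user/Maildir/
# unmount when done
fusermount -u /mnt/borg

Dropping the -n would actually copy those missing files back into place.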
With some luck this list is a lot shorter than your 50,000 emails, and you can more easily nail it down to the ones you want to restore... Best regards, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/ From borg at picturenow.co.uk Thu Dec 29 13:48:29 2016 From: borg at picturenow.co.uk (Iain Mac Donald) Date: Thu, 29 Dec 2016 18:48:29 +0000 Subject: [Borgbackup] Borg & Maildir In-Reply-To: <72b44801-5507-75ae-fc19-2cacabb5b0ef@emmenlauer.de> References: <20161229144521.4efe9a85@flora.coachhouse> <20161229154254.GB22090@sita-lt.atc.tcs.com> <20161229162224.527f73d8@flora.coachhouse> <72b44801-5507-75ae-fc19-2cacabb5b0ef@emmenlauer.de> Message-ID: <20161229184829.1e890c65@flora.coachhouse> On Thu, 29 Dec 2016 19:08:53 +0100 Mario Emmenlauer wrote: > I think Ed's suggestion is the most suitable one. After you fuse-mount > the backup, you can rsync it to the mail server (if it's not the same > machine), while preserving all time stamps etc. Ed & Mario, Thanks, guys! I think that sounds like a plan for dealing with bigger folders. The mail server and the backup server are the same machine, which probably makes things much easier. I'll do a test run in the next few days. Regards, Iain. From tve at voneicken.com Sat Dec 31 15:41:59 2016 From: tve at voneicken.com (Thorsten von Eicken) Date: Sat, 31 Dec 2016 20:41:59 +0000 Subject: [Borgbackup] extract pattern never matched Message-ID: <01000159569e1cad-aef532ce-1ab3-46ed-9b8e-b2255a3252b8-000000@email.amazonses.com> How are the extract patterns supposed to work? I can get the sh patterns to work, but not the fm ones. Example: # borg list backup at backup:/big/h/home::home-2016-08-31T03:18-0700 | egrep big/home/weather/home/tve/relay drwxrwxr-x tve tve 0 Sun, 2016-07-24 22:22:53 big/home/weather/home/tve/relay -rw-rw-r-- tve tve 284 Mon, 2013-12-02 23:44:04 big/home/weather/home/tve/relay/aprs2-servers -rw-r--r-- root root 5256288 Mon, 2016-02-01 00:22:27 big/home/weather/home/tve/relay/cwop.log.1.xz -rwxrwxr-x tve tve 34 Sun, 2014-02-23 15:48:26 big/home/weather/home/tve/relay/doit -rw-rw-r-- tve tve 154513244 Sun, 2015-04-26 16:40:49 big/home/weather/home/tve/relay/nohup.out.1.xz -rw------- tve tve 17961084 Sun, 2015-06-14 10:34:13 big/home/weather/home/tve/relay/nohup.out.2.xz -rw------- tve tve 13316988 Sat, 2015-07-18 17:24:47 big/home/weather/home/tve/relay/nohup.out.3.xz -rw------- tve tve 15975004 Sat, 2015-10-10 22:15:35 big/home/weather/home/tve/relay/nohup.out.4.xz -rw------- tve tve 15338236 Sat, 2015-10-10 22:15:37 big/home/weather/home/tve/relay/nohup.out.5.xz -rw------- tve tve 16072328 Tue, 2015-11-24 10:57:43 big/home/weather/home/tve/relay/nohup.out.6.xz -rw------- tve tve 11614156 Sun, 2015-12-27 15:31:29 big/home/weather/home/tve/relay/nohup.out.7.xz -rw------- tve tve 17320288 Thu, 2016-02-18 21:20:05 big/home/weather/home/tve/relay/nohup.out.8.xz -rw-rw-r-- tve tve 26109380 Sun, 2016-07-24 22:18:42 big/home/weather/home/tve/relay/nohup.out.9.xz -rwxrwxr-x tve tve 3227 Mon, 2014-11-17 21:57:51 big/home/weather/home/tve/relay/relay.rb Now a specific extract of one file, which works: # borg extract --strip-components 6 backup at backup:/big/h/home::home-2016-08-31T03:18-0700 big/home/weather/home/tve/relay/nohup.out.9.xz # ls -ls nohup.out.9.xz 25500 -rw-rw-r-- 1 tve tve 26109380 Jul 24 22:18 nohup.out.9.xz But a pattern extract fails -- why?
# borg extract --strip-components 6 backup at backup:/big/h/home::home-2016-08-31T03:18-0700 'big/home/weather/home/tve/relay/nohup*' Include pattern 'big/home/weather/home/tve/relay/nohup*' never matched. The same pattern, given as a shell pattern, works: # borg extract --strip-components 6 backup at backup:/big/h/home::home-2016-08-31T03:18-0700 'sh:big/home/weather/home/tve/relay/nohup*' # ls -ls nohup.out.9.xz 25500 -rw-rw-r-- 1 tve tve 26109380 Jul 24 22:18 nohup.out.9.xz Borg client: # borg debug-info Platform: Linux h 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 Linux: debian stretch/sid Borg: 1.0.8 Python: CPython 3.5.2 PID: 18886 CWD: /tmp/relay sys.argv: ['borg', 'debug-info'] SSH_ORIGINAL_COMMAND: None Borg server: # borg debug-info Platform: Linux backup 3.10.104-3-ARCH #1 SMP PREEMPT Mon Nov 14 18:37:24 MST 2016 armv7l Linux: arch Borg: 1.0.8 Python: CPython 3.5.2 PID: 15160 CWD: /home/tve sys.argv: ['/usr/bin/borg', 'debug-info'] SSH_ORIGINAL_COMMAND: None
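One plausible explanation - this is an assumption about borg 1.0's PATH handling, not something verified against the 1.0.8 source: without a style selector, a PATH argument to borg extract may be taken as a literal path, so the '*' is never treated as a wildcard; only an explicit selector such as sh: (and presumably fm: via the same parser) makes borg interpret the argument as a pattern. Under that assumption, naming the fnmatch style explicitly should behave like the working sh: example above (the fm: spelling for extract is hypothetical here):

# borg extract --strip-components 6 backup at backup:/big/h/home::home-2016-08-31T03:18-0700 'fm:big/home/weather/home/tve/relay/nohup*'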