From lazyvirus at gmx.com Tue Jul 7 20:05:09 2020 From: lazyvirus at gmx.com (Bzzzz) Date: Wed, 8 Jul 2020 02:05:09 +0200 Subject: [Borgbackup] Repo init failure on a NFS directory with a raspberry pi 4 [CLOSED] In-Reply-To: <20200630155451.11f13b67@msi.defcon1.lan> References: <20200630061921.571354a1@msi.defcon1.lan> <20200630155451.11f13b67@msi.defcon1.lan> Message-ID: <20200708020509.3aaf09b1@msi.defcon1.lan> On Tue, 30 Jun 2020 15:54:51 +0200 Bzzzz wrote: As a matter of fact, I had several problems w/ Raspberry OS using a 64 bit kernel and made a reinstall in pure 32 bit - I also installed the borgbackup Debian R.OS package (1.1.9-2) instead of trying to install 1.1.13 from pip3 and now, it works - I can now backup my Pi4 :) Jean-Yves > On Tue, 30 Jun 2020 15:40:53 +0200 > Thomas Waldmann wrote: > > > > "/usr/local/lib/python3.7/dist-packages/borg/crypto/file_integrity.py", > > > line 75 in write > > > Bus Error > > > > That is likely an unaligned access. > > > > This occurs when not running ARM in "fixup" mode, then it just blows > > up when doing an unaligned access. > > What do you call an unaligned access? Do you speak about memory > alignment on CPU word width? > > > In "fixup" mode, a software exception handler would just make the > > access work as expected. > > > > I heard that on raspi, the 32bit kernel usually runs in fixup mode, > > but the 64bit kernel does not (which i would say is an obviously bad > > idea). > > > > IIRC: > > > > - there is an issue about that in our github issue tracker > > Ok, I wasn't sure it was exactly the same issue. > > > - the unaligned access is not in borg code, but in a library / 3rd > > party code borg uses > > I go back to read the whole ticket. 
> > JY
> _______________________________________________
> Borgbackup mailing list
> Borgbackup at python.org
> https://mail.python.org/mailman/listinfo/borgbackup

From krause at biochem2.uni-frankfurt.de Wed Jul 8 04:45:02 2020
From: krause at biochem2.uni-frankfurt.de (David Krause)
Date: Wed, 8 Jul 2020 10:45:02 +0200
Subject: [Borgbackup] 'borg list' on bigger repo runs for days.
In-Reply-To: 
References: 
Message-ID: <034dfa50-1934-b6ed-5cab-4c681858a1a8@biochem2.uni-frankfurt.de>

Noticed something else: When I launch borg list with

borg list --last 1 -p /path/to/repo

I see that it is stuck at:

Replaying segments 0%

Mit freundlichen Grüßen / With kind regards

David Krause
IT
Institute of Biochemistry II

Gustav Embden-Zentrum der Biochemie
Goethe University Frankfurt - Medical Faculty

University Hospital - Building 75
Theodor-Stern-Kai 7
60590 Frankfurt am Main
Germany

+49 (0)69 6301 84971 (Phone)
+49 (0)69 6301 84975 (Fax)
krause at biochem2.uni-frankfurt.de

On 08.07.2020 at 10:14, David Krause wrote:
>
> Dear List,
>
>
> We daily backup a storage of about 20tb into a repository which now
> has about 1000tb worth of backup thanks to deduplication.
> Everything was fine all the time, we could restore files from the
> backup, it was okayish fast (borg list on this repo took a few minutes),
> but a few days ago, we again lost a file from the storage and needed
> to restore it from the backup.
>
> During this time, a borg prune was running. I've *canceled it* (by
> sending kill to the process).
> Now the `borg list` took about 2 full days, and still not finished.
> When i check with iotop i see the python/borg process with 40mb/s
> (~100% of this storage capabilities) running all the time.
>
> Any idea what i could do?
> Do you have any explanation why this takes so long?
> Or any tip to fasten this up when this is normal behavior?
> > This is ubuntu's borg 1.1.5
>
> --
> Mit freundlichen Grüßen / With kind regards
>
> David Krause
> IT
> Institute of Biochemistry II
>
> Gustav Embden-Zentrum der Biochemie
> Goethe University Frankfurt - Medical Faculty
>
> University Hospital - Building 75
> Theodor-Stern-Kai 7
> 60590 Frankfurt am Main
> Germany
>
> +49 (0)69 6301 84971 (Phone)
> +49 (0)69 6301 84975 (Fax)
> krause at biochem2.uni-frankfurt.de
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From krause at biochem2.uni-frankfurt.de Wed Jul 8 04:14:48 2020
From: krause at biochem2.uni-frankfurt.de (David Krause)
Date: Wed, 8 Jul 2020 10:14:48 +0200
Subject: [Borgbackup] 'borg list' on bigger repo runs for days.
Message-ID: 

Dear List,

We daily backup a storage of about 20tb into a repository which now has
about 1000tb worth of backup thanks to deduplication.
Everything was fine all the time, we could restore files from the backup,
it was okayish fast (borg list on this repo took a few minutes),
but a few days ago, we again lost a file from the storage and needed to
restore it from the backup.

During this time, a borg prune was running. I've *canceled it* (by
sending kill to the process).
Now the `borg list` took about 2 full days, and still not finished.
When i check with iotop i see the python/borg process with 40mb/s (~100%
of this storage capabilities) running all the time.

Any idea what i could do?
Do you have any explanation why this takes so long?
Or any tip to fasten this up when this is normal behavior?

This is ubuntu's borg 1.1.5

--
Mit freundlichen Grüßen / With kind regards

David Krause
IT
Institute of Biochemistry II

Gustav Embden-Zentrum der Biochemie
Goethe University Frankfurt - Medical Faculty

University Hospital - Building 75
Theodor-Stern-Kai 7
60590 Frankfurt am Main
Germany

+49 (0)69 6301 84971 (Phone)
+49 (0)69 6301 84975 (Fax)
krause at biochem2.uni-frankfurt.de
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dastapov at gmail.com Wed Jul 8 05:15:29 2020
From: dastapov at gmail.com (Dmitry Astapov)
Date: Wed, 8 Jul 2020 10:15:29 +0100
Subject: [Borgbackup] 'borg list' on bigger repo runs for days.
In-Reply-To: 
References: 
Message-ID: 

I don't remember borg internals well anymore, but from past experience, I
could hazard a guess: Prune is a repo-changing operation, so when it was
interrupted it potentially left the repo in an inconsistent state, and
borg is now trying to restore consistency by replaying all the uncorrupted
indices and hints since the beginning of time: see
https://borgbackup.readthedocs.io/en/stable/internals/data-structures.html#repository,
section "Index, hints and integrity": "If the index or hints are
corrupted, they are regenerated automatically. If they are outdated,
segments are replayed from the index state to the currently committed
transaction"

Now, whether this means that borg intends to read 1000T at 40mb/s or not -
I am not sure. You can try adding "--debug" to your list command and see
if it adds any clarity.

On Wed, Jul 8, 2020 at 10:02 AM David Krause <krause at biochem2.uni-frankfurt.de> wrote:

> Dear List,
>
>
> We daily backup a storage of about 20tb into a repository which now has
> about 1000tb worth of backup thanks to deduplication.
> Everything was fine all the time, we could restore files from the backup,
> it was okayish fast (borg list on this repo took a few minutes),
> but a few days ago, we again lost a file from the storage and needed to
> restore it from the backup.
>
> During this time, a borg prune was running. I've *canceled it* (by
> sending kill to the process).
> Now the `borg list` took about 2 full days, and still not finished.
> When i check with iotop i see the python/borg process with 40mb/s (~100%
> of this storage capabilities) running all the time.
>
> Any idea what i could do?
> Do you have any explanation why this takes so long?
> Or any tip to fasten this up when this is normal behavior?
>
> This is ubuntu's borg 1.1.5
>
> --
> Mit freundlichen Grüßen / With kind regards
>
> David Krause
> IT
> Institute of Biochemistry II
>
> Gustav Embden-Zentrum der Biochemie
> Goethe University Frankfurt - Medical Faculty
>
> University Hospital - Building 75
> Theodor-Stern-Kai 7
> 60590 Frankfurt am Main
> Germany
>
> +49 (0)69 6301 84971 (Phone)
> +49 (0)69 6301 84975 (Fax)
> krause at biochem2.uni-frankfurt.de
>
> _______________________________________________
> Borgbackup mailing list
> Borgbackup at python.org
> https://mail.python.org/mailman/listinfo/borgbackup
>

--
D. Astapov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tw at waldmann-edv.de Wed Jul 8 06:12:43 2020
From: tw at waldmann-edv.de (Thomas Waldmann)
Date: Wed, 8 Jul 2020 12:12:43 +0200
Subject: [Borgbackup] 'borg list' on bigger repo runs for days.
In-Reply-To: 
References: 
Message-ID: <7e7947a8-e6ad-38c2-c963-8b5a3ab135e4@waldmann-edv.de>

> This is ubuntu's borg 1.1.5

Please reproduce with borg 1.1.13.

There's a PPA for ubuntu from the package maintainer with recent borg.

--
GPG ID: 9F88FB52FAF7B393
GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393

From krause at biochem2.uni-frankfurt.de Wed Jul 8 07:11:48 2020
From: krause at biochem2.uni-frankfurt.de (David Krause)
Date: Wed, 8 Jul 2020 13:11:48 +0200
Subject: [Borgbackup] 'borg list' on bigger repo runs for days.
In-Reply-To: <7e7947a8-e6ad-38c2-c963-8b5a3ab135e4@waldmann-edv.de>
References: <7e7947a8-e6ad-38c2-c963-8b5a3ab135e4@waldmann-edv.de>
Message-ID: <96fe1137-a11b-8694-b35f-420d728f99fa@biochem2.uni-frankfurt.de>

On 08.07.2020 at 11:15, Dmitry Astapov wrote:
>
>
> Now, whether this means that borg intends to read 1000T at 40mb/s or
> not - I am not sure. You can try adding "--debug" to your list command
> and see if it adds any clarity.
>
hopefully not..
I'll cancel the current job and try "--debug"

On 08.07.2020 at 12:12, Thomas Waldmann wrote:
>> This is ubuntu's borg 1.1.5
> Please reproduce with borg 1.1.13.
>
> There's a PPA for ubuntu from the package maintainer with recent borg.
>

I now try the same with the newest stable from the site.

borg-linux64 1.1.13

root at backup1:~# ./borg-linux64 -p --debug list /mnt/backups/backup_CommonStorage/
using builtin fallback logging configuration
35 self tests completed in 0.08 seconds
Replaying segments 0%

From grumpy at mailfence.com Thu Jul 16 12:52:49 2020
From: grumpy at mailfence.com (grumpy at mailfence.com)
Date: Thu, 16 Jul 2020 18:52:49 +0200 (CEST)
Subject: [Borgbackup] how does dedupe work
Message-ID: 

i'm pretty new to borg so try to look around my ignorance
suppose i backup several machines to the same repository and these
machines have the same os
each machine will have a copy of many of the same files
does dedupe remove all of the copies

From clickwir at gmail.com Thu Jul 16 14:04:18 2020
From: clickwir at gmail.com (Zack Coffey)
Date: Thu, 16 Jul 2020 12:04:18 -0600
Subject: [Borgbackup] how does dedupe work
In-Reply-To: 
References: 
Message-ID: 

I'll take a stab at this answer. Please correct me if I'm wrong.

Yes, is the short answer. It will dedupe all of that. (Disclaimer, I've
not tried it)

However, when sharing repos there are caveats. Biggest I know of is the
cache needs to be rebuilt every time borgbackup is run. This will make
backups take much longer and I don't think you can run them at the same
time. Pretty sure I remember seeing someone say you have to make sure they
don't access the repo at the same time. So it will take more coordination
as well.

So it's a tradeoff, time vs space.

Dedupe in borgbackup is great! I'm saving tons of space. But each machine
has its own repo.
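[The short answer above can be illustrated with a toy model in plain shell: deduplication stores each chunk once, keyed by its content, so identical OS files from different machines land in the repo only once. This is an illustration only, not borg's actual chunker; file names and "chunk" contents are made up.]

```shell
# Toy model of content-based dedup (illustration only, not borg's chunker).
# Two "machines" whose chunk lists overlap in the shared OS files:
workdir=$(mktemp -d)
printf '%s\n' "/bin/sh contents" "libc blob" "host-a config" > "$workdir/host_a"
printf '%s\n' "/bin/sh contents" "libc blob" "host-b config" > "$workdir/host_b"
# A dedup store keeps one copy per unique chunk:
stored=$(cat "$workdir/host_a" "$workdir/host_b" | sort -u | wc -l | tr -d ' ')
echo "$stored unique chunks stored out of 6"   # the two shared chunks exist once
```

Of the six "chunks" written, only four are stored; the two the hosts share are kept once and referenced twice.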
On Thu, Jul 16, 2020 at 11:02 AM grumpy--- via Borgbackup < borgbackup at python.org> wrote: > i'm pretty new to borg so try to look around my ignorance > suppose i backup several machines to the same repository and these > machines have the same os > each machine will have a copy of many of the same files > does dedupe remove all of the copies > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tw at waldmann-edv.de Thu Jul 16 14:42:39 2020 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Thu, 16 Jul 2020 20:42:39 +0200 Subject: [Borgbackup] how does dedupe work In-Reply-To: References: Message-ID: <080ea658-e210-4c39-1499-c666f0b130df@waldmann-edv.de> > Yes, is the short answer. It will dedupe all of that. Correct. There is: - dedup within a data set - dedup between hosts sharing a repo - historical dedup > However, when sharing repos there are caveats. Biggest I know of is the > cache needs to be rebuilt every time borgbackup is run. s/every time/every time switching the host/ if you run multiple backups (of different data sets) one after the other on same hosts, it won't have to rebuild the cache. but as soon as another hosts starts to back up to same repo, it will find its own cache not being consistent with the repo state any more and will rebuild it. > This will make backups take much longer It will take longer due to the rebuild. How much longer it is depends on some factors (e.g. archive count and whether you let it cache to chunks.archives.d). > and I don't think you can run them at the same time. Correct. > Pretty sure I remember seeing someone say you have to make sure > they don't access the repo at the same time. No, that is not a problem. borg will lock the repo, so nothing bad can happen. 
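[The repo lock Thomas mentions is an ordinary exclusive lock; here is a generic sketch of the two ways a second client can react to a held lock (give up at once, or wait for it), using Linux's flock(1) as a stand-in for borg's own lock code. The lock file and timings are made up for the demo.]

```shell
# Generic exclusive-lock sketch (flock(1) stands in for borg's repo lock).
lock=$(mktemp)
# First "client" takes the lock and holds it for 2 seconds:
( flock -n 9 && sleep 2 ) 9>"$lock" &
sleep 0.3
# Second client, fail-fast: gives up immediately while the lock is held.
if flock -n "$lock" true; then second=free; else second=busy; fi
# Second client, queued: waits (up to 10s here) until the holder releases.
flock -w 10 "$lock" true && third=acquired || third=timeout
wait
echo "fail-fast attempt: $second; queued attempt: $third"
```

The fail-fast attempt reports busy, while the queued attempt succeeds as soon as the first client's two seconds are up.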
You can have borg immediately abort when encountering a locked repo (default) or you can have multiple borg queuing up (use a looooong --lock-timeout). > So it's a tradeoff, time vs space. Yup, as ever. Same goes for the chunks.archives.d. > Dedupe in borgbackup is great! I'm saving tons of space. But each > machine has its own repo. That's definitely quicker. And also a bit safer regarding crypto (AES counter mode). -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From grumpy at mailfence.com Thu Jul 16 21:16:24 2020 From: grumpy at mailfence.com (grumpy at mailfence.com) Date: Fri, 17 Jul 2020 03:16:24 +0200 (CEST) Subject: [Borgbackup] how does dedupe work In-Reply-To: References: Message-ID: On Thu, 16 Jul 2020, Zack Coffey wrote: > I'll take a stab at this answer. Please correct me if I'm wrong. > > Yes, is the short answer. It will dedupe all of that. (Disclaimer, I've not > tried it) > > However, when sharing repos there are caveats. Biggest I know of is the > cache needs to be rebuilt every time borgbackup is run. This will make > backups take much longer and I don't think you can run them at the same > time. Pretty sure I remember seeing someone say you have to make sure they > don't access the repo at the same time. So it will take more coordination > as well. > > So it's a tradeoff, time vs space. > > Dedupe in borgbackup is great! I'm saving tons of space. But each machine > has its own repo. 
> > On Thu, Jul 16, 2020 at 11:02 AM grumpy--- via Borgbackup <
> > borgbackup at python.org> wrote:
> >
> >> i'm pretty new to borg so try to look around my ignorance
> >> suppose i backup several machines to the same repository and these
> >> machines have the same os
> >> each machine will have a copy of many of the same files
> >> does dedupe remove all of the copies

at present i also have each machine in a separate repository
just curious about others experiences and opinions

From lazyvirus at gmx.com Thu Jul 16 22:34:17 2020
From: lazyvirus at gmx.com (Bzzzz)
Date: Fri, 17 Jul 2020 04:34:17 +0200
Subject: [Borgbackup] how does dedupe work
In-Reply-To: 
References: 
Message-ID: <20200717043417.11ce3a68@msi.defcon1.lan>

On Fri, 17 Jul 2020 03:16:24 +0200 (CEST)
grumpy--- via Borgbackup wrote:

> at present i also have each machine in a separate repository
> just curious about others experiences and opinions

Each machine has its repo, all backups launched @ the same time by a
crontab or manually when needed.
The backup of my laptop (~380GB, system included on a 5400RPM HD) takes
~9'30 since I switched the network from 100Mb/s to 1Gb/s (~34' before).

Jean-Yves

From fabio.pedretti at unibs.it Fri Jul 17 07:29:01 2020
From: fabio.pedretti at unibs.it (Fabio Pedretti)
Date: Fri, 17 Jul 2020 13:29:01 +0200
Subject: [Borgbackup] how does dedupe work
In-Reply-To: 
References: 
Message-ID: 

We are using a single repository for many servers to maximize
deduplication, but we have a backup server (the only one with borg
installed) which mounts via NFS the servers to backup.
This way there are no cache rebuild issues, there is no risk to backup to
the same repo at the same time, also no need to install borg and configure
the backup on the target servers.
All backups are easily managed from the borg server.
Also, if one target server gets compromised, the intruder won't have
access to the backup repo, because it's accessible only from the backup
server.
Hope this helps.
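[Fabio's pull-style layout can be sketched as a configuration fragment like the following; every path, hostname, and schedule here is hypothetical:]

```shell
# /etc/fstab on the backup server: mount each target read-only over NFS.
#   srv1:/  /mnt/srv1  nfs  ro,noatime  0 0
#   srv2:/  /mnt/srv2  nfs  ro,noatime  0 0
#
# Crontab on the backup server: borg runs only here, one job after the
# other, so the shared repo never sees two writers at once.
#   0 1 * * *  borg create /backups/shared::srv1-{now} /mnt/srv1
#   0 3 * * *  borg create /backups/shared::srv2-{now} /mnt/srv2
```

Because a single host runs every borg command, its client cache stays valid between runs, which is exactly how this layout sidesteps the cache-rebuild cost discussed earlier in the thread.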
On Fri, 17 Jul 2020 at 03:16, grumpy--- via Borgbackup <borgbackup at python.org> wrote:

> On Thu, 16 Jul 2020, Zack Coffey wrote:
>
> > I'll take a stab at this answer. Please correct me if I'm wrong.
> >
> > Yes, is the short answer. It will dedupe all of that. (Disclaimer,
> > I've not tried it)
> >
> > However, when sharing repos there are caveats. Biggest I know of is the
> > cache needs to be rebuilt every time borgbackup is run. This will make
> > backups take much longer and I don't think you can run them at the same
> > time. Pretty sure I remember seeing someone say you have to make sure
> > they don't access the repo at the same time. So it will take more
> > coordination as well.
> >
> > So it's a tradeoff, time vs space.
> >
> > Dedupe in borgbackup is great! I'm saving tons of space. But each
> > machine has its own repo.
> >
> >
> > On Thu, Jul 16, 2020 at 11:02 AM grumpy--- via Borgbackup <
> > borgbackup at python.org> wrote:
> >
> >> i'm pretty new to borg so try to look around my ignorance
> >> suppose i backup several machines to the same repository and these
> >> machines have the same os
> >> each machine will have a copy of many of the same files
> >> does dedupe remove all of the copies
>
> at present i also have each machine in a separate repository
> just curious about others experiences and opinions
> _______________________________________________
> Borgbackup mailing list
> Borgbackup at python.org
> https://mail.python.org/mailman/listinfo/borgbackup
>

--
ing. Fabio Pedretti
Responsabile U.O.C. "Reti, Sistemi e Sicurezza Informatica"
https://www.unibs.it/node/4305
Università degli Studi di Brescia
Via Valotti, 9 - 25121 Brescia
E-mail: fabio.pedretti at unibs.it

--
Informativa sulla Privacy: http://www.unibs.it/node/8155
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From billk at iinet.net.au Fri Jul 17 09:02:08 2020
From: billk at iinet.net.au (William Kenworthy)
Date: Fri, 17 Jul 2020 21:02:08 +0800
Subject: [Borgbackup] how does dedupe work
In-Reply-To: 
References: 
Message-ID: <60599124-e947-cc6e-036a-7ea0a3674f0e@iinet.net.au>

Another way is to back up the individual systems to multiple repos in a
single storage area (3 times a day to an MFS share in my case - a run can
take a couple of hours as it includes early raspberry pi's) - then
borgbackup the repos themselves (for off-line backup to a removable
drive).

I was a bit dubious, but there was some serious deduplication at the start
which took many hours (many of the machines are close copies, even though
they have separate repos), then it's quick 3-5 minute and very
deduplicated backups thereafter. I did one restore of a repo that went
well as a test.

Only downside is that large changes (such as monthly mail archiving) make
large changes to the repos and that considerably extends the time to back
them up.

Sounds daft, but it means I have versioned off-line backups of many
terabytes of machines and data on a single 2tb removable drive with
plenty of room to spare.

BillK

On 17/7/20 7:29 pm, Fabio Pedretti wrote:
> We are using a single repository for many servers to maximize
> deduplication, but we have a backup server (the only one with borg
> installed) which mounts via NFS the servers to backup.
> This way there are no cache rebuild issues, there is no risk to backup
> to the same repo at the same time, also no need to install borg and
> configure the backup on the target servers.
> All backups are easily managed from the borg server.
> Also, if one target server gets compromised, the intruder won't have
> access to the backup repo, because it's accessible only from the
> backup server.
> Hope this helps.
>
> On Fri, 17 Jul 2020 at 03:16, grumpy--- via Borgbackup
> <borgbackup at python.org> wrote:
>
> On Thu, 16 Jul 2020, Zack Coffey wrote:
>
> > I'll take a stab at this answer.
Please correct me if I'm wrong. > > > > Yes, is the short answer. It will dedupe all of that. > (Disclaimer, I've not > > tried it) > > > > However, when sharing repos there are caveats. Biggest I know of > is the > > cache needs to be rebuilt every time borgbackup is run. This > will make > > backups take much longer and I don't think you can run them at > the same > > time. Pretty sure I remember seeing someone say you have to make > sure they > > don't access the repo at the same time. So it will take more > coordination > > as well. > > > > So it's a tradeoff, time vs space. > > > > Dedupe in borgbackup is great! I'm saving tons of space. But > each machine > > has its own repo. > > > > > > On Thu, Jul 16, 2020 at 11:02 AM grumpy--- via Borgbackup < > > borgbackup at python.org > wrote: > > > >> i'm pretty new to borg so try to look around my ignorance > >> suppose i backup several machines to the same repository and these > >> machines have the same os > >> each machine will have a copy of many of the same files > >> does dedupe remove all of the copies > > at present i also have each machine in a separate repository > just curious about others experiences and opinions > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > > > > -- > ing. Fabio Pedretti > Responsabile U.O.C. "Reti, Sistemi e Sicurezza Informatica" > https://www.unibs.it/node/4305 > Universit? degli Studi di Brescia > Via Valotti, 9 - 25121 Brescia > E-mail: fabio.pedretti at unibs.it > > > Informativa sulla Privacy: http://www.unibs.it/node/8155 > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From grumpy at mailfence.com Fri Jul 17 10:40:29 2020
From: grumpy at mailfence.com (grumpy at mailfence.com)
Date: Fri, 17 Jul 2020 16:40:29 +0200 (CEST)
Subject: [Borgbackup] is there a .borgbackuprc
Message-ID: 

is there a .borgbackuprc for set'n personal preferences
for example when i run borg list i set a specific format
it would be nice to set it and forget it

From lazyvirus at gmx.com Fri Jul 17 10:47:51 2020
From: lazyvirus at gmx.com (Bzzzz)
Date: Fri, 17 Jul 2020 16:47:51 +0200
Subject: [Borgbackup] is there a .borgbackuprc
In-Reply-To: 
References: 
Message-ID: <20200717164751.58a0bc98@msi.defcon1.lan>

On Fri, 17 Jul 2020 16:40:29 +0200 (CEST)
grumpy--- via Borgbackup wrote:

> is there a .borgbackuprc for set'n personal preferences
> for example when i run borg list i set a specific format
> it would be nice to set it and forget it

No need, just put that in a script and drop the script into
/usr/local/bin or /usr/local/sbin, depending on the permission level.

JY

From clickwir at gmail.com Fri Jul 17 11:08:05 2020
From: clickwir at gmail.com (Zack Coffey)
Date: Fri, 17 Jul 2020 09:08:05 -0600
Subject: [Borgbackup] is there a .borgbackuprc
In-Reply-To: 
References: 
Message-ID: 

Check out borgmatic. It helps fill some gaps that I tried to fix myself,
but it does a much better job than I could.

https://torsion.org/borgmatic/

On Fri, Jul 17, 2020 at 8:40 AM grumpy--- via Borgbackup <borgbackup at python.org> wrote:

> is there a .borgbackuprc for set'n personal preferences
> for example when i run borg list i set a specific format
> it would be nice to set it and forget it
> _______________________________________________
> Borgbackup mailing list
> Borgbackup at python.org
> https://mail.python.org/mailman/listinfo/borgbackup
>
-------------- next part --------------
An HTML attachment was scrubbed...
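[Bzzzz's script suggestion above might look like this; the install path and the `--format` string are only examples, and the placeholders should be checked against `borg help list` before relying on them:]

```shell
# A personal 'borg list' wrapper standing in for a nonexistent ~/.borgbackuprc.
# Install path and format string are examples; adjust both to taste.
mkdir -p "$HOME/bin"
cat > "$HOME/bin/blist" <<'EOF'
#!/bin/sh
# borg list with my preferred format; extra args pass straight through.
exec borg list --format '{archive:<36} {time} {id:.8}{NL}' "$@"
EOF
chmod +x "$HOME/bin/blist"
```

With `$HOME/bin` on `PATH`, running `blist` against a repo then gives the preferred listing without retyping the flags each time.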
URL: 

From krause at biochem2.uni-frankfurt.de Tue Jul 21 04:13:48 2020
From: krause at biochem2.uni-frankfurt.de (David Krause)
Date: Tue, 21 Jul 2020 10:13:48 +0200
Subject: [Borgbackup] 'borg list' on bigger repo runs for days.
In-Reply-To: <96fe1137-a11b-8694-b35f-420d728f99fa@biochem2.uni-frankfurt.de>
References: <7e7947a8-e6ad-38c2-c963-8b5a3ab135e4@waldmann-edv.de> <96fe1137-a11b-8694-b35f-420d728f99fa@biochem2.uni-frankfurt.de>
Message-ID: <690b96c4-3c56-621e-cfee-88bc04ae9196@biochem2.uni-frankfurt.de>

Dear List,

borg was able to repair the backup with

borg check --repair /path/to/repo

it just took about a week to do so.

The thing I take with me is that I'll try to split the backup at a finer
granularity instead of backing up the whole storage, so that I don't get
such huge backups of greater than 2.2 PB (huge for the hardware borg runs
on).

Another lesson learned is that I will not prune after every backup
(daily), but once every weekend. So in case I have to get data from the
backup, I can cancel the running backup and not leave the repository in an
unclean state.

For me the case is closed, thank you for your help!

Mit freundlichen Grüßen / With kind regards

David Krause
IT
Institute of Biochemistry II

Gustav Embden-Zentrum der Biochemie
Goethe University Frankfurt - Medical Faculty

University Hospital - Building 75
Theodor-Stern-Kai 7
60590 Frankfurt am Main
Germany

+49 (0)69 6301 84971 (Phone)
+49 (0)69 6301 84975 (Fax)
krause at biochem2.uni-frankfurt.de

On 08.07.2020 at 13:11, David Krause wrote:
>
> On 08.07.2020 at 11:15, Dmitry Astapov wrote:
>>
>>
>> Now, whether this means that borg intends to read 1000T at 40mb/s or
>> not - I am not sure. You can try adding "--debug" to your list
>> command and see if it adds any clarity.
>>
> hopefully not..
> I'll cancel the current job and try "--debug"
>
>
> On 08.07.2020 at 12:12, Thomas Waldmann wrote:
>>> This is ubuntu's borg 1.1.5
>> Please reproduce with borg 1.1.13.
>> >> There's a PPA for ubuntu from the package maintainer with recent borg. >> > I now try the same with the newest stable from the site. > > borg-linux64 1.1.13 > > root at backup1:~# ./borg-linux64 -p --debug list > /mnt/backups/backup_CommonStorage/ > using builtin fallback logging configuration > 35 self tests completed in 0.08 seconds > Replaying segments?? 0% > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From joehesse at gmail.com Fri Jul 24 11:24:59 2020 From: joehesse at gmail.com (Joseph Hesse) Date: Fri, 24 Jul 2020 10:24:59 -0500 Subject: [Borgbackup] Borg check fails Message-ID: <32cadfee-cbb3-f710-9ecb-79897cb0c161@gmail.com> I did a "borg check" on my repository and got the following 279 lines of output.? A repeat gave similar output.? How do I fix this? Thank you, Joe Only the first 10 and last 10 of my output is shown since the lines all have the same form. Starting repository check Data integrity error: Segment entry checksum mismatch [segment 417, offset 33880751] Starting repository index check Index object count mismatch. committed index: 106112 objects rebuilt index:?? 105838 objects ID: 765bd078f818580f3d10497c17202f13bf39cd469ed47aa8b62baf1085244431 rebuilt index: ????? committed index: (417, 507500162) ID: 08a9bcdf76d07b76d09dbad99ad718a89a240c944d14fbcd233ae8e3a2656e54 rebuilt index: ????? committed index: (417, 124860021) ID: 0b678484bb376160780b8f272d1915f0467a5cdf3f1fb47c1a173a547dc2c3b5 rebuilt index: ????? committed index: (417, 488433694) ID: b1a2a8d320b634384edbed3ad32cfbb724a4599653e0750300e1b0bf87fbcd15 rebuilt index: ????? committed index: (417, 507508789) ID: d6085001a7acae43295d73c94c2cc8ac48031d3f5896a2b9f4ad4537072a06bf rebuilt index: ????? committed index: (417, 29311001) ID: b5be8ced0b12ab924618b9573c2ef0701373b7bc9c2d37c1047131d3c6266b10 rebuilt index: ????? 
committed index: (417, 517976434) ID: 7da3fcc652a8e7d352c47895e6fbac09de2ae7408572feec78717ff14dfaf579 rebuilt index: ????? committed index: (417, 161177452) ID: 859554b2a0acbaa1473d2aaa90e5e8c8d8b86c09b8c3ec32892927a57c26ab9e rebuilt index: ????? committed index: (417, 457360972) ID: a0c670e945c5c6c1da93f592a3179f679f39d62bd765cbfa03c20d0bc302bcee rebuilt index: ????? committed index: (417, 172471204) ID: 2e6d0c6771978804ea709943adbd8dd6e80c6f6e22c02611f142513a52b8dd01 rebuilt index: ????? committed index: (417, 203174592) lines omitted ID: b55bccb2cfa21242cfecaed451b9bad613c3ad9f6793e4b713cee0b9a76075fa rebuilt index: ????? committed index: (417, 276188088) ID: f55548a7de9eed6401566b18ece320bbb608dccc293d47fadc7c2445af94c72b rebuilt index: ????? committed index: (417, 508929872) ID: da8bf4ed2684a6ccc25f4cbfd751df06cbb39a8686a75b2588dc86ceb79a6967 rebuilt index: ????? committed index: (417, 149067557) ID: 4f27f45e456ce2a53de4c6a4f2106abc911eff37700653341f6c9588121fce23 rebuilt index: ????? committed index: (417, 215454220) ID: 35f32718dc53fb35e77d099c6fd2f915c0ec35002a24b64b48a226d7ebb34dae rebuilt index: ????? committed index: (417, 297246550) ID: 22206850cfa34c18b68ed8f8721abf009ae4da0bb9ad16af5f5e0612d9706119 rebuilt index: ????? committed index: (417, 486429424) ID: ba9298dad697562b7c543edcaf012922882a2728b75f6b6c5e8b222e4e8ad353 rebuilt index: ????? committed index: (417, 184406940) ID: 6f821cbfaaa9699295934cb98dad3716e59ee47ee97faacf9b7f9856faac62d7 rebuilt index: ????? committed index: (417, 407150070) ID: 4c2ef44a849f5199e4c3431fa0cc776adc9e78e60b9dfc246db62533ae12c2c1 rebuilt index: ????? committed index: (417, 516585126) ID: 4aa704e9c63ac38921f5ef402c023cce6c30254235f4dd1256aae35256b3ea9a rebuilt index: ????? committed index: (417, 97268626) ID: 7018a029632b88dd793b593f59b51992473435731a66c99900a84ff32baa9e9a rebuilt index: ????? committed index: (417, 166554942) Completed repository check, errors found. 
From thomas at portmann.org Sat Jul 25 07:05:41 2020 From: thomas at portmann.org (Thomas Portmann) Date: Sat, 25 Jul 2020 13:05:41 +0200 Subject: [Borgbackup] Borg check fails In-Reply-To: <32cadfee-cbb3-f710-9ecb-79897cb0c161@gmail.com> References: <32cadfee-cbb3-f710-9ecb-79897cb0c161@gmail.com> Message-ID: Basically, you can fix it with "borg check --repair" (see the documentation and the most recent changelog). However, a sensible approach highly depends on the situation. First of all, I would want to know the root cause of the damage, at least an educated guess of it. Which file system is the repo on? Independently, I would deem it a good idea to make a copy of the repository before you repair it. In any case, BEFORE repairing, I would 1. install the most recent release, 1.1.13. If there is no such package in your distribution, use the standalone version. (I always used the standalone versions, because in stable Debian or Ubuntu which I am working on, the Borg versions, e.g. Pre-1.1.11, are know to have some problems which could cause such damages---though in very, very unprobable circumstances.) In a client-server setup, install it on both sides, 2. remove the local cache, by "borg delete --cache-only ", on the (or all, resp.) client machine(s). Thomas Am 24.07.20 um 17:24 schrieb Joseph Hesse: > I did a "borg check" on my repository and got the following 279 lines of > output.? A repeat gave similar output.? How do I fix this? > Thank you, > Joe > > Only the first 10 and last 10 of my output is shown since the lines all > have the same form. > > Starting repository check > Data integrity error: Segment entry checksum mismatch [segment 417, > offset 33880751] > Starting repository index check > Index object count mismatch. > committed index: 106112 objects > rebuilt index:?? 105838 objects > ID: 765bd078f818580f3d10497c17202f13bf39cd469ed47aa8b62baf1085244431 > rebuilt index: ????? 
committed index: (417, 507500162) > ID: 08a9bcdf76d07b76d09dbad99ad718a89a240c944d14fbcd233ae8e3a2656e54 > rebuilt index: ????? committed index: (417, 124860021) > ID: 0b678484bb376160780b8f272d1915f0467a5cdf3f1fb47c1a173a547dc2c3b5 > rebuilt index: ????? committed index: (417, 488433694) > ID: b1a2a8d320b634384edbed3ad32cfbb724a4599653e0750300e1b0bf87fbcd15 > rebuilt index: ????? committed index: (417, 507508789) > ID: d6085001a7acae43295d73c94c2cc8ac48031d3f5896a2b9f4ad4537072a06bf > rebuilt index: ????? committed index: (417, 29311001) > ID: b5be8ced0b12ab924618b9573c2ef0701373b7bc9c2d37c1047131d3c6266b10 > rebuilt index: ????? committed index: (417, 517976434) > ID: 7da3fcc652a8e7d352c47895e6fbac09de2ae7408572feec78717ff14dfaf579 > rebuilt index: ????? committed index: (417, 161177452) > ID: 859554b2a0acbaa1473d2aaa90e5e8c8d8b86c09b8c3ec32892927a57c26ab9e > rebuilt index: ????? committed index: (417, 457360972) > ID: a0c670e945c5c6c1da93f592a3179f679f39d62bd765cbfa03c20d0bc302bcee > rebuilt index: ????? committed index: (417, 172471204) > ID: 2e6d0c6771978804ea709943adbd8dd6e80c6f6e22c02611f142513a52b8dd01 > rebuilt index: ????? committed index: (417, 203174592) > lines omitted > ID: b55bccb2cfa21242cfecaed451b9bad613c3ad9f6793e4b713cee0b9a76075fa > rebuilt index: ????? committed index: (417, 276188088) > ID: f55548a7de9eed6401566b18ece320bbb608dccc293d47fadc7c2445af94c72b > rebuilt index: ????? committed index: (417, 508929872) > ID: da8bf4ed2684a6ccc25f4cbfd751df06cbb39a8686a75b2588dc86ceb79a6967 > rebuilt index: ????? committed index: (417, 149067557) > ID: 4f27f45e456ce2a53de4c6a4f2106abc911eff37700653341f6c9588121fce23 > rebuilt index: ????? committed index: (417, 215454220) > ID: 35f32718dc53fb35e77d099c6fd2f915c0ec35002a24b64b48a226d7ebb34dae > rebuilt index: ????? committed index: (417, 297246550) > ID: 22206850cfa34c18b68ed8f8721abf009ae4da0bb9ad16af5f5e0612d9706119 > rebuilt index: ????? 
committed index: (417, 486429424) > ID: ba9298dad697562b7c543edcaf012922882a2728b75f6b6c5e8b222e4e8ad353 > rebuilt index: ????? committed index: (417, 184406940) > ID: 6f821cbfaaa9699295934cb98dad3716e59ee47ee97faacf9b7f9856faac62d7 > rebuilt index: ????? committed index: (417, 407150070) > ID: 4c2ef44a849f5199e4c3431fa0cc776adc9e78e60b9dfc246db62533ae12c2c1 > rebuilt index: ????? committed index: (417, 516585126) > ID: 4aa704e9c63ac38921f5ef402c023cce6c30254235f4dd1256aae35256b3ea9a > rebuilt index: ????? committed index: (417, 97268626) > ID: 7018a029632b88dd793b593f59b51992473435731a66c99900a84ff32baa9e9a > rebuilt index: ????? committed index: (417, 166554942) > Completed repository check, errors found. > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From joehesse at gmail.com Sun Jul 26 10:16:27 2020 From: joehesse at gmail.com (Joseph Hesse) Date: Sun, 26 Jul 2020 09:16:27 -0500 Subject: [Borgbackup] "borg check --repair" problem Message-ID: <6a9ca597-effc-1384-5d31-2f637092d3e1@gmail.com> Hi, I did a "borg check" on my repository and got 279 error lines, each of the form: ID: 024126ba2e8883b2c36b505fa2217a80479e4334faa659b4a4ff09918e316a5c rebuilt index: ????? committed index: (417, 455027472) ending with: "Completed repository check, errors found." I am using Linux with an ext4 file system and the partition on the external drive passed an "fsck" and "badblocks" test. I deleted the index for the repository and then did a "borg check --repair", and, even though there was a warning that this was an experimental feature, I felt I had no other choice but to proceed. I got the output below. There are 3 backups in this output and since the files in question that are causing the error have very long names, I have replaced them with FILE1, FILE2, ..., FILE6 and changed user names. I plan on deleting the 3 backups in the repository and hope for the best.
Is this a good plan? I would prefer not to, but I can afford to lose the entire repository since I have Borg backups in other repositories. Thank you, Joe

[user at Laptop ~]$ borg check --repair /borg/repo/ISBorgBackup/
'check --repair' is an experimental feature that might result in data loss. Type 'YES' if you understand this and want to continue: YES
Data integrity error: Segment entry checksum mismatch [segment 417, offset 33880751]
Enter passphrase for key /borg/repo/ISBorgBackup:
user-2020-06-30T18:02:21: FILE1: New missing file chunk detected (Byte 1497238-3698398). Replacing with all-zero chunk.
user-2020-06-30T18:02:21: FILE2: New missing file chunk detected (Byte 1878467-3580233). Replacing with all-zero chunk.
user-2020-06-30T18:02:21: FILE2: New missing file chunk detected (Byte 4421386-6930565). Replacing with all-zero chunk.
user-2020-06-30T18:02:21: FILE3: New missing file chunk detected (Byte 3382479-4869362). Replacing with all-zero chunk.
user-2020-06-30T18:02:21: FILE4: New missing file chunk detected (Byte 4865467-6609832). Replacing with all-zero chunk.
user-2020-06-30T18:02:21: FILE5: New missing file chunk detected (Byte 2085016-5899534). Replacing with all-zero chunk.
user-2020-06-30T18:02:21: FILE6: New missing file chunk detected (Byte 4148719-5052114). Replacing with all-zero chunk.
user-2020-07-15T22:07:36: FILE1: New missing file chunk detected (Byte 1497238-3698398). Replacing with all-zero chunk.
user-2020-07-15T22:07:36: FILE2: New missing file chunk detected (Byte 1878467-3580233). Replacing with all-zero chunk.
user-2020-07-15T22:07:36: FILE2: New missing file chunk detected (Byte 4421386-6930565). Replacing with all-zero chunk.
user-2020-07-15T22:07:36: FILE3: New missing file chunk detected (Byte 3382479-4869362). Replacing with all-zero chunk.
user-2020-07-15T22:07:36: FILE4: New missing file chunk detected (Byte 4865467-6609832). Replacing with all-zero chunk.
user-2020-07-15T22:07:36: FILE5: New missing file chunk detected (Byte 2085016-5899534). Replacing with all-zero chunk.
user-2020-07-15T22:07:36: FILE6: New missing file chunk detected (Byte 4148719-5052114). Replacing with all-zero chunk.
user-2020-07-19T16:35:57: FILE1: New missing file chunk detected (Byte 1497238-3698398). Replacing with all-zero chunk.
user-2020-07-19T16:35:57: FILE2: New missing file chunk detected (Byte 1878467-3580233). Replacing with all-zero chunk.
user-2020-07-19T16:35:57: FILE2: New missing file chunk detected (Byte 4421386-6930565). Replacing with all-zero chunk.
user-2020-07-19T16:35:57: FILE3: New missing file chunk detected (Byte 3382479-4869362). Replacing with all-zero chunk.
user-2020-07-19T16:35:57: FILE4: New missing file chunk detected (Byte 4865467-6609832). Replacing with all-zero chunk.
user-2020-07-19T16:35:57: FILE5: New missing file chunk detected (Byte 2085016-5899534). Replacing with all-zero chunk.
user-2020-07-19T16:35:57: FILE6: New missing file chunk detected (Byte 4148719-5052114). Replacing with all-zero chunk.
Archive consistency check complete, problems found.
From thomas at portmann.org Sun Jul 26 14:05:27 2020 From: thomas at portmann.org (Thomas Portmann) Date: Sun, 26 Jul 2020 20:05:27 +0200 Subject: [Borgbackup] "borg check --repair" problem In-Reply-To: <6a9ca597-effc-1384-5d31-2f637092d3e1@gmail.com> References: <6a9ca597-effc-1384-5d31-2f637092d3e1@gmail.com> Message-ID: <374e57bc-a749-7d9d-23f8-77f1262741ae@portmann.org> Am 26.07.20 um 16:16 schrieb Joseph Hesse: > I am using Linux with ext4 file system and the partition on the external > drive passed an "fsck" and "badblocks" test. Did you look at the fsck output? > I deleted the index for the repository and then did a "borg check > --repair", and, even though there was a warning that this was an > experimental feature, > I felt I had no other choice but to proceed. Yes, exactly.
This warning is just being very, very honest, but you have no other choice in this situation anyway. You can trust it. > I got the output below. > There are 3 backups in this output and since the files in question that > are causing the error have very long names, I have replaced them with > FILE1, FILE2, ..., FILE6 and changed user names. To be honest, since I never experienced this kind of damage, I can compare your report only with my experience with Borgbackup in general and with my understanding of its documentation. Since your log says there are missing file chunks which have been replaced with all-zero chunks, your 6 files will now contain these chunks. So when you extract these files, they will be damaged / contain these holes. All other files and archives should be intact. But I cannot rule out further data loss. Apart from this, the three archives and the repository should be in a consistent state now. > I plan on deleting the 3 backups in the repository and hope for the > best. Is this a good plan? I would prefer not to, but I can afford to lose the > entire repository since I have Borg backups in other repositories. There is no need to remove the archives or the repo --- you can confirm this if you run "borg check" (without "--repair") again. So you're done. A good plan would be to find the root cause, if possible, and deduce countermeasures for the future. Cheers Thomas From tw at waldmann-edv.de Sun Jul 26 16:17:06 2020 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sun, 26 Jul 2020 22:17:06 +0200 Subject: [Borgbackup] "borg check --repair" problem In-Reply-To: <6a9ca597-effc-1384-5d31-2f637092d3e1@gmail.com> References: <6a9ca597-effc-1384-5d31-2f637092d3e1@gmail.com> Message-ID: Hi, > ID: 024126ba2e8883b2c36b505fa2217a80479e4334faa659b4a4ff09918e316a5c > rebuilt index: ????? committed index: (417, 455027472) > ending with: "Completed repository check, errors found."
That means that your old on-disk index ("committed") referred to chunks which are not there any more in the on-disk segment files, from which it built the rebuilt index. > I deleted the index for the repository and then did a "borg check > --repair", and, even though there was a warning that this was an > experimental feature, We are currently changing this message. Read it rather as "no warranties" and "make a backup of the repo if it is important". > I plan on deleting the 3 backups in the repository and hope for the > best. Is this a good plan? After you check --repair the repo, you could run backups that have (maybe) the same data as in these archives. And then check --repair again. If you are lucky, borg will heal the files IF the chunks that were gone have re-appeared due to the backups after the first repair. > Data integrity error: Segment entry checksum mismatch [segment 417, > offset 33880751] If that was after the index repair, it means corruption in an on-disk file (segment file number 417). > user-2020-06-30T18:02:21: FILE1: New missing file chunk detected (Byte > 1497238-3698398). Replacing with all-zero chunk. And that was maybe that corrupt chunk. So, the root cause is unclear. Corruption can have multiple reasons, like on-disk bit rot or defective RAM or ... I assume you read the advisory at the top of the changelog of borg 1.1.13?
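Spelled out as commands, that repair/heal cycle might look like the sketch below; the repo path, source path and archive name are placeholders, not taken from this thread:

```shell
# Hypothetical sketch of the repair -> re-backup -> repair cycle.
# REPO and /path/to/source are placeholders - adapt to your setup.
REPO=/path/to/repo

borg check --repair "$REPO"    # 1st repair: missing chunks become all-zero chunks
borg create "$REPO::{hostname}-{now}" /path/to/source   # re-backup (maybe) the same data
borg check --repair "$REPO"    # 2nd repair: heals files whose chunks re-appeared
borg check "$REPO"             # verify: should now complete without errors
```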
Cheers, Thomas -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From grumpy at mailfence.com Sat Aug 1 08:11:33 2020 From: grumpy at mailfence.com (grumpy at mailfence.com) Date: Sat, 1 Aug 2020 14:11:33 +0200 (CEST) Subject: [Borgbackup] extract produced wrong uid/gid Message-ID: yesterday i had my first big test of borg: i had to reconstruct a system on a new drive. the extract was no problem, but i found that many files and directories had been restored with the wrong uid and/or gid. i use the borg-pull method described here for backups: https://github.com/borgbackup/borg/issues/900 after a little examination i saw that the bad uid/gid matched the uid/gid of the server do'n the pull'n. is the pull method responsible for the screwup? From tw at waldmann-edv.de Sat Aug 1 11:09:23 2020 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sat, 1 Aug 2020 17:09:23 +0200 Subject: [Borgbackup] extract produced wrong uid/gid In-Reply-To: References: Message-ID: <68517413-0203-c900-9e18-f4c7c9f7dec8@waldmann-edv.de> > i use the borg-pull method described here for backups > https://github.com/borgbackup/borg/issues/900 Referring to a closed ticket with a longish discussion is not the best way to describe what you did. The method described there was added to our docs; read that, check if that is what you did, and if there is an issue although you followed it completely, file a ticket on github: https://borgbackup.readthedocs.io/en/stable/deployment/pull-backup.html > after a little examination i saw that the bad uid/gid matched the > uid/gid of the server do'n the pull'n > is the pull method responsible for the screwup Those docs specifically refer to that issue and describe how to avoid it. If you are unsure, try it again, following closely the steps in the docs.
-- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From grumpy at mailfence.com Sat Aug 1 11:27:38 2020 From: grumpy at mailfence.com (grumpy at mailfence.com) Date: Sat, 1 Aug 2020 17:27:38 +0200 (CEST) Subject: [Borgbackup] extract produced wrong uid/gid In-Reply-To: <68517413-0203-c900-9e18-f4c7c9f7dec8@waldmann-edv.de> References: <68517413-0203-c900-9e18-f4c7c9f7dec8@waldmann-edv.de> Message-ID: On Sat, 1 Aug 2020, Thomas Waldmann wrote: >> i use the borg-pull method described here for backups >> https://github.com/borgbackup/borg/issues/900 > > Referring to a closed ticket with a longish discussion is not the best > way to describe what you did. > > The method described there was added to our docs, read that, check if > that is what you did and if there is an issue although you followed it > completely, file a ticket on github: > > https://borgbackup.readthedocs.io/en/stable/deployment/pull-backup.html > >> after a little examination i saw that the bad uid/gid matched the >> uid/gid of the server do'n the pull'n >> is the pull method responsible for the screwup > > Those docs specifically refer to that issue and describe how to avoid it. > > If you are unsure, try it again, following closely the steps in the docs. thank you From hpj at urpla.net Sat Aug 29 06:28:49 2020 From: hpj at urpla.net (Hans-Peter Jansen) Date: Sat, 29 Aug 2020 12:28:49 +0200 Subject: [Borgbackup] Central backup and ~/.cache/borg/ piling up Message-ID: <2539402.XvYzVfBo0i@xrated> Hi, borg is doing a fantastic job here since two months*, but when doing a central backup of a couple of small and not so small systems and taking advantage of full deduplication, ~/.cache/borg piles up and will blow up limits eventually. Here, ~/.cache/borg is at 21GB now. Had to reconfigure some systems to cope with this already. borg create has --no-files-cache and --no-cache-sync flags.
If I understand correctly, that will result in a lot more network traffic, since file modification determination then has to happen on the server. Since --no-cache-sync is marked experimental, I want to ask what other downsides one can expect, and what experiences others have had using these flags, or how they have otherwise tackled the issue in question. Thanks, Pete *) Latest stats:
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:              125.78 GB            112.63 GB             43.12 MB
All archives:              144.81 TB            112.34 TB              6.46 TB
                       Unique chunks         Total chunks
Chunk index:                31886048            499329985
------------------------------------------------------------------------------
From lazyvirus at gmx.com Sat Aug 29 07:19:58 2020 From: lazyvirus at gmx.com (Bzzzz) Date: Sat, 29 Aug 2020 13:19:58 +0200 Subject: [Borgbackup] Central backup and ~/.cache/borg/ piling up In-Reply-To: <2539402.XvYzVfBo0i@xrated> References: <2539402.XvYzVfBo0i@xrated> Message-ID: <20200829131958.30ffbd71@msi.defcon1.lan> On Sat, 29 Aug 2020 12:28:49 +0200 Hans-Peter Jansen wrote: > Hi, Hi HP, > borg is doing a fantastic job here since two months*, but when doing a > central backup of a couple of small and not so small systems and > taking advantage of full deduplication, ~/.cache/borg piles up and > will blow up limits eventually. Here, ~/.cache/borg is at 21GB now. > Had to reconfigure some systems to cope with this already. Looks normal (here, 3.5 GB on a laptop w/330 GB to process) > borg create has --no-files-cache and --no-cache-sync flags. > > If I understand correctly, that will result in a lot more network > traffic, since file modification determination has to happen on the > server, then. From the doc, the 2nd implies the first. > Since --no-cache-sync is marked experimental, I want to ask, what > other downsides one can expect, and what experiences others have had > using these flags, or otherwise tackling the issue in question.
IIUC the doc, each and every backup of yours is re-processed (almost?) the same as if it were the first one - this doesn't look optimal at all; so, using these options should be reduced to very special cases (I guess Thomas could enlighten us about which they could be). In a word, you lose (very much) time and bloat the network for minor savings - not to mention that if you have a lot of machines to back up, the whole thing takes much too long to complete. So, IF my doc interpretation's correct, you lose many times over whatever you hope to gain using --no-cache-sync, namely time and network availability. Jean-Yves From tw at waldmann-edv.de Sat Aug 29 10:32:40 2020 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sat, 29 Aug 2020 16:32:40 +0200 Subject: [Borgbackup] Central backup and ~/.cache/borg/ piling up In-Reply-To: <2539402.XvYzVfBo0i@xrated> References: <2539402.XvYzVfBo0i@xrated> Message-ID: <267b031a-c21a-be8d-2a3f-dd6b86cff050@waldmann-edv.de>
> Original size Compressed size Deduplicated size
> This archive: 125.78 GB 112.63 GB 43.12 MB
> All archives: 144.81 TB 112.34 TB 6.46 TB
That looks like you have > 1000 backup archives. The chunks.archive.d cache is O(archive count * archive size) big, so if that is getting too big, just reduce the archive count via "borg prune". If you've never run prune yet, expect the first run to take rather long. Be careful, use borg prune --dry-run ... first, so you see what is getting pruned. You also might need --prefix if you have multiple different data sets backed up into the same repo. See the docs for more info.
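A concrete dry run could look like the following; the prefix and the --keep-* retention values are only an example policy, not a recommendation from this thread:

```shell
# Preview what prune WOULD delete; --dry-run guarantees nothing is removed.
# 'myhost-' and the --keep-* values below are placeholder examples.
borg prune --dry-run --list --prefix 'myhost-' \
    --keep-daily 7 --keep-weekly 4 --keep-monthly 6 \
    /path/to/repo
# If the listed archives look right, re-run the same command without --dry-run.
```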
-- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From hpj at urpla.net Sun Aug 30 07:48:36 2020 From: hpj at urpla.net (Hans-Peter Jansen) Date: Sun, 30 Aug 2020 13:48:36 +0200 Subject: [Borgbackup] Central backup and ~/.cache/borg/ piling up In-Reply-To: <267b031a-c21a-be8d-2a3f-dd6b86cff050@waldmann-edv.de> References: <2539402.XvYzVfBo0i@xrated> <267b031a-c21a-be8d-2a3f-dd6b86cff050@waldmann-edv.de> Message-ID: <2924621.ffk7riIq3i@xrated> Thanks Jean-Yves and Thomas, for your prompt replies and valuable input. Jean-Yves, I've combined the thread again here. Am Samstag, 29. August 2020, 16:32:40 CEST schrieb Thomas Waldmann:
> Original size Compressed size Deduplicated size
> This archive: 125.78 GB 112.63 GB 43.12 MB
> All archives: 144.81 TB 112.34 TB 6.46 TB
> That looks like you have > 1000 backup archives.
Not exactly:
$ borg list | wc -l
196
> The chunks.archive.d cache is O(archive count * archive size) big, so if > that is getting too big, just reduce the archive count via "borg prune". I'm pruning as part of every backup. I guess the sheer amount is due to the many VM images and such. This is going to converge towards some prune-implied limit, sure, but those 20G already caused me some headaches on certain systems and will still rise slowly (due to monthly retentions). Jean-Yves wrote: > So, IF my doc interpretation's correct, you lose many times over whatever > you hope to gain using --no-cache-sync, namely time and network > availability. You are probably right. May I respectfully ask for some further discussion of --no-files-cache and --no-cache-sync in order to estimate:
* chances of reducing local disk space
* expected increase of network load and runtime
* increase in the probability of hazardous effects
if you don't mind.
Thanks, Pete From tw at waldmann-edv.de Sun Aug 30 08:01:58 2020 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Sun, 30 Aug 2020 14:01:58 +0200 Subject: [Borgbackup] Central backup and ~/.cache/borg/ piling up In-Reply-To: <2924621.ffk7riIq3i@xrated> References: <2539402.XvYzVfBo0i@xrated> <267b031a-c21a-be8d-2a3f-dd6b86cff050@waldmann-edv.de> <2924621.ffk7riIq3i@xrated> Message-ID: <3c5b8b78-8077-8ec2-599f-cd43f276b314@waldmann-edv.de> >> The chunks.archive.d cache is O(archive count * archive size) big, so if >> that is getting too big, just reduce the archive count via "borg prune". Check whether the chunks.archive.d directory is eating your disk space. If so, you can either reduce the archive count or apply the hack from the FAQ to get rid of chunks.archive.d (the latter might be ok, if only one borg client will write to the repo, so cache rebuilds will never or rarely happen OR if you are willing to take the time for the rebuilds). > discussion of --no-files-cache Will make your backups slow as it will read/chunk/hash/lookup all file content. -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From hpj at urpla.net Sun Aug 30 08:52:04 2020 From: hpj at urpla.net (Hans-Peter Jansen) Date: Sun, 30 Aug 2020 14:52:04 +0200 Subject: [Borgbackup] Central backup and ~/.cache/borg/ piling up In-Reply-To: <3c5b8b78-8077-8ec2-599f-cd43f276b314@waldmann-edv.de> References: <2539402.XvYzVfBo0i@xrated> <2924621.ffk7riIq3i@xrated> <3c5b8b78-8077-8ec2-599f-cd43f276b314@waldmann-edv.de> Message-ID: <1952720.m4gFMHJ5S7@xrated> Am Sonntag, 30. August 2020, 14:01:58 CEST schrieb Thomas Waldmann: > >> The chunks.archive.d cache is O(archive count * archive size) big, so if > >> that is getting too big, just reduce the archive count via "borg prune". > > Check whether the chunks.archive.d directory is eating your disk space. It is:
$ du -h . chunks.archive.d/
17G ./chunks.archive.d
21G .
> If so, you either can reduce the archive amount or apply the hack from the > FAQ to get rid of chunks.archive.d (the latter might be ok, if only one > borg client will write to the repo, so cache rebuilds will never or > rarely happen OR if you are willing to take the time for the rebuilds). Will dive into this next. > > discussion of --no-files-cache > > Will make your backups slow as it will read/chunk/hash/lookup all files > content. Thanks, Pete From billk at iinet.net.au Sun Aug 30 10:02:02 2020 From: billk at iinet.net.au (William Kenworthy) Date: Sun, 30 Aug 2020 22:02:02 +0800 Subject: [Borgbackup] Central backup and ~/.cache/borg/ piling up In-Reply-To: <1952720.m4gFMHJ5S7@xrated> References: <2539402.XvYzVfBo0i@xrated> <2924621.ffk7riIq3i@xrated> <3c5b8b78-8077-8ec2-599f-cd43f276b314@waldmann-edv.de> <1952720.m4gFMHJ5S7@xrated> Message-ID: Could you offload the long-term backups onto a machine with more resources? That is:
1. create your normal backups to a machine with plenty of resources
2. back those borgbackup repositories up to a single local repo on the backup machine (i.e., back up the backups)
3. heavily prune the first-stage backups to regain the space.
The heavily pruned first stage only needs one or two iterations to be present, so the local machines can reduce their cache but still be efficient. Restoration means you will need to restore the correct first-stage version from the applicable 2nd stage so you can get at the original files. I have found that there is a lot of duplication between backups, so the 2nd stage is usually quite fast - unless there are a lot of changes. I have had to extract two sets so far without problems, but there are the extra steps/effort and time required. In my case, the first stage is on a moosefs data store and the second stage is on an off-line removable hard drive. BillK On 30/8/20 8:52 pm, Hans-Peter Jansen wrote: > Am Samstag, 29.
August 2020, 14:01:58 CEST schrieb Thomas Waldmann: >>>> The chunks.archive.d cache is O(archive count * archive size) big, so if >>>> that is getting too big, just reduce the archive count via "borg prune". >> Check whether the chunks.archive.d directory is eating your disk space. > It is: > $ du -h . chunks.archive.d/ > 17G ./chunks.archive.d > 21G . > >> If so, you either can reduce archive amount or apply the hack from the >> FAQ to get rid of chunks.archive.d (the latter might be ok, if only one >> borg client will write to the repo, so cache rebuilds will never or >> rarely happen OR if you are willing to take the time for the rebuilds). > Will dive into this next. > >>> discussion of --no-files-cache >> Will make your backups slow as it will read/chunk/hash/lookup all files >> content. > Thanks, > Pete > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From lazyvirus at gmx.com Sun Aug 30 10:33:31 2020 From: lazyvirus at gmx.com (Bzzzz) Date: Sun, 30 Aug 2020 16:33:31 +0200 Subject: [Borgbackup] Central backup and ~/.cache/borg/ piling up In-Reply-To: <2924621.ffk7riIq3i@xrated> References: <2539402.XvYzVfBo0i@xrated> <267b031a-c21a-be8d-2a3f-dd6b86cff050@waldmann-edv.de> <2924621.ffk7riIq3i@xrated> Message-ID: <20200830163331.5d711f9d@msi.defcon1.lan> On Sun, 30 Aug 2020 13:48:36 +0200 Hans-Peter Jansen wrote: > Guess, the sheer amount is due to many vm images and some such. Ah, in this case you wanna fiddle with --chunker-params to lower the chunk size, to avoid almost the whole image file being backed up when there are only small changes.
From what I read on the web, my notes say:
* ORG is: 19,23,21,4095
* a pro with lots of VM images fixed it to: 16,23,11,4095 and says it does the trick
* a rookie fixed it to: 12,20,15,4095 but this seems too low and multiplies BB files
* I fixed it to: 16,23,16,4095 and it seems to work correctly with heavy VBox images (essentially of the windo$e kind)
Note that it won't change much if you're rewriting 90% of these images or, worse, defragmenting them (w$). > This is going to converge towards some prune-implied limit, sure, but > those 20G already caused me some headaches on certain systems and will > still rise slowly (due to monthly retentions). Maybe (if possible) you could separate the VM images backup from the rest, using a low 3rd param for --chunker-params for the VMs - but before that, you have to be sure it is really them inflating the cache. "borg diff" is your friend for finding where the bloat is located, because it shows you the size differences between old and new backup files - it may take a moment to analyze its output, but this way you'll be sure of the bloat source (you might have surprises ;) Jean-Yves From public at enkore.de Sun Aug 30 13:14:55 2020 From: public at enkore.de (Marian Beermann) Date: Sun, 30 Aug 2020 19:14:55 +0200 Subject: [Borgbackup] Central backup and ~/.cache/borg/ piling up In-Reply-To: <20200829131958.30ffbd71@msi.defcon1.lan> References: <2539402.XvYzVfBo0i@xrated> <20200829131958.30ffbd71@msi.defcon1.lan> Message-ID: <1c16e0d9-2bf3-37ca-6133-273407b221f0@enkore.de> >> Since --no-cache-sync is marked experimental, I want to ask, what >> other downsides one can expect, and what experiences others have had >> using these flags, or otherwise tackling the issue in question. > > IIUC the doc, each and every backup of yours is re-processed (almost?)
> the same as if it were the first one - this doesn't look optimal at all; > so, using these options should be reduced to very special cases (I guess > Thomas could enlighten us about which they could be). > > In a word, you lose (very much) time and bloat the network for minor > savings - not to mention that if you have a lot of machines to back up, > the whole thing takes much too long to complete. > > So, IF my doc interpretation's correct, you lose many times over whatever > you hope to gain using --no-cache-sync, namely time and network > availability. I originally wrote the code for --no-cache-sync a few years ago, but I didn't remember what it does, so I re-read the code [1] to refresh my memory :) --no-cache-sync uses the local cache *if* it is in sync. Otherwise it downloads a list of all chunks in the repository (that's about "Unique chunks" * 32 bytes in network traffic, in your case 31886048 * 32 bytes = 1 GB!) and uses that for deduplication. Because it only has the ID of chunks, but doesn't know their size or compressed size, the stats will be all wrong (all deduped chunks will show up with a size of zero bytes). In some situations (archives being deleted etc.), this might cause a cache sync (possibly on another host) to download actual data chunks to figure out their (uncompressed) size. IIRC we added it to see if it would work in practice or not. I dunno if someone tried it in a bigger scenario and what the results there might have been.
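For reference, that traffic estimate is easy to reproduce from the stats Pete posted earlier (31886048 unique chunks, 32 bytes per chunk ID):

```shell
# Back-of-envelope: size of the chunk-ID list fetched when the local
# cache is out of sync (unique chunk count taken from Pete's stats above).
unique_chunks=31886048
bytes=$((unique_chunks * 32))
echo "$bytes bytes (~$((bytes / 1000000)) MB)"   # -> 1020353536 bytes (~1020 MB), i.e. about 1 GB
```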
-Marian [1] https://github.com/borgbackup/borg/commit/8aa745ddbd8ee1cddb3374673eb3eb08a9d5a8da From lazyvirus at gmx.com Sun Aug 30 14:22:12 2020 From: lazyvirus at gmx.com (Bzzzz) Date: Sun, 30 Aug 2020 20:22:12 +0200 Subject: [Borgbackup] Central backup and ~/.cache/borg/ piling up In-Reply-To: <1c16e0d9-2bf3-37ca-6133-273407b221f0@enkore.de> References: <2539402.XvYzVfBo0i@xrated> <20200829131958.30ffbd71@msi.defcon1.lan> <1c16e0d9-2bf3-37ca-6133-273407b221f0@enkore.de> Message-ID: <20200830202212.049d06ec@msi.defcon1.lan> On Sun, 30 Aug 2020 19:14:55 +0200 Marian Beermann wrote: Oops, my answer to the ML ended up for your eyes only; back on-list now. > >> Since --no-cache-sync is marked experimental, I want to ask, what > >> other downsides one can expect, and what experiences others have had > >> using these flags, or otherwise tackling the issue in question. > > > > IIUC the doc, each and every backup of yours is re-processed > > (almost?) the same as if it were the first one - this doesn't look > > optimal at all; so, using these options should be reduced to very > > special cases (I guess Thomas could enlighten us about which they > > could be). > > > > In a word, you lose (very much) time and bloat the network for > > minor savings - not to mention that if you have a lot of machines to > > back up, the whole thing takes much too long to complete. > > > > So, IF my doc interpretation's correct, you lose many times over > > whatever you hope to gain using --no-cache-sync, namely time and > > network availability. > > I originally wrote the code for --no-cache-sync a few years ago, but I > didn't remember what it does, so I re-read the code [1] to refresh my > memory :) Hehe, this is where clean, thoroughly commented code (possibly) pays off ;-) > --no-cache-sync uses the local cache *if* it is in sync.
Otherwise it > downloads a list of all chunks in the repository (that's about "Unique > chunks" * 32 bytes in network traffic, in your case 31886048 * 32 bytes > = 1 GB!) and uses that for deduplication. Because it only has the > ID of chunks, but doesn't know their size or compressed size, the stats > will be all wrong (all deduped chunks will show up with a size of zero > bytes). In some situations (archives being deleted > etc.), this might cause a cache sync (possibly on another host) to download actual > data chunks to figure out their (uncompressed) size. Thanks for these clarifications, Marian. The doc is relatively clear about that, once you've read about "borg create" and "why-it-is-not-such-a-good-idea-to-put-all-machines-in-the-same-repo" (apart from the compression thing). > IIRC we added it to see if it would work in practice or not. I dunno if > someone tried it in a bigger scenario and what the results there might > have been. Wild guess: terrible on small machines, such as laptops, that struggle to juggle (rust) disk R/W and Ethernet I/O when both are busy - and in any case (i.e., even with an SSD), network bloat, which can be a hassle when you've got a lot of machines each backing up to its own repo on the same backup server. Add the data traffic on top and you end up with slow-motion backups. Jean-Yves From borg-samuel at balkonien.org Fri Sep 4 05:04:14 2020 From: borg-samuel at balkonien.org (Samuel) Date: Fri, 4 Sep 2020 11:04:14 +0200 Subject: [Borgbackup] Backup taking very long - sometimes Message-ID: Dear borg-mailing-list, I have quite a weird problem, I guess. Borg runs on a regular daily basis and is working stably. But the consumed CPU time varies between around 4 hours and 12-13 hours:
Aug 29 18:54:33 balkonien systemd[1]: borg.service: Consumed 13h 33min 17.260s CPU time.
Aug 30 16:26:14 balkonien systemd[1]: borg.service: Consumed 11h 31min 14.142s CPU time.
Aug 31 22:10:19 balkonien systemd[1]: borg.service: Consumed 11h 52min 34.410s CPU time.
Sep 01 07:37:53 balkonien systemd[1]: borg.service: Consumed 4h 13min 21.085s CPU time.
Sep 02 15:30:09 balkonien systemd[1]: borg.service: Consumed 10h 45min 23.284s CPU time.
Sep 03 21:44:20 balkonien systemd[1]: borg.service: Consumed 11h 31min 47.892s CPU time.
Sep 04 06:30:58 balkonien systemd[1]: borg.service: Consumed 3h 58min 39.703s CPU time.
I don't see the cause for this behaviour. Can you help me investigate this? Thank you very much. Samuel From tw at waldmann-edv.de Fri Sep 4 08:29:14 2020 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Fri, 4 Sep 2020 14:29:14 +0200 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: References: Message-ID: <8eee4192-5961-8b19-ee51-47ecc882614d@waldmann-edv.de> > Borg runs on a regular daily basis and is working stable. > But the consumed cpu time varies between around 4 hours and 12-13 hours: Could be because on different days, a different amount of data changes? Or (if you do backups to a remote repo server), the connection speed to / system load of source or target server is different. Without more info, it is hard to say. You could start by adding this to borg create: --stats --list --filter="AME" Stats should tell whether much data was really added; if so, the list output should hint at what it was. -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From borg-samuel at balkonien.org Wed Sep 9 04:02:38 2020 From: borg-samuel at balkonien.org (Samuel) Date: Wed, 9 Sep 2020 10:02:38 +0200 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: <8eee4192-5961-8b19-ee51-47ecc882614d@waldmann-edv.de> References: <8eee4192-5961-8b19-ee51-47ecc882614d@waldmann-edv.de> Message-ID: Thanks for your reply. Stats was already enabled and the duration of the backup doesn't seem to depend on the data added (I attached the log of the borg backup).
Logs with stats enabled but without --list --filter="AME" can be found here: https://pastebin.com/raw/RW43kBHi With --list --filter="AME" enabled the logs since 2020-09-06 exceed 200Mbyte. Therefore i can't send them. Best wishes Samuel Am 04.09.20 um 14:29 schrieb Thomas Waldmann: >> Borg runs on a regular daily basis and is working stable. >> But the consumed cpu time varies between around 4 hours and 12-13 hours: > Could be because on different days, a different amount of data changes? > > Or (if you do backups to a remote repo server), the connection speed to > / system load of source or target server is different. > > Without more infos, it is hard to say. > > You could start by adding this to borg create: > > --stats --list --filter="AME" > > stats should tell if there was really much data added. > if so, the list output should hint about what it was. > From tw at waldmann-edv.de Wed Sep 9 06:34:03 2020 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Wed, 9 Sep 2020 12:34:03 +0200 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: References: <8eee4192-5961-8b19-ee51-47ecc882614d@waldmann-edv.de> Message-ID: <69985d27-c985-017e-446e-e692eb8707e8@waldmann-edv.de> > Stats was already enabled and the duration of the backup doesn't seem to > depend on the data added (i attached the of the borg backup). You should use a more recent release than borg 1.1.9, see the advisory at the top of the changelog. > Logs with stats enabled but without --list --filter="AME" can be found here: > https://pastebin.com/raw/RW43kBHi Next time you could shorten the logs so they only show the relevant / interesting stuff. Didn't see anything remarkable there (except maybe that your "System" backup includes a rather high count of files). > With --list --filter="AME" enabled the logs since 2020-09-06 exceed > 200Mbyte. Therefore i can't send them. 
It wasn't intended that you publish that log, but rather that you check yourself whether the A)dded and M)odified files shown in there are somehow as expected. If it shows files as M)odified that had no content modification, that is an indication that the borg files cache maybe doesn't work ok for you / for that filesystem type or that the files were slightly "touched" by something. You have to find out then what the root cause for this is, see the FAQ and the documentation of --files-cache. Or if it shows a huge amount of E)rrors, then find out if the errors are severe or if they can be ignored (sometimes one can just exclude problematic stuff if it does not need to be backed up anyway). > Am 04.09.20 um 14:29 schrieb Thomas Waldmann: >>> Borg runs on a regular daily basis and is working stable. >>> But the consumed cpu time varies between around 4 hours and 12-13 hours: >> Could be because on different days, a different amount of data changes? >> >> Or (if you do backups to a remote repo server), the connection speed to >> / system load of source or target server is different. >> >> Without more infos, it is hard to say. >> >> You could start by adding this to borg create: >> >> --stats --list --filter="AME" >> >> stats should tell if there was really much data added. >> if so, the list output should hint about what it was. -- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From borg-samuel at balkonien.org Thu Sep 10 03:52:05 2020 From: borg-samuel at balkonien.org (Samuel) Date: Thu, 10 Sep 2020 09:52:05 +0200 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: <69985d27-c985-017e-446e-e692eb8707e8@waldmann-edv.de> References: <8eee4192-5961-8b19-ee51-47ecc882614d@waldmann-edv.de> <69985d27-c985-017e-446e-e692eb8707e8@waldmann-edv.de> Message-ID: Okay. I'm now using 1.1.11 from buster-backports. Next time I will try to shorten the logs to the relevant part. 
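One quick way to act on this advice is to look for files whose ctime moved recently even though their content should be static; such files will be re-flagged as M)odified by borg. The following is a hypothetical stand-alone helper, not part of borg, sketched under the assumption that the data sits on a locally mounted filesystem:

```python
import os
import sys
import time

def recent_ctime_files(root, hours=24):
    """Return (ctime, path) pairs for files whose ctime changed recently.

    borg's files cache keys on path, ctime, inode and size, so a ctime
    bumped by e.g. a chown/chmod sweep makes borg re-read a file even
    though its content is unchanged.
    """
    cutoff = time.time() - hours * 3600
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_ctime >= cutoff:
                hits.append((st.st_ctime, path))
    return hits

if __name__ == "__main__" and len(sys.argv) > 1:
    for ctime, path in sorted(recent_ctime_files(sys.argv[1])):
        print(time.ctime(ctime), path)
```

Run it against the backup source shortly after a slow backup; a flood of hits with near-identical ctimes would point at some recursive chown/chmod or similar sweep as the cause of the spurious M)odified entries.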
You are right, there is a huge amount of files classified as modified, which should be unchanged (for several years). Now I'm unsure in how to investigate further. The files are stored on a btrfs filesystem across 4 HDs as RAID1. Unfortunately I didn't find a hint in the FAQ regarding this behaviour. I will now wait for the next Backups with the more recent version of borgbackup and try to investigate the new logs. Thanks a lot for your help! Samuel Am 09.09.20 um 12:34 schrieb Thomas Waldmann: >> Stats was already enabled and the duration of the backup doesn't seem to >> depend on the data added (i attached the of the borg backup). > You should use a more recent release than borg 1.1.9, see the advisory > at the top of the changelog. > >> Logs with stats enabled but without --list --filter="AME" can be found here: >> https://pastebin.com/raw/RW43kBHi > Next time you could shorten the logs so they only show the relevant / > interesting stuff. > > Didn't see anything remarkable there (except maybe that your "System" > backup includes a rather high count of files). > >> With --list --filter="AME" enabled the logs since 2020-09-06 exceed >> 200Mbyte. Therefore i can't send them. > It wasn't intended that you publish that log, but rather that you check > yourself whether the A)dded and M)odified files shown in there are > somehow as expected. > > If it shows files as M)odified that had no content modification, that is > an indication that the borg files cache maybe doesn't work ok for you / > for that filesystem type or that the files were slightly "touched" by > something. > > You have to find out then what the root cause for this is, see the FAQ > and the documentation of --files-cache. > > Or if it shows a huge amount of E)rrors, then find out if the errors are > severe or if they can be ignored (sometimes one can just exclude > problematic stuff if it does not need to be backed up anyway). 
> > > > >> Am 04.09.20 um 14:29 schrieb Thomas Waldmann: >>>> Borg runs on a regular daily basis and is working stable. >>>> But the consumed cpu time varies between around 4 hours and 12-13 hours: >>> Could be because on different days, a different amount of data changes? >>> >>> Or (if you do backups to a remote repo server), the connection speed to >>> / system load of source or target server is different. >>> >>> Without more infos, it is hard to say. >>> >>> You could start by adding this to borg create: >>> >>> --stats --list --filter="AME" >>> >>> stats should tell if there was really much data added. >>> if so, the list output should hint about what it was. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From w.schuermann at posteo.de Thu Sep 10 04:47:38 2020 From: w.schuermann at posteo.de (=?UTF-8?Q?Winfried_Sch=c3=bcrmann?=) Date: Thu, 10 Sep 2020 10:47:38 +0200 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: References: <8eee4192-5961-8b19-ee51-47ecc882614d@waldmann-edv.de> <69985d27-c985-017e-446e-e692eb8707e8@waldmann-edv.de> Message-ID: Just a hint: it's easy to install the binaries of Borg 1.1.13 not waiting for updates of repos: https://borgbackup.readthedocs.io/en/stable/installation.html#pyinstaller-binary Winfried Am 10.09.20 um 09:52 schrieb Samuel: > Okay. > > I'm now using 1.1.11 from buster-backports. > > Next time I will try to shorten the logs to the relevant part. > > You are right, there is a huge amount of files classified as modified, > which should be unchanged (for several years). > Now I'm unsure in how to investigate further. > The files are stored on a btrfs filesystem across 4 HDs as RAID1. > > Unfortunately I didn't find a hint in the FAQ regarding this behaviour. > > I will now wait for the next Backups with the more recent version of > borgbackup and try to investigate the new logs. > > > Thanks a lot for your help! 
> Samuel > > Am 09.09.20 um 12:34 schrieb Thomas Waldmann: >>> Stats was already enabled and the duration of the backup doesn't seem to >>> depend on the data added (i attached the of the borg backup). >> You should use a more recent release than borg 1.1.9, see the advisory >> at the top of the changelog. >> >>> Logs with stats enabled but without --list --filter="AME" can be found here: >>> https://pastebin.com/raw/RW43kBHi >> Next time you could shorten the logs so they only show the relevant / >> interesting stuff. >> >> Didn't see anything remarkable there (except maybe that your "System" >> backup includes a rather high count of files). >> >>> With --list --filter="AME" enabled the logs since 2020-09-06 exceed >>> 200Mbyte. Therefore i can't send them. >> It wasn't intended that you publish that log, but rather that you check >> yourself whether the A)dded and M)odified files shown in there are >> somehow as expected. >> >> If it shows files as M)odified that had no content modification, that is >> an indication that the borg files cache maybe doesn't work ok for you / >> for that filesystem type or that the files were slightly "touched" by >> something. >> >> You have to find out then what the root cause for this is, see the FAQ >> and the documentation of --files-cache. >> >> Or if it shows a huge amount of E)rrors, then find out if the errors are >> severe or if they can be ignored (sometimes one can just exclude >> problematic stuff if it does not need to be backed up anyway). >> >> >> >> >>> Am 04.09.20 um 14:29 schrieb Thomas Waldmann: >>>>> Borg runs on a regular daily basis and is working stable. >>>>> But the consumed cpu time varies between around 4 hours and 12-13 hours: >>>> Could be because on different days, a different amount of data changes? >>>> >>>> Or (if you do backups to a remote repo server), the connection speed to >>>> / system load of source or target server is different. >>>> >>>> Without more infos, it is hard to say. 
>>>> >>>> You could start by adding this to borg create: >>>> >>>> --stats --list --filter="AME" >>>> >>>> stats should tell if there was really much data added. >>>> if so, the list output should hint about what it was. > > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup > From borg-samuel at balkonien.org Thu Sep 17 04:32:12 2020 From: borg-samuel at balkonien.org (Samuel Greiner) Date: Thu, 17 Sep 2020 10:32:12 +0200 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: References: <8eee4192-5961-8b19-ee51-47ecc882614d@waldmann-edv.de> <69985d27-c985-017e-446e-e692eb8707e8@waldmann-edv.de> Message-ID: <2f2a7d98-2f69-6b8c-ee5c-320909c4a43d@mailbox.org> Thank you for your advices, I'm now using the most recent version 1.1.13. Still the problem persists. With the option "--list --filter="AME"" I see that some backups run very fast and only a considerable amount of files is changed or added. The backups that have a really long duration backup a whole lot of files as modified which should be untouched. The source is a nextcloud data directory and the data is on a btrfs volume. Do you have advice how I could investigate this further? Thank you very much Samuel Am 10.09.20 um 10:47 schrieb Winfried Sch?rmann: > Just a hint: > it's easy to install the binaries of Borg 1.1.13 not waiting for updates > of repos: > https://borgbackup.readthedocs.io/en/stable/installation.html#pyinstaller-binary > > Winfried > > > Am 10.09.20 um 09:52 schrieb Samuel: >> Okay. >> >> I'm now using 1.1.11 from buster-backports. >> >> Next time I will try to shorten the logs to the relevant part. >> >> You are right, there is a huge amount of files classified as modified, >> which should be unchanged (for several years). >> Now I'm unsure in how to investigate further. >> The files are stored on a btrfs filesystem across 4 HDs as RAID1. 
>> >> Unfortunately I didn't find a hint in the FAQ regarding this behaviour. >> >> I will now wait for the next Backups with the more recent version of >> borgbackup and try to investigate the new logs. >> >> >> Thanks a lot for your help! >> Samuel >> >> Am 09.09.20 um 12:34 schrieb Thomas Waldmann: >>>> Stats was already enabled and the duration of the backup doesn't seem to >>>> depend on the data added (i attached the of the borg backup). >>> You should use a more recent release than borg 1.1.9, see the advisory >>> at the top of the changelog. >>> >>>> Logs with stats enabled but without --list --filter="AME" can be found here: >>>> https://pastebin.com/raw/RW43kBHi >>> Next time you could shorten the logs so they only show the relevant / >>> interesting stuff. >>> >>> Didn't see anything remarkable there (except maybe that your "System" >>> backup includes a rather high count of files). >>> >>>> With --list --filter="AME" enabled the logs since 2020-09-06 exceed >>>> 200Mbyte. Therefore i can't send them. >>> It wasn't intended that you publish that log, but rather that you check >>> yourself whether the A)dded and M)odified files shown in there are >>> somehow as expected. >>> >>> If it shows files as M)odified that had no content modification, that is >>> an indication that the borg files cache maybe doesn't work ok for you / >>> for that filesystem type or that the files were slightly "touched" by >>> something. >>> >>> You have to find out then what the root cause for this is, see the FAQ >>> and the documentation of --files-cache. >>> >>> Or if it shows a huge amount of E)rrors, then find out if the errors are >>> severe or if they can be ignored (sometimes one can just exclude >>> problematic stuff if it does not need to be backed up anyway). >>> >>> >>> >>> >>>> Am 04.09.20 um 14:29 schrieb Thomas Waldmann: >>>>>> Borg runs on a regular daily basis and is working stable. 
>>>>>> But the consumed cpu time varies between around 4 hours and 12-13 hours: >>>>> Could be because on different days, a different amount of data changes? >>>>> >>>>> Or (if you do backups to a remote repo server), the connection speed to >>>>> / system load of source or target server is different. >>>>> >>>>> Without more infos, it is hard to say. >>>>> >>>>> You could start by adding this to borg create: >>>>> >>>>> --stats --list --filter="AME" >>>>> >>>>> stats should tell if there was really much data added. >>>>> if so, the list output should hint about what it was. >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup >> > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From devzero at web.de Thu Sep 17 07:05:57 2020 From: devzero at web.de (Roland) Date: Thu, 17 Sep 2020 13:05:57 +0200 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: <2f2a7d98-2f69-6b8c-ee5c-320909c4a43d@mailbox.org> References: <8eee4192-5961-8b19-ee51-47ecc882614d@waldmann-edv.de> <69985d27-c985-017e-446e-e692eb8707e8@waldmann-edv.de> <2f2a7d98-2f69-6b8c-ee5c-320909c4a43d@mailbox.org> Message-ID: You can have a look with the "stat" utility at timestamp changes of the affected files. I'm using borg with btrfs with no problems. Regards, Roland Am 17.09.20 um 10:32 schrieb Samuel Greiner: > Thank you for your advices, > > I'm now using the most recent version 1.1.13. > > Still the problem persists. > > With the option "--list --filter="AME"" I see that some backups run very > fast and only a considerable amount of files is changed or added. > The backups that have a really long duration backup a whole lot of files > as modified which should be untouched. > > The source is a nextcloud data directory and the data is on a btrfs volume.
> > Do you have advice how I could investigate this further? > > Thank you very much > Samuel > > > > Am 10.09.20 um 10:47 schrieb Winfried Sch?rmann: >> Just a hint: >> it's easy to install the binaries of Borg 1.1.13 not waiting for updates >> of repos: >> https://borgbackup.readthedocs.io/en/stable/installation.html#pyinstaller-binary >> >> Winfried >> >> >> Am 10.09.20 um 09:52 schrieb Samuel: >>> Okay. >>> >>> I'm now using 1.1.11 from buster-backports. >>> >>> Next time I will try to shorten the logs to the relevant part. >>> >>> You are right, there is a huge amount of files classified as modified, >>> which should be unchanged (for several years). >>> Now I'm unsure in how to investigate further. >>> The files are stored on a btrfs filesystem across 4 HDs as RAID1. >>> >>> Unfortunately I didn't find a hint in the FAQ regarding this behaviour. >>> >>> I will now wait for the next Backups with the more recent version of >>> borgbackup and try to investigate the new logs. >>> >>> >>> Thanks a lot for your help! >>> Samuel >>> >>> Am 09.09.20 um 12:34 schrieb Thomas Waldmann: >>>>> Stats was already enabled and the duration of the backup doesn't seem to >>>>> depend on the data added (i attached the of the borg backup). >>>> You should use a more recent release than borg 1.1.9, see the advisory >>>> at the top of the changelog. >>>> >>>>> Logs with stats enabled but without --list --filter="AME" can be found here: >>>>> https://pastebin.com/raw/RW43kBHi >>>> Next time you could shorten the logs so they only show the relevant / >>>> interesting stuff. >>>> >>>> Didn't see anything remarkable there (except maybe that your "System" >>>> backup includes a rather high count of files). >>>> >>>>> With --list --filter="AME" enabled the logs since 2020-09-06 exceed >>>>> 200Mbyte. Therefore i can't send them. 
>>>> It wasn't intended that you publish that log, but rather that you check >>>> yourself whether the A)dded and M)odified files shown in there are >>>> somehow as expected. >>>> >>>> If it shows files as M)odified that had no content modification, that is >>>> an indication that the borg files cache maybe doesn't work ok for you / >>>> for that filesystem type or that the files were slightly "touched" by >>>> something. >>>> >>>> You have to find out then what the root cause for this is, see the FAQ >>>> and the documentation of --files-cache. >>>> >>>> Or if it shows a huge amount of E)rrors, then find out if the errors are >>>> severe or if they can be ignored (sometimes one can just exclude >>>> problematic stuff if it does not need to be backed up anyway). >>>> >>>> >>>> >>>> >>>>> Am 04.09.20 um 14:29 schrieb Thomas Waldmann: >>>>>>> Borg runs on a regular daily basis and is working stable. >>>>>>> But the consumed cpu time varies between around 4 hours and 12-13 hours: >>>>>> Could be because on different days, a different amount of data changes? >>>>>> >>>>>> Or (if you do backups to a remote repo server), the connection speed to >>>>>> / system load of source or target server is different. >>>>>> >>>>>> Without more infos, it is hard to say. >>>>>> >>>>>> You could start by adding this to borg create: >>>>>> >>>>>> --stats --list --filter="AME" >>>>>> >>>>>> stats should tell if there was really much data added. >>>>>> if so, the list output should hint about what it was. 
>>> _______________________________________________ >>> Borgbackup mailing list >>> Borgbackup at python.org >>> https://mail.python.org/mailman/listinfo/borgbackup >>> >> _______________________________________________ >> Borgbackup mailing list >> Borgbackup at python.org >> https://mail.python.org/mailman/listinfo/borgbackup > _______________________________________________ > Borgbackup mailing list > Borgbackup at python.org > https://mail.python.org/mailman/listinfo/borgbackup From tw at waldmann-edv.de Thu Sep 17 07:07:25 2020 From: tw at waldmann-edv.de (Thomas Waldmann) Date: Thu, 17 Sep 2020 13:07:25 +0200 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: <2f2a7d98-2f69-6b8c-ee5c-320909c4a43d@mailbox.org> References: <8eee4192-5961-8b19-ee51-47ecc882614d@waldmann-edv.de> <69985d27-c985-017e-446e-e692eb8707e8@waldmann-edv.de> <2f2a7d98-2f69-6b8c-ee5c-320909c4a43d@mailbox.org> Message-ID: > With the option "--list --filter="AME"" I see that some backups run very > fast and only a considerable amount of files is changed or added. > The backups that have a really long duration backup a whole lot of files > as modified which should be untouched. > > The source is a nextcloud data directory and the data is on a btrfs volume. Is it always mounted at the same mountpoint (same base path)? Check the docs about the files-cache: it uses full path, ctime, inode number and file size (by default) to determine whether a file is still the same. If they all match, processing is very fast, because the file content does not need to be read. Otherwise, if the content might have changed, all content is read, chunked, hashed, deduplicated, and new chunks are compressed, encrypted, authenticated and transferred to the repo. This is significantly slower.
full path -> always use the same mountpoint / have stable full paths
ctime -> avoid stuff like frequent chown -R / chmod -R (because they trigger ctime changes)
inode -> have stable inode numbers (frequently an issue with network filesystems; some have options for stable inode numbers, and you can also use --files-cache=ctime,size to not check for inode number changes)
size -> usually not a problem; if the size changes, the file contents have changed for sure.
-- GPG ID: 9F88FB52FAF7B393 GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393 From borg-samuel at balkonien.org Thu Sep 17 07:54:24 2020 From: borg-samuel at balkonien.org (borg-samuel at balkonien.org) Date: Thu, 17 Sep 2020 13:54:24 +0200 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: References: <8eee4192-5961-8b19-ee51-47ecc882614d@waldmann-edv.de> <69985d27-c985-017e-446e-e692eb8707e8@waldmann-edv.de> <2f2a7d98-2f69-6b8c-ee5c-320909c4a43d@mailbox.org> Message-ID: Thank you! I guess I'm getting closer. There is a chown -R which runs over the whole filesystem there. It was part of hardened file permissions for nextcloud, but now I see they are not recommended anymore. So I will retire this script and hope the backup will be faster in the future. Thank you! Samuel Am 17.09.20 um 13:07 schrieb Thomas Waldmann: >> With the option "--list --filter="AME"" I see that some backups run very >> fast and only a considerable amount of files is changed or added. >> The backups that have a really long duration backup a whole lot of files >> as modified which should be untouched. >> >> The source is a nextcloud data directory and the data is on a btrfs volume. > Is it always mounted to same mountpoint (same base path)? > > Checks the docs about the files-cache, it uses full path, ctime, inode > number and file size (defaults) to determine whether a file is still the > same. > > In that case, processing is very fast, because file content does not > need to be read.
> > In the other case, if content might have changed, all content is read, > chunked, hashed, deduplicated, new chunks compressed, encrypted, > authenticated, transferred to repo. This is significantly slower. > > > full path -> always use same mountpoint / have stable full pathes > > ctime -> avoid stuff like frequent chown -R / chmod -R (because they > trigger ctime change) > > inode -> have stable inode numbers (frequently an issue with network > filesystems, some have options for stable inode numbers and also you can > use --files-cache=ctime,size to not check for inode number change) > > size -> usually not a problem. if size changes, file contents have > changed for sure. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hpj at urpla.net Thu Sep 17 14:41:56 2020 From: hpj at urpla.net (Hans-Peter Jansen) Date: Thu, 17 Sep 2020 20:41:56 +0200 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: References: Message-ID: <1600374067.viCR3OJYjj@xrated> Am Donnerstag, 17. September 2020, 13:54:24 CEST schrieb borg-samuel at balkonien.org: > Thank you! > I guess I'm getting closer. There is a chown -R which runs over the > whole filesystem there. Does that really blindly change all files?!? Tsss. > It was part of hardened file permissions for nextcloud, but now I see > they are not recommended anymore. So I will retire this script and hope > the backup will be faster in the future. You can easily write a script that only changes files that need changing. I've done so in Python a few decades ago. Let me know if you want that.
Cheers, Pete From dave at gasaway.org Fri Sep 18 02:25:01 2020 From: dave at gasaway.org (David Gasaway) Date: Thu, 17 Sep 2020 23:25:01 -0700 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: <1600374067.viCR3OJYjj@xrated> References: <1600374067.viCR3OJYjj@xrated> Message-ID: On Thu, Sep 17, 2020 at 11:42 AM Hans-Peter Jansen wrote: > You can easily write a script, that only changes files, that need changing. > I've done so in Python a few decades ago.. Let me know, if you want that. > I don't know about the OP, but I would like to see it. -- -:-:- David K. Gasaway -:-:- Email: dave at gasaway.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From hpj at urpla.net Sat Sep 19 07:27:06 2020 From: hpj at urpla.net (Hans-Peter Jansen) Date: Sat, 19 Sep 2020 13:27:06 +0200 Subject: [Borgbackup] Backup taking very long - sometimes In-Reply-To: References: <1600374067.viCR3OJYjj@xrated> Message-ID: <4390923.M0sBr9PLBv@xrated> Am Freitag, 18. September 2020, 08:25:01 CEST schrieb David Gasaway: > On Thu, Sep 17, 2020 at 11:42 AM Hans-Peter Jansen wrote: > > You can easily write a script, that only changes files, that need > > changing. > > I've done so in Python a few decades ago.. Let me know, if you want that. > > I don't know about the OP, but I would like to see it. Sure, with pleasure, attached. Please note that it is pretty old-school in some aspects, but it should work properly with any python out there... If some of you guys want to hack on it, drop me a note, please. Will add a setup.py, modernize a few aspects and push it to a GH project then. Cheers, Pete -------------- next part -------------- A non-text attachment was scrubbed... Name: chperm.py Type: text/x-python3 Size: 6565 bytes Desc: not available URL:
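The chperm.py attachment above may not survive every archive mirror, so here is a minimal sketch of the approach Hans-Peter describes: only call chown/chmod where the owner, group or mode actually differs, so untouched files keep their ctime and stay on borg's files-cache fast path. The function name and the 0o640/0o750 defaults are illustrative assumptions, not taken from the attached script:

```python
import os
import stat

def fix_perms(root, uid, gid, file_mode=0o640, dir_mode=0o750):
    """Apply ownership and mode under root, but only where they differ.

    Skipping entries that already match avoids touching their ctime,
    which would otherwise make borg's files cache treat them as modified.
    Returns the number of chown/chmod calls made.
    """
    changed = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISLNK(st.st_mode):
                continue  # leave symlinks alone
            want_mode = dir_mode if stat.S_ISDIR(st.st_mode) else file_mode
            if st.st_uid != uid or st.st_gid != gid:
                os.chown(path, uid, gid)
                changed += 1
            if stat.S_IMODE(st.st_mode) != want_mode:
                os.chmod(path, want_mode)
                changed += 1
    return changed
```

Running this instead of a blanket `chown -R` / `chmod -R` means a second invocation over an unchanged tree makes zero changes, and the next borg backup can skip all those files via the files cache.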