[Borgbackup] Fw: faq entry regarding "added" status for an unchanged file

devzero at web.de devzero at web.de
Wed Dec 19 14:50:05 EST 2018


i spent some time thinking about this and spoke with my colleague - neither of us gets the point. we both care about file/data integrity, so we find this question interesting.

could somebody explain this in other words, with an example?

i take a snapshot "snap1" of the filesystem. let's say the next snapshot (snap2) is taken seconds later.

then i back up the files from snap1, then do a backup from snap2.

files which changed between snap1 and snap2 have a different mtime in snap2. ok.

these are the files being chunked/re-added; all others are skipped because of the borg cache entries...

so why is only _one_ single file being handled in a special way? (as multiple files could have changed...)

sorry, maybe i'm too dumb...

regards
roland


> Sent: Wednesday, 19 December 2018 at 02:52
> From: devzero at web.de
> To: "borgbackuppython.org" <borgbackup at python.org>
> Subject: faq entry regarding "added" status for an unchanged file
>
> i think "files that are backed up from a snapshot" could use a better explanation.
> what is meant here? what "snapshots"?
> are you speaking of virtual machine snapshots?
> 
> what about some additional hint in the log "hey, i'm doing something special here...." ?
> 
> files which "change during save" are always inconsistent - rsync for example warns about this...
> 
> i don't get the real point of what's being addressed with this feature and why borg handles things differently from rsync.
> 
> why is it the default behaviour to always back up the file with the latest mtime - and why not add a special backup option for a special use-case?
> 
> i'm asking this because i accidentally opened a bug ticket because i "observed something strange i could not explain"...
> 
> regards
> roland
> 
> 
> I am seeing ‘A’ (added) status for an unchanged file!?
> 
> The files cache is used to determine whether Borg already “knows” / has backed up a file and if so, to skip the file from chunking. It intentionally does not contain files whose modification time (mtime) is the same as the newest mtime in the created archive.
> 
> So, if you see an ‘A’ status for unchanged file(s), they are likely the files with the most recent mtime in that archive.
> 
> This is expected: it is to avoid data loss with files that are backed up from a snapshot and that are immediately changed after the snapshot (but within mtime granularity time, so the mtime would not change). Without the code that removes these files from the files cache, the change that happened right after the snapshot would not be contained in the next backup as Borg would think the file is unchanged.
> 
> This does not affect deduplication, the file will be chunked, but as the chunks will often be the same and already stored in the repo (except in the above mentioned rare condition), it will just re-use them as usual and not store new data chunks.
> 
> If you want to avoid unnecessary chunking, just create or touch a small or empty file in your backup source file set (so that one has the latest mtime, not your 50GB VM disk image) and, if you do snapshots, do the snapshot after that.
> 
> Since only the files cache is used in the display of files status, those files are reported as being added when, really, chunks are already used.
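
to make the faq entry above concrete, here is a minimal python sketch of the rule as i read it. it is not borg's code: the function name run_backup, the drop_newest switch and the integer mtimes are hypothetical, and the cache here compares only mtime, which is a simplification of whatever borg really stores per file.

```python
def run_backup(files, cache, drop_newest=True):
    """Simulate one backup run (simplified model, not Borg's implementation).

    files: dict mapping path -> (mtime, content)
    cache: dict mapping path -> mtime remembered from the previous run
    Returns (status per path, files cache for the next run).
    """
    status = {}
    for path, (mtime, _content) in files.items():
        if path not in cache:
            status[path] = "A"  # unknown to the cache: chunked, shown as added
        elif cache[path] != mtime:
            status[path] = "M"  # known, but mtime differs: chunked again
        else:
            status[path] = "U"  # known and same mtime: skipped

    # The rule from the FAQ: files carrying the newest mtime of this run
    # are left out of the cache, so the next run has to read them again.
    newest = max(mtime for mtime, _content in files.values())
    new_cache = {
        path: mtime
        for path, (mtime, _content) in files.items()
        if not (drop_newest and mtime == newest)
    }
    return status, new_cache


snap1 = {"a.txt": (100, "old"), "b.img": (200, "v1")}
# b.img is rewritten right after snap1, within mtime granularity (mtime stays 200)
snap2 = {"a.txt": (100, "old"), "b.img": (200, "v2")}

_, cache = run_backup(snap1, {})
with_rule, _ = run_backup(snap2, cache)

_, naive_cache = run_backup(snap1, {}, drop_newest=False)
without_rule, _ = run_backup(snap2, naive_cache, drop_newest=False)

print(with_rule)     # b.img is "A": read again, so "v2" is picked up
print(without_rule)  # b.img is "U": skipped, so "v2" would be missed
```

in this model the rule is not about one single file: every file that shares the newest mtime of the run is left out of the cache. it only looks like one file when a single file happens to be the newest. adding a small marker file with a later mtime, as the faq suggests, moves the "A" status onto that marker and lets the large file stay in the cache.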

