[Borgbackup] Determining size of archive

Bzzzz lazyvirus at gmx.com
Thu Nov 12 01:07:54 EST 2020


On Thu, 12 Nov 2020 05:47:37 +0200 (EET)
"Eric S. Johansson" <eric at in3x.io> wrote:

> I need to determine how effective the deduplication and compression
> will be on a particular data set. According to the documentation, a dry
> run doesn't perform the deduplication or compression. What's the best
> way to check their effectiveness?

Hmm, I would run a test on a representative sample taken from your data
set and extrapolate to the whole - you will have some margin of error,
but it is much faster when trying out several compression methods and
parameters.
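
If you want to measure Borg itself rather than the raw compressors, one
option is to create a throw-away repository per compression setting and
read the figures printed by "borg create --stats" (original, compressed
and deduplicated size). A rough sketch - the sample path and the list of
compression settings are only placeholders, and zstd needs a reasonably
recent borg:

#!/bin/sh
# One scratch repo per compression setting, so the --stats figures are
# not skewed by chunks already stored by a previous run.
SAMPLE=/path/to/representative/sample   # <- adjust
for comp in none lz4 zstd,3 zstd,9 lzma,6; do
    repo="/tmp/borg-test-$(echo "$comp" | tr ',' '-')"
    borg init --encryption=none "$repo"
    # The "Deduplicated size" is what actually lands on disk.
    borg create --stats --compression "$comp" "$repo::sample" "$SAMPLE"
done
# rm -rf /tmp/borg-test-*   # clean up afterwards

The ratio of deduplicated size to original size on the sample is the
number to extrapolate to the whole data set.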

You might also need something like this to help you choose your sample:

#!/bin/sh

usage () {
    echo
    echo "You're doing it wrong!"
    echo
    echo "Usage:    `basename $0`   <directory name>"
    echo
}
if [ ! "$1" ]; then
    usage
    exit 1
fi

clear
echo "Count and sort files by their size from directory: $1"
echo
echo "============================================================================================"
echo "ie:   128 ≤    383 < 256  >>> Means there are 383 files of size [128-256[ BYTES"
echo "============================================================================================"
echo "         [Lower limit]     Nb of files      ]Higher limit["

# Does not work when pasted directly on the command line => run it from a script only!

# Histogram: bucket each file by int(log2(size)) from the size column ($5) of "ls -l".
find "$1" -type f -print0 | xargs -0 ls -l | awk '
    {size[int(log($5)/log(2))]++}
    END{for (i in size) {
            printf("%'"'"'15.f", 2^i)
            printf(" ≤ %'"'"'15.f", size[i])
            printf(" < %'"'"'15.f\n", 2^(i+1))
        }
    }' | sort -n

exit 0
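
Save it as e.g. count_by_size.sh (the name is just an example) and point
it at a candidate directory:

    sh count_by_size.sh /path/to/candidate/sample

Each output line is one power-of-two bucket: lower bound, number of
files in that bucket, upper bound - compare the distribution of your
sample against the full data set before running the compression tests.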



Jean-Yves

