How to best update remote compressed, encrypted archives incrementally?

Steven D'Aprano steve at REMOVETHIScyber.com.au
Sat Mar 11 06:59:18 EST 2006


On Sat, 11 Mar 2006 11:46:24 +0100, robert wrote:

>> Sounds like a job for any number of already existing technologies, like
>> rsync (which, by the way, already uses ssh for the encrypted transmission
>> of data).
> 
> As far as I know, rsync cannot update compressed and encrypted data
> inside an existing file (or set of files)?
> In any case, with rsync I would have to keep a duplicate of the backup
> file layout on the local machine (consuming as much space again as the
> files themselves)?

Let me see if I understand you.

On the remote machine, you have one large file, which is compressed and
encrypted. Call the large file "Archive". Archive is made up of a number
of virtual files, call them A, B, ... Z. Think of Archive as a compressed
and encrypted tar file.

On the local machine, you have some, but not all, of those smaller
files, let's say B, C, D, and E. You want to modify those smaller files,
compress them, encrypt them, transmit them to the remote machine, and
insert them in Archive, replacing the existing B, C, D and E.

Is that correct?

> That's why I ask: how do I combine all these tasks into a cohesive
> encrypted backup solution that wastes neither disk space nor network
> bandwidth?

What's your budget for developing this solution? $100? $1000? $10,000?
Stop me when I get close. Remember, your time is money, and if you are a
developer, every hour you spend on this is costing your employer anything
from AUD$25 to AUD$150. (Of course, if you are working for yourself, you
might value your time as Free.)

If you have an unlimited budget, you can probably create a solution to do
this, keeping in mind that compressed/encrypted and modify-in-place
*rarely* go together. 
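
To make that concrete, here is a minimal sketch (mine, not anything robert
posted) of one reason in-place updates are awkward: the compressed size of a
member depends on its content, so a modified member generally will not fit
the slot the old one occupied. The names original, modified and
fits_in_place are made up for illustration, and zlib stands in for whatever
compressor the archive really uses.

```python
import zlib

# Two versions of the same virtual file "B", identical in uncompressed size.
original = b"a" * 1024              # highly repetitive, compresses very well
modified = bytes(range(256)) * 4    # same length, far less repetitive


def fits_in_place(old: bytes, new: bytes) -> bool:
    """True if the compressed new version is no larger than the old slot."""
    return len(zlib.compress(new)) <= len(zlib.compress(old))


if __name__ == "__main__":
    print("old slot:", len(zlib.compress(original)), "bytes")
    print("new data:", len(zlib.compress(modified)), "bytes")
    print("fits in place:", fits_in_place(original, modified))
```

If the new version is larger, everything after it in Archive has to move,
and with a single compressed stream the later data may need recompressing
as well.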

If you have a lower budget, I'd suggest you drop the "single file"
requirement. Hard disks are cheap, less than an Australian dollar a
gigabyte, so don't get trapped into the false economy of spending $100 of
developer time to save a gigabyte of data. Using multiple files makes it
*much* simpler to modify-in-place: you simply replace the modified file.
Of course the individual files can be compressed and encrypted, or you can
use a compressed/encrypted file system. 
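
As a rough sketch of the multiple-file approach, assuming Python 3.9 or
later: each file is compressed on its own, and only files whose content
changed since the last run are rewritten. Everything named here
(backup_changed, the manifest.json file, the encrypt hook) is hypothetical.
In particular, encrypt is a do-nothing placeholder that you would replace
with a real cipher; the standard library does not provide one.

```python
import gzip
import hashlib
import json
from pathlib import Path


def encrypt(data: bytes) -> bytes:
    """Hypothetical hook: replace with a real cipher. This one does nothing."""
    return data


def backup_changed(src: Path, dest: Path) -> list[str]:
    """Compress each changed file under src into dest.

    Returns the relative names of the files that were (re)written.
    """
    dest.mkdir(parents=True, exist_ok=True)
    manifest_path = dest / "manifest.json"
    # The manifest maps relative file names to the SHA-256 of their content.
    manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    updated = []
    for path in sorted(p for p in src.rglob("*") if p.is_file()):
        name = path.relative_to(src).as_posix()
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        if manifest.get(name) == digest:
            continue  # unchanged since the last run, nothing to replace
        target = dest / (name + ".gz")
        target.parent.mkdir(parents=True, exist_ok=True)
        # Compress first, then encrypt: encrypted data does not compress.
        target.write_bytes(encrypt(gzip.compress(data)))
        manifest[name] = digest
        updated.append(name)
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return updated
```

The output directory can then be pushed to the remote machine with a tool
such as rsync, which only needs to send the files that were replaced. Note
that this sketch keeps plaintext hashes in the manifest, which you may not
want to store next to an encrypted backup.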

Lastly, have you considered that your attempted solution is completely the
wrong way to solve the problem? If you explain _what_ you want to do,
rather than _how_ you want to do it, perhaps there is a better way.


-- 
Steven.



