Pure Python Data Mangling or Encrypting

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu Jun 25 05:25:37 EDT 2015


On Thursday 25 June 2015 14:27, Devin Jeanpierre wrote:

> On Wed, Jun 24, 2015 at 9:07 PM, Steven D'Aprano <steve at pearwood.info>
> wrote:
>> But just sticking to the three above, the first one is partially
>> mitigated by allowing virus scanners to scan the data, but that implies
>> that the owner of the storage machine can spy on the files. So you have a
>> conflict here.
> 
> If it's encrypted malware, and you can't decrypt it, there's no threat.

If the *only* threat is that the sender will send malware, you can mitigate 
that by dropping the file in an unencrypted container. Anything good enough 
to prevent Windows from executing the code, accidentally or deliberately, 
will do: say, a tar file with a custom extension.

But encrypting the file is also a good solution, and it prevents the storage 
machine spying on the file contents too. Provided the encryption is strong.


>> Honestly, the *only* real defence against the spying issue is to encrypt
>> the files. Not obfuscate them with a lousy random substitution cipher.
>> The storage machine can keep the files as long as they like, just by
>> making a copy, and spend hours bruteforcing them. They *will* crack the
>> substitution cipher. In pure Python, that may take a few days or weeks;
>> in C, hours or days. If they have the resources to throw at it, minutes.
>> Substitution ciphers have not been effective encryption since, oh, the
>> 1950s, unless you use a one-time pad. Which you won't be.
> 
> The original post said that the sender will usually send files they
> encrypted, unless they are malicious. So if the sender wants them to
> be encrypted, they already are.

The OP *hopes* that the sender will encrypt the files. I think that's a 
vanishingly faint hope, unless the application itself encrypts the file.

Most people don't have any encryption software beyond password-protecting 
zip files. Zip 2.0 legacy encryption is crap, and there are plenty of tools 
available to break it. WinZip has an extension for 128-bit and 256-bit AES 
encryption, both of which are probably strong enough unless you're targeted 
by the NSA, but the weak link in the chain is the idea that people will 
encrypt the files before sending them. Even if they have the tools, 
laziness being the defining characteristic of most people, they won't use 
them.

> "While the data senders are supposed to encrypt data, that's not
> guaranteed, and I'd like to protect the recipient against exposure to
> nefarious data by mangling or encrypting the data before it is written
> to disk."
> 
> The cipher is just to keep the sender from being able to control what
> is on disk.

The sender has a copy of the application? Then they can see the type of 
obfuscation used. If they know the key, or can guess it, they can take their 
malware, *decrypt* it, and send that, so that *encrypting* that file puts 
the malicious code on the disk.

E.g. suppose I want to send you an insult, but I know your program 
automatically ROT-13s the strings I send you. Then I send you:

'lbhe sngure fzryyf bs ryqreoreevrf'

and your program ROT-13s it to:

'your father smells of elderberries'
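
For anyone who wants to try this at home, here is a minimal sketch using the 
standard library's rot_13 codec. The name `receiver_mangle` is made up for 
illustration; it stands in for whatever the receiving program does to the 
data before writing it to disk.

```python
import codecs

def receiver_mangle(text):
    """Hypothetical stand-in for the receiver's automatic mangling step."""
    return codecs.encode(text, "rot_13")

# What I actually want to end up on your disk.
payload = "your father smells of elderberries"

# ROT-13 is its own inverse, so I pre-mangle the payload before sending it.
sent = receiver_mangle(payload)

# The receiver mangles what I sent, and my payload comes out the other side.
on_disk = receiver_mangle(sent)
```

Here `sent` is the gibberish string above and `on_disk` is the insult, exactly 
as I intended it.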

I know that the OP doesn't propose using ROT-13, but a classical 
substitution cipher isn't that much stronger.
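
The same trick works against any fixed substitution table once the sender 
knows or guesses the key. A rough sketch, assuming a byte-for-byte table 
derived from a seed; `make_table`, `invert_table`, the seed and the payload 
are all my inventions, not anything the OP posted:

```python
import random

def make_table(seed):
    """Build a 256-byte substitution table from a seed (hypothetical key scheme)."""
    values = list(range(256))
    random.Random(seed).shuffle(values)
    return bytes(values)

def invert_table(table):
    """Return the table that undoes `table`."""
    inverse = bytearray(256)
    for plain, cipher in enumerate(table):
        inverse[cipher] = plain
    return bytes(inverse)

# The receiver's "secret" key, assumed here to be known to the sender.
table = make_table(12345)
payload = b"MZ\x90\x00 pretend this is an executable"

# The sender runs the payload through the inverse table before sending it...
sent = payload.translate(invert_table(table))

# ...so the receiver's mangling step writes the original payload to disk.
on_disk = sent.translate(table)
```

Whatever the table looks like, `on_disk` ends up equal to `payload`. The 
mangling only helps if the sender cannot learn the table.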


> I am usually very oppositional when it comes to rolling your own
> crypto, but am I alone here in thinking the OP very clearly laid out
> their case?


I don't think any of us *really* understand his use-case or the potential 
threats, but to my way of thinking, you can never have too strong a cipher, 
and you should never underestimate the risk of users taking short-cuts.



-- 
Steve



