encrypting files + filestreams?

Marshall T. Vandegrift llasram at gmail.com
Wed Aug 15 15:14:23 EDT 2007


per9000 <per9000 at gmail.com> writes:

> I am trying to figure out the best way to encrypt files in python.

Looking at your code and questions, you probably want to pick up a
cryptography handbook of some sort (I'd recommend /Practical
Cryptography/) and give it a read.

> But I have some thoughts about it. By pure luck (?) this file happened
> to be N*512 bytes long so I do not have to add crap at the end - but
> on files of the size N*512 + M (M != 521) I will add some crap to make
> it fit in the algorithm.

BTW, AES has a block size of 16 bytes (128 bits), not 512.

> When I later decrypt I will have the stuff I do not want. How do
> people solve this? (By writing the number of relevant bytes in
> readable text in the beginning of the file?)

There are three basic ways of solving the problem with block ciphers.
As you suggest, you can store the original, unpadded size of the data
alongside it.  The second option is to store the number of padding bytes
appended to the end of the data.  The third is to use the block cipher
in cipher feedback (CFB) or output feedback (OFB) modes, both of which
transform the block cipher into a stream cipher.  The simplest choice
coding-wise is to just use CFB mode, but the "best" choice depends upon
the requirements of your project.
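
To make the second option concrete, here is a minimal sketch of one common
convention (PKCS#7-style padding), in which every padding byte holds the
number of padding bytes.  The names pad, unpad and BLOCK_SIZE are just
illustrative, and this is only one way to do it, not necessarily what any
particular library does for you:

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append 1..block_size bytes, each holding the padding length."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Strip the padding added by pad(), checking that it is well formed."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("input is not a whole number of blocks")
    pad_len = padded[-1]  # the last byte says how many bytes to drop
    if not 1 <= pad_len <= block_size:
        raise ValueError("invalid padding length")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding bytes")
    return padded[:-pad_len]
```

Note that data which is already a multiple of the block size still gets a
full extra block of padding; otherwise unpad could not tell padding from
real data.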

> Also I wonder if this can be solved with filestreams (Are there
> streams in python? The only python file streams I found in the evil
> search engine was stuff in other forums.)

Try looking for information on "file-like objects."  Depending on the
needs of your application, one general solution would be to implement a
file-like object which decorates another file-like object with
encryption on its IO operations.
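
As a rough sketch of that idea, the class below wraps any binary file-like
object and passes data through a callable on the way in and out.  The names
TransformingFile and toy_xor are hypothetical, and toy_xor is only a
stand-in so the example runs without a crypto library; it is NOT
encryption.  In real code a cipher object's encrypt or decrypt method would
take its place:

```python
import io


class TransformingFile:
    """Decorate a binary file-like object, transforming bytes on read/write."""

    def __init__(self, raw, transform):
        self._raw = raw              # the wrapped file-like object
        self._transform = transform  # callable: bytes -> bytes

    def write(self, data: bytes) -> int:
        self._raw.write(self._transform(data))
        return len(data)

    def read(self, size: int = -1) -> bytes:
        return self._transform(self._raw.read(size))

    def close(self) -> None:
        self._raw.close()


def toy_xor(data: bytes) -> bytes:
    """Placeholder transform (assumption for illustration), not secure."""
    return bytes(b ^ 0x5A for b in data)


# Write through the wrapper, then read the same buffer back through it.
buffer = io.BytesIO()
TransformingFile(buffer, toy_xor).write(b"attack at dawn")
stored = buffer.getvalue()  # what actually landed in the underlying file
buffer.seek(0)
recovered = TransformingFile(buffer, toy_xor).read()
```

Because the wrapper only relies on read, write and close, the same class
works over a real file opened in binary mode, a socket file, or an
in-memory buffer as above.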

>     crptz = AES.new("my-secret_passwd")

I realize this is just toy code, but this is almost certainly not what
you want:

  - You'll get a much higher quality key -- and allow arbitrary length
    passphrases -- by producing the key from the passphrase instead of
    using it directly as the key.  For example, taking the SHA-256 hash
    of the passphrase will produce a 32-byte key, a valid AES key size,
    that draws on every character of the passphrase.

  - Instantiating the cipher without specifying a mode and
    initialization vector will cause the resulting cipher object to use
    ECB (electronic codebook) mode.  This causes each identical block in
    the input stream to result in an identical block in the output
    stream, which opens the door for all sorts of attacks.
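
A minimal sketch of the first point, using only the standard library.  The
function name derive_key and the UTF-8 encoding of the passphrase are
assumptions made for illustration:

```python
import hashlib
import os


def derive_key(passphrase: str) -> bytes:
    """Hash a passphrase of any length down to a 32-byte key."""
    return hashlib.sha256(passphrase.encode("utf-8")).digest()


key = derive_key("my-secret_passwd")  # 32 bytes, usable as an AES-256 key
iv = os.urandom(16)                   # fresh random IV, one AES block long
```

With PyCrypto, the key and IV would then be passed along with an explicit
mode, something like AES.new(key, AES.MODE_CFB, iv), rather than relying on
the ECB default.  The IV is not secret, but it has to be stored with the
ciphertext so the data can be decrypted later.  A single hash is also cheap
to brute-force for weak passphrases; a purpose-built function such as
hashlib.pbkdf2_hmac is one option if that matters for your project.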

Hope this helps!

-Marshall



