[Cryptography-dev] Create Fernet API allowing streaming encryption and decryption from file-like objects.

Donald Stufft donald at stufft.io
Mon Jan 19 17:27:45 CET 2015


> On Jan 19, 2015, at 11:16 AM, Michael Iverson <dr.michael.iverson at gmail.com> wrote:
> 
> 
> On Mon, Jan 19, 2015 at 10:51 AM, Donald Stufft <donald at stufft.io> wrote:
> This is a fairly obvious way of handling that. However it’ll write a whole bunch of data to decrypted.txt and only fail after the very last chunk.
> 
> That is definitely a concern, and it cannot be readily mitigated, as not keeping everything in memory is exactly what is required. 
> 
> However, I'm not sure the chunk based approach necessarily mitigates this problem either, as you could write out hundreds of chunks, only to have the final chunk fail. Also, having multiple chunks also requires that we somehow manage to ensure that we can identify missing or out-of-order chunks. 

The point here is that while we might fail on the very last chunk, we ensure that each chunk is fully authenticated before we give it to the user. So yes, the user can operate on decrypted and authenticated data which might ultimately fail on the last chunk, but they will never be able to operate on decrypted and unauthenticated data. There is no way that I am aware of to create an API that allows streaming but also prevents the user from reading the streamed data prior to having processed the entire stream. The best that I can think of is one that prevents the user from reading data that we haven’t authenticated to make sure it wasn’t modified in transit.
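
To make that guarantee concrete, here is a minimal sketch using only the standard library. The names (tag_chunk, verified_chunks, InvalidChunk) are made up for illustration and are not a proposed cryptography API. The chunks stand in for ciphertext chunks that would be decrypted only after their tag checks out.

```python
import hashlib
import hmac


class InvalidChunk(Exception):
    """Raised when a chunk's tag does not match its contents."""


def tag_chunk(key, chunk):
    # HMAC-SHA256 over the chunk; a stand-in for whatever tag the real scheme uses.
    return hmac.new(key, chunk, hashlib.sha256).digest()


def verified_chunks(key, tagged_chunks):
    """Yield each chunk only after its tag has been checked."""
    for tag, chunk in tagged_chunks:
        if not hmac.compare_digest(tag, tag_chunk(key, chunk)):
            raise InvalidChunk("chunk failed authentication")
        yield chunk


key = b"k" * 32
stream = [(tag_chunk(key, c), c) for c in (b"one", b"two", b"three")]
# Tamper with the last chunk while keeping its original tag.
stream[-1] = (stream[-1][0], b"thrEe")

released = []
try:
    for chunk in verified_chunks(key, stream):
        released.append(chunk)
    failed = False
except InvalidChunk:
    failed = True
```

The caller has already consumed the first two chunks by the time the failure happens, but the modified third chunk is never handed to them.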

> 
> I'd also be concerned about the cryptographic implications of this. I'm not sure if this is entirely correct, but it seems if you set your chunk size = AES block size, you essentially are encrypting in ECB mode.  

The actual details of what you’d need to do are more involved than just calling encrypt() with the same key on chunks. That’s just a high level “here’s the general idea” thing. In reality you’d encrypt the stream using the streaming encryption APIs (so you’d use something like CBC or CTR), break that output into chunks, and authenticate each of those chunks.

In pseudocode it’d look something like:

def safe_streaming_encrypt(plaintext_iterator):
    for encrypted_chunk in unauthenticated_streaming_encrypt(plaintext_iterator):
        yield HMAC(encrypted_chunk), encrypted_chunk

You’d need this to actually yield bytes which represent a “record” (e.g. an encoded representation of the data of the encrypted chunk, the authentication tag, and anything else the scheme needs). However, this would ensure that you’d never actually give the user unauthenticated data.
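
One possible shape for such a record, as a rough sketch rather than a vetted design: a fixed header (chunk index, final-record flag, ciphertext length), the ciphertext, then an HMAC-SHA256 tag over header plus ciphertext. The layout, the names (HEADER, MAX_CHUNK, encode_records, decode_records) and the use of a separate MAC key are all assumptions for illustration. Covering the index and final flag with the tag is one way to catch the missing or out-of-order chunks mentioned earlier. The input chunks are assumed to already be the output of an unauthenticated streaming encryptor.

```python
import hashlib
import hmac
import io
import struct

TAG_LEN = 32                      # HMAC-SHA256 output size
MAX_CHUNK = 1 << 20               # assumed cap so a forged length can't force a huge read
HEADER = struct.Struct(">QBI")    # chunk index, final flag, ciphertext length


def encode_records(mac_key, encrypted_chunks):
    """Yield one authenticated record (bytes) per ciphertext chunk."""
    it = iter(encrypted_chunks)
    current = next(it, b"")       # an empty stream still gets one final record
    index = 0
    while True:
        following = next(it, None)            # look ahead to spot the last chunk
        final = following is None
        header = HEADER.pack(index, int(final), len(current))
        tag = hmac.new(mac_key, header + current, hashlib.sha256).digest()
        yield header + current + tag
        if final:
            return
        current, index = following, index + 1


def decode_records(mac_key, fileobj):
    """Yield ciphertext chunks from fileobj, each only after it authenticates."""
    expected_index = 0
    while True:
        header = fileobj.read(HEADER.size)
        if len(header) != HEADER.size:
            raise ValueError("stream truncated before the final record")
        index, final, length = HEADER.unpack(header)
        if length > MAX_CHUNK:
            raise ValueError("record length exceeds limit")
        body = fileobj.read(length)
        tag = fileobj.read(TAG_LEN)
        expected = hmac.new(mac_key, header + body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("record failed authentication")
        if index != expected_index:
            raise ValueError("record missing or out of order")
        yield body
        if final:
            return
        expected_index += 1


mac_key = b"m" * 32
chunks = [b"aaaa", b"bb", b"cccccc"]
records = list(encode_records(mac_key, chunks))
roundtrip = list(decode_records(mac_key, io.BytesIO(b"".join(records))))
```

With this layout, dropping the last record shows up as a truncated stream and swapping two records shows up as an index mismatch, in both cases before the affected chunk is released.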

> 
> I would presume there is a block size sufficiently large to mitigate this problem, but I get chills up my spine when I use the word 'presume' in any sentence about cryptography. 
> 

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
