[Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

Donald Stufft donald at stufft.io
Sat Jun 11 17:46:29 EDT 2016


> On Jun 11, 2016, at 5:16 PM, Guido van Rossum <guido at python.org> wrote:
> 
> On Sat, Jun 11, 2016 at 1:48 PM, Donald Stufft <donald at stufft.io> wrote:
> 
>> On Jun 11, 2016, at 3:40 PM, Guido van Rossum <guido at python.org> wrote:
>> 
>> Yeah, but we've already established that there's a lot more upset, rhetoric and worry than warranted by the situation.
> 
> Have we? There are real, documented security failures in the wild because of /dev/urandom’s behavior. This isn’t just a theoretical problem; it has actually had consequences in real life, and those same consequences could just as easily have happened to Python. (In one of the cases that most recently comes to mind it was a C program, but that’s not really relevant, because the same problem would have happened if it had been written in Python using os.urandom on 3.4, though not on 3.5.0 or 3.5.1.)
> 
> Actually it's not clear to me at all that it could have happened to Python. (Wasn't it an embedded system?)

It was a Raspberry Pi that ran a shell script on boot which called ssh-keygen. That script could just as easily have been a Python script that called os.urandom via https://github.com/sybrenstuvel/python-rsa instead of a shell script calling ssh-keygen.
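
For illustration, such a boot script might look something like this in Python. This is just a sketch: the rsa.newkeys() call is python-rsa’s key generation API, but the output path and key size are made up, and python-rsa gets its randomness from os.urandom, which is the whole point.

    import rsa  # https://github.com/sybrenstuvel/python-rsa

    # python-rsa pulls its randomness from os.urandom, so on a 3.4-era
    # interpreter this can run before /dev/urandom has been seeded.
    pubkey, privkey = rsa.newkeys(2048)

    # Hypothetical output path, mirroring what ssh-keygen was doing.
    with open("/etc/myapp/host_key.pem", "wb") as f:
        f.write(privkey.save_pkcs1())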

>> Actually the proposal for that was the secrets module. And the secrets module would be the only user of os.urandom(blocking=True).
> 
> I’m fine if this lives in the secrets module— Steven asked for it to be an os function so that secrets.py could continue to be pure python.
> 
> The main thing that I want to avoid is that people start cargo-culting whatever the secrets module uses rather than just using the secrets module. Having it redundantly available as os.getrandom() is just begging for people to show off how much they know about writing secure code. 


I guess one question would be: what does the secrets module do if it’s on a Linux that is too old to have getrandom(0)? Off the top of my head I can think of:

* Silently fall back to reading os.urandom and hope that it’s been seeded.
* Fall back to os.urandom, hope that it’s been seeded, and emit a SecurityWarning (or something like it) mentioning that it’s falling back to os.urandom and may be getting predictable bytes from /dev/urandom.
* Hard fail because it can’t guarantee secure cryptographic random.

Of the three, I would probably suggest the second one: it doesn’t let the problem happen silently, but it still “works” (it’s basically just hoping it’s being called late enough that /dev/urandom has been seeded), and people can get the third behavior by using the warnings module to turn the warning into an exception.
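
Very roughly, I’d imagine the second option looking something like the sketch below. It assumes a blocking os.getrandom() along the lines we’ve been discussing, and the SecurityWarning class is made up, not something the stdlib has today.

    import os
    import warnings

    class SecurityWarning(RuntimeWarning):
        # Made-up warning class; the name is just a placeholder.
        pass

    def token_bytes(nbytes):
        try:
            # getrandom(0) blocks until the kernel CSPRNG has been seeded.
            return os.getrandom(nbytes)
        except (AttributeError, OSError):
            # Kernel/libc too old for getrandom(); fall back, but say so.
            warnings.warn(
                "falling back to os.urandom(), which may return predictable "
                "bytes if the entropy pool has not been seeded yet",
                SecurityWarning)
            return os.urandom(nbytes)

Anyone who wants the third behavior can just do warnings.simplefilter("error", SecurityWarning) and get a hard failure instead.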


>>  
>> * If you want to ensure you get cryptographically secure bytes, os.getrandom, falling back to os.urandom on non Linux platforms and erroring on Linux.
>> 
>> "Erroring" doesn't sound like it satisfies the "ensure" part of the requirement. And I don't see the advantage of os.getrandom() over the secrets module. (Either way you have to fall back on os.urandom() to suppport Python 3.5 and before.)
> 
> Erroring does satisfy the ensure part, because if it’s not possible to get cryptographically secure bytes then the only option, if you want to be assured of cryptographically secure bytes, is to error.
> 
> It’s a bit like open(“somefile.txt”): it’s reasonable to say that we should ensure open(“somefile.txt”) actually opens ./somefile.txt and doesn’t silently open a different file if ./somefile.txt doesn’t exist; if it can’t open ./somefile.txt it should error. If I *need* cryptographically secure random bytes and I’m on a platform that doesn’t provide them, then erroring is often the correct behavior. This is such an important thing that OS X will flat-out kernel panic and refuse to boot if it can’t ensure that it can give people cryptographically secure random bytes.
> 
> But what is a Python script going to do with that error? IIUC this kind of error would only happen very early during boot time, and rarely, so the most likely outcome is a hard-to-debug mystery failure.

It depends on why they’re calling it, which I suspect is the underlying reason there isn’t agreement about what the right default behavior is. The correct answer for some applications might be to hard fail and wait for the operator to fix the environment they’re running in. It depends on how important the thing consuming this randomness is.

One example: if I were writing a communication platform for people fighting oppressive regimes, or for people discussing their sexual orientation in more dangerous parts of the world, I would want that program to hard fail if it couldn’t ensure it was using an interface that guaranteed cryptographically secure random bytes, because the alternative is predictable numbers and someone possibly being arrested or executed. I know that’s a bit of an extreme edge case, but it’s also the kind of thing people might use Python for, where the unpredictability of the CSPRNG is of the utmost importance.

For other things, the importance will fall somewhere between best effort being good enough and predictable random numbers being catastrophic.
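
To make those two postures concrete, they could be spelled out as something like this. Purely illustrative: the helper name, the flag, and the last-resort fallback are all made up.

    import os
    import random
    import time

    def get_seed_bytes(n, require_secure=True):
        try:
            # Blocks until the kernel CSPRNG is ready (getrandom with flags=0).
            return os.getrandom(n)
        except (AttributeError, OSError):
            if require_secure:
                # The "hard fail and wait for the operator" posture.
                raise RuntimeError(
                    "no cryptographically secure random source available")
            # The best-effort posture: take whatever we can get.
            try:
                return os.urandom(n)
            except NotImplementedError:
                # Last resort, mirroring what random.Random does: the clock.
                rng = random.Random(time.time())
                return bytes(rng.randrange(256) for _ in range(n))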


>  
> It’s a fairly simple decision tree. I go “hey, give me cryptographically secure random bytes, and only cryptographically secure random bytes”. If it cannot give them to me because the system’s APIs cannot guarantee they are cryptographically secure, then there are only two options: either A) it is explicit about its inability to do this and raises an error, or B) it does something completely different from what I asked and pretends that’s what I wanted.
> 
> I really don't believe that there is only one kind of cryptographically secure random bytes. There are many different applications (use cases) of randomness and they need different behaviors. (If it was simple we wouldn't still be arguing. :-) 

I mean for a CSPRNG there’s only one really important property: can an attacker predict the next byte? Any other property doesn’t really matter. Other, non-cryptographic kinds of RNGs want other properties (equidistribution, etc.), but those aren’t cryptographically secure (nor do they need to be).
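
As a quick illustration of the difference, using the proposed secrets module for the cryptographic side:

    import random
    import secrets

    # A Mersenne Twister stream is completely determined by its seed: anyone
    # who knows (or recovers) the seed can predict every byte that follows.
    a = random.Random(1234)
    b = random.Random(1234)
    assert a.getrandbits(64) == b.getrandbits(64)

    # A CSPRNG-backed source has exactly one job: even an attacker who has
    # seen arbitrary amounts of past output cannot predict the next byte.
    token = secrets.token_bytes(16)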

>>  
>> * If you want to *ensure* that there’s no blocking, then os.urandom on Linux (or os.urandom wrapped with timeout code anywhere else, as that’s the only way to ensure not blocking cross platform).
>> 
>> That's fine with me.
>>  
>> * If you just don’t care, YOLO it up with either os.urandom or os.getrandom or random.random.
>> 
>> Now you're just taking the mickey.
> 
> No I’m not— random.Random is such a use case: it wants to seed with the most secure bytes it can get its hands on, but it doesn’t care if it falls back to insecure bytes when secure ones aren’t available. That code even falls back to using the time as a seed if all else fails.
> 
> Fair enough. The hash randomization is the other case I suppose (since not running any Python code at all isn't an option, and neither is waiting indefinitely before the user's code gets control).
> 
> It does show the point that there are different use cases with different needs. But I think the stdlib should limit the choices.
> 
> -- 
> --Guido van Rossum (python.org/~guido)


—
Donald Stufft


