[Security-sig] Take a decision for os.urandom() in Python 3.6

Ethan Furman ethan at stoneleaf.us
Sun Aug 7 12:28:02 EDT 2016


On 08/07/2016 09:14 AM, Nick Coghlan wrote:
> On 7 August 2016 at 03:21, Guido van Rossum wrote:
>>
>> There is one thing that is still really unresolved for me, and that is
>> a good understanding of how likely this feared event, "not having
>> enough entropy" actually is, for environments where Python may
>> actually be used. My main question is, can it occur in situations
>> *other* than during very early startup? What's the answer for various
>> platforms? Once I'm past this boot phase, can I safely assume
>> os.urandom() will never block, or is there still a possibility for a
>> system to run out of entropy later (say, by excessive calls to
>> os.urandom(), possibly in another process)? The text of [PEP 522]
>> suggests that that is *not* a possibility (since it recommends putting
>> that call in __main__).
>>
>> Anyways, if the answer ends up being "yes, some systems may
>> occasionally run out of entropy during normal operation", I would
>> count that as a further point against PEP 522.
>
> I see folks encountering the new exception proposed in one of two ways:
>
> 1. They're writing Linux system initialisation software, and forgot the
>  system RNG may not be ready yet
> 2. They're running security sensitive Python software on a misconfigured
>  hosting platform that isn't seeding the entropy pool correctly (either
>  in a VM or on an embedded system)
>
> For the first case, I think either approach to blocking (implicit or explicit) is fine.
>
> However, the concern I have with PEP 524 is that in the second case, it makes it incredibly hard for an operations team (who probably aren't going to be Python experts, and are frequently going to be running software they didn't write) to debug the problem - rather than a crashed application with a full Python traceback (which they can take back to the dev team or vendor and ask "What does this mean?", or else look up on the internet themselves), all the platform operators will have to go on is "This application hangs at startup". strace should at least be able to tell them that it's hanging in a getrandom() kernel call, but it's still going to take a pretty capable sysadmin to be able to figure out what's going on.
>
> In a lot of ways, I see it as being similar to our dependency on the Linux platform locale being set correctly to get boundary processing right: if you get an exception, the problem *isn't* generally with the application, it's with the way Linux has been configured. The same holds here - if you get BlockingIOError from os.urandom under PEP [522], there's nothing wrong with your application, but there *is* something wrong with your environment (since security-sensitive Python code should only be run after the system RNG is ready).

+1

If I had not been involved in these discussions about early Linux startup, virtual machines, and os.urandom, I would be completely mystified by the failure when I ran across it (a stalled and eventually killed process), with no clue about the nature of the problem.

At this point we have concrete examples of the harm caused by blocking on os.urandom -- do we have any actual use cases where it is harmful to raise instead?
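
To make the "raise instead" behaviour concrete, here is a minimal sketch of what it could look like from application code. It is not the PEP 522 implementation: it assumes a Python that exposes os.getrandom() and os.GRND_NONBLOCK (Linux only), and the helper name strict_urandom is hypothetical. On platforms without getrandom() it simply falls back to os.urandom().

```python
import errno
import os


def strict_urandom(size):
    """Return `size` random bytes, raising BlockingIOError instead of blocking.

    Hypothetical helper: approximates the raise-instead-of-block behaviour
    discussed in this thread, using os.getrandom() where it is available.
    """
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # No getrandom() wrapper on this platform; use the normal call.
        return os.urandom(size)

    chunks = []
    remaining = size
    while remaining > 0:
        try:
            # GRND_NONBLOCK: fail with BlockingIOError if the kernel's
            # entropy pool has not been initialised yet.
            chunk = getrandom(remaining, os.GRND_NONBLOCK)
        except BlockingIOError:
            raise
        except OSError as exc:
            if exc.errno == errno.ENOSYS:
                # Kernel too old to provide the getrandom() syscall.
                return os.urandom(size)
            raise
        # getrandom() may return fewer bytes than requested, so loop.
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A caller that lets the BlockingIOError propagate gets a crash with a full traceback naming the call that failed, which is exactly the thing an operations team can search for or hand back to the developers, rather than a silent hang at startup.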

--
~Ethan~
