From victor.stinner at gmail.com  Mon Jun 20 17:00:48 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 20 Jun 2016 23:00:48 +0200
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
Message-ID: <CAMpsgwaX1KrgY4_R+qij0B2KtvLX9=_8UoqEWC0J2Vq9PBp4xA@mail.gmail.com>

Hi,

Warning: believe it or not, I only read the first ~50 messages of the
recent discussion about random on the Python bug tracker and then the
python-dev mailing list.

Warning 2: If this email thread gets 100 emails per day, as was the
case on the bug tracker and python-dev, I will have to ignore it
again. Sorry, but I don't have the bandwidth to read so many messages
:-(

Here is a concrete proposal trying to make Python 3.6 more secure on
Linux, without blocking Python at startup.

I suggest sticking to Linux first. Sorry, but I don't have the skills
to propose a concrete change for other platforms, since I don't know
their exact behaviour well, and I'm not sure that they give access to
blocking *and* non-blocking urandom.

Victor


HTML version:
https://haypo-notes.readthedocs.io/pep_random.html


+++++++++++++++++++++++++++++++++++
Make os.urandom() blocking on Linux
+++++++++++++++++++++++++++++++++++

Headers::

    PEP: xxx
    Title: Make os.urandom() blocking on Linux
    Version: $Revision$
    Last-Modified: $Date$
    Author: Victor Stinner <victor.stinner at gmail.com>
    Status: Draft
    Type: Standards Track
    Content-Type: text/x-rst
    Created: 20-June-2016
    Python-Version: 3.6


Abstract
========

Modify ``os.urandom()`` to block on Linux 3.17 and newer until the OS
urandom is initialized.


Rationale
=========

Linux 3.17 adds a new ``getrandom()`` syscall which allows callers to
block until the kernel has collected enough entropy. This avoids
generating weak cryptographic keys.

Python's ``os.urandom()`` uses ``getrandom()``, but falls back on reading
the non-blocking ``/dev/urandom`` if ``getrandom(GRND_NONBLOCK)`` fails
with ``EAGAIN``.
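
As a rough illustration of the fallback just described, here is a minimal Python sketch. It is not the CPython implementation (which is C code). It assumes ``os.getrandom()`` and ``os.GRND_NONBLOCK`` are available, which is only the case on Linux with Python 3.6 or newer, and the helper name ``urandom_with_fallback`` is made up for this example.

```python
import os


def urandom_with_fallback(size):
    """Sketch of the 3.5.2-style behaviour: never block, even if that
    means reading from a pool that is not initialized yet."""
    getrandom = getattr(os, "getrandom", None)  # Linux, Python 3.6+ only
    if getrandom is not None:
        try:
            # GRND_NONBLOCK: fail with EAGAIN instead of blocking.
            return getrandom(size, os.GRND_NONBLOCK)
        except BlockingIOError:
            pass  # EAGAIN: the kernel pool is not initialized yet
        except OSError:
            pass  # e.g. ENOSYS on kernels older than 3.17
    # Fallback path: read the non-blocking device directly.
    # (Short reads are possible for large sizes; a real
    # implementation would loop until ``size`` bytes are read.)
    with open("/dev/urandom", "rb", buffering=0) as f:
        return f.read(size)
```

The caller cannot tell which path produced the bytes, which is exactly the weakness this proposal wants to remove.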

Security experts promote ``os.urandom()`` to generate cryptographic
keys, even instead of ``ssl.RAND_bytes()``.

Python 3.5.0 blocked at startup on virtual machines, waiting for the OS
urandom initialization, which users saw as a regression compared to
Python 3.4.

This PEP proposes to modify ``os.urandom()`` to make it more secure,
while also ensuring that Python will not block at startup.


Changes
=======

* Initialize hash secret from non-blocking OS urandom
* Initialize random._inst, a Random instance, with non-blocking OS
  urandom
* Modify os.urandom() to block until urandom is initialized on Linux

A new private function, ``_PyOS_URandom_Nonblocking()``, will be added:
it reads OS urandom in non-blocking mode. In practice, this means that it
falls back on reading ``/dev/urandom`` on Linux.

_PyRandom_Init() is modified to call _PyOS_URandom_Nonblocking().
Moreover, a new ``random_inst_seed`` will be added to the
``_Py_HashSecret_t`` structure (see above).

random._inst will be initialized with the ``random_inst_seed`` secret. A
flag will be used to ensure that this secret is only used once.

If a second instance of random.Random is created, blocking os.urandom()
will be used.


Alternative
===========

Never use blocking urandom in the random module
-----------------------------------------------

The random module can use ``random_inst_seed`` as a seed, but add other
sources of entropy like the process identifier (``os.getpid()``), the
current time (``time.time()``), memory addresses, etc.
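
A minimal sketch of such mixing, assuming SHA-512 as the combining function (my choice for the example, not something the proposal specifies). The helper name ``mixed_seed`` and the placeholder secret are hypothetical, and the result is not suitable for security sensitive use.

```python
import hashlib
import os
import random
import time


def mixed_seed(startup_secret):
    """Mix the startup secret with cheap, non-blocking process state."""
    h = hashlib.sha512()
    h.update(startup_secret)
    h.update(str(os.getpid()).encode())    # process identifier
    h.update(repr(time.time()).encode())   # current time
    h.update(str(id(h)).encode())          # a memory address (CPython detail)
    return h.digest()


# The bytes literal stands in for the real ``random_inst_seed`` value.
rng = random.Random(mixed_seed(b"placeholder-random_inst_seed"))
```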

Reading 2500 bytes from os.urandom() to initialize the Mersenne Twister
RNG in random.Random is a deliberate choice to get access to the full
range of the RNG. This PEP is a compromise between "security" and
"feature". Python should not block at startup before the OS has collected
enough entropy. But in the regular use case (OS urandom initialized),
the random module should continue to use its existing code to initialize
the seed.

Python 3.5.0 was blocked on ``import random``, not on building a second
instance of ``random.Random``.


Annexes
=======

Why use os.urandom()?
---------------------

Since ``os.urandom()`` is implemented in the kernel, it avoids some
issues of user-space RNGs. For example, it is much harder to get its
state. It is usually built on a CSPRNG, so even if its state is obtained,
it is hard to compute previously generated numbers. The kernel has good
knowledge of entropy sources and regularly feeds the entropy pool.


Linux getrandom()
-----------------

On OpenBSD, FreeBSD and Mac OS X, reading /dev/urandom blocks until the
kernel has collected enough entropy. This is not the case on Linux.
Basically, if a design choice must be made between usability and security,
usability is preferred on Linux, whereas security is preferred on BSD.

The new ``getrandom()`` of Linux 3.17 allows users to choose security by
blocking until the kernel has collected enough entropy.

On virtual machines and some embedded devices, it can take longer than a
minute to collect enough entropy. In the worst case, the application
will block forever because the kernel really has no entropy source and
so cannot unblock ``getrandom()``.


Copyright
=========

This document has been placed in the public domain.

From victor.stinner at gmail.com  Tue Jun 21 10:07:21 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 21 Jun 2016 16:07:21 +0200
Subject: [Security-sig] How to document changes related to security in
 Python changelog?
Message-ID: <CAMpsgwZtXPGPwVq5S=5p=H8dV06uxeAzudfvTCNQkxK3xBp3gQ@mail.gmail.com>

Hi,

I read the summary of Christian Heimes's talk at the language summit:
"The Python security response team"
http://lwn.net/Articles/691308/

Extract: "Some of the problems that have occurred are things like bug
reports being sent to the list, but that couldn't be reproduced, or
distributions not updating their Python packages because it wasn't
clear to them that there was a security fix made in an upstream
release. Heimes suggested that security fixes be clearly marked in the
"News" file that accompanies releases."

I suggest adding a new Security section to Misc/NEWS, so that packagers
can quickly identify changes which should be backported (if they
maintain a Python version which is no longer supported upstream, or if
they cannot use the latest version).

Christian proposed to simply prefix changes with "[Security]".

What do you think?

Victor

From victor.stinner at gmail.com  Tue Jun 21 10:10:46 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 21 Jun 2016 16:10:46 +0200
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
Message-ID: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>

I'm not sure that anyone got my email since I sent it like 1 hour
after Ethan announced the creation of the list, so I repost my email
;-)
--


From ethan at stoneleaf.us  Tue Jun 21 10:52:02 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 21 Jun 2016 07:52:02 -0700
Subject: [Security-sig] How to document changes related to security in
 Python changelog?
In-Reply-To: <CAMpsgwZtXPGPwVq5S=5p=H8dV06uxeAzudfvTCNQkxK3xBp3gQ@mail.gmail.com>
References: <CAMpsgwZtXPGPwVq5S=5p=H8dV06uxeAzudfvTCNQkxK3xBp3gQ@mail.gmail.com>
Message-ID: <57695492.6060100@stoneleaf.us>

On 06/21/2016 07:07 AM, Victor Stinner wrote:

> Extract: "Some of the problems that have occurred are things like bug
> reports being sent to the list, but that couldn't be reproduced, or
> distributions not updating their Python packages because it wasn't
> clear to them that there was a security fix made in an upstream
> release. Heimes suggested that security fixes be clearly marked in the
> "News" file that accompanies releases."

> Christian proposed to simply prefix changes with "[Security]".

Seems good to me -- are there any downsides?

--
~Ethan~


From ethan at stoneleaf.us  Tue Jun 21 11:02:59 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 21 Jun 2016 08:02:59 -0700
Subject: [Security-sig] proposals from Nick and Nathaniel from the Py-Dev
 thread
Message-ID: <57695723.5080908@stoneleaf.us>

Nick expressed:

> The *actual bug* that triggered this latest firestorm of commentary
> (from experts and non-experts alike) had *nothing* to do with user
> code calling os.urandom, and instead was a combination of:
>
> - CPython startup requesting cryptographically secure randomness when
> it didn't need it
> - a systemd init script written in Python running before the kernel
> RNG was fully initialised
>
> That created a deadlock between CPython startup and the rest of the
> Linux init process, so the latter only continued when the systemd
> watchdog timed out and killed the offending script. As others have
> noted, this kind of deadlock scenario is generally impossible on other
> operating systems, as the operating system doesn't provide a way to
> run Python code before the random number generator is ready.
>
> The change Victor made in 3.5.2 to fall back to reading /dev/urandom
> directly if the getrandom() syscall returns EAGAIN (effectively
> reverting to the Python 3.4 behaviour) was the simplest possible fix
> for that problem (and an approach I thoroughly endorse, both for 3.5.2
> and for the life of the 3.5 series), but that doesn't make it the
> right answer for 3.6+.
>
> To repeat: the problem encountered was NOT due to user code calling
> os.urandom(), but rather due to the way CPython initialises its own
> internal hash algorithm at interpreter startup. However, due to the
> way CPython is currently implemented, fixing the regression in that
> not only changed the behaviour of CPython startup, it *also* changed
> the behaviour of every call to os.urandom() in Python 3.5.2+.
>
> For 3.6+, we can instead make it so that the only things that actually
> rely on cryptographic quality randomness being available are:
>
> - calling a secrets module API
> - calling a random.SystemRandom method
> - calling os.urandom directly
>
> These are all APIs that were either created specifically for use in
> security sensitive situations (secrets module), or have long been
> documented (both within our own documentation, and in third party
> documentation, books and Q&A sites) as being an appropriate choice for
> use in security sensitive situations (os.urandom and
> random.SystemRandom).
>
> However, we don't need to make those block waiting for randomness to
> be available - we can update them to raise BlockingIOError instead
> (which makes it trivial for people to decide for themselves how they
> want to handle that case).
>
> Along with that change, we can make it so that starting the
> interpreter will never block waiting for cryptographic randomness to
> be available (since it doesn't need it), and importing the random
> module won't block waiting for it either.
>
> To the best of our knowledge, on all operating systems other than
> Linux, encountering the new exception will still be impossible in
> practice, as there is no known opportunity to run Python code before
> the kernel random number generator is ready.
>
> On Linux, init scripts may still run before the kernel random number
> generator is ready, but will now throw an immediate BlockingIOError if
> they access an API that relies on cryptographic randomness being
> available, rather than potentially deadlocking the init process. Folks
> encountering that situation will then need to make an explicit
> decision:
>
> - loop until the exception is no longer thrown
> - switch to reading from /dev/urandom directly instead of calling os.urandom()
> - switch to using a cross-platform non-cryptographic API (probably the
> random module)
>
> Victor has some additional technical details written up at
> http://haypo-notes.readthedocs.io/pep_random.html and I'd be happy to
> formalise this proposed approach as a PEP (the current reference is
> http://bugs.python.org/issue27282 )

and Nathaniel added:

> I'd make two additional suggestions:
>
> - one person did chime in on the thread to say that they've used
> os.urandom for non-security-sensitive purposes, simply because it
> provided a convenient "give me a random byte-string" API that is
> missing from random. I think we should go ahead and add a .randbytes
> method to random.Random that simply returns a random bytestring using
> the regular RNG, to give these users a nice drop-in replacement for
> os.urandom.
>
> Rationale: I don't think the existence of these users should block
> making os.urandom appropriate for generating secrets, because (1) a
> glance at github shows that this is very unusual -- if you skim
> through this search you get page after page of functions with names
> like "generate_secret_key"
>
>   https://github.com/search?l=python&p=2&q=urandom&ref=searchresults&type=Code&utf8=%E2%9C%93
>
> and (2) for the minority of people who are using os.urandom for
> non-security-sensitive purposes, if they find os.urandom raising an
> error, then this is just a regular bug that they will notice
> immediately and fix, and anyway it's basically never going to happen.
> (As far as we can tell, this has never yet happened in the wild, even
> once.) OTOH if os.urandom is allowed to fail silently, then people who
> are using it to generate secrets will get silent catastrophic
> failures, plus those users can't assume it will never happen because
> they have to worry about active attackers trying to drive systems into
> unusual states. So I'd much rather ask the non-security-sensitive
> users to switch to using something in random, than force the
> cryptographic users to switch to using secrets. But it does seem like
> it would be good to give those non-security-sensitive users something
> to switch to.
>
> - It's not exactly true that the Python interpreter doesn't need
> cryptographic randomness to initialize SipHash -- it's more that
> *some* Python invocations need unguessable randomness (to first
> approximation: all those which are exposed to hostile input), and some
> don't. And since the Python interpreter has no idea which case it's
> in, and since it's unacceptable for it to break invocations that don't
> need unguessable hashes, then it has to err on the side of continuing
> without randomness. All that's fine.
>
> But, given that the interpreter doesn't know which state it's in,
> there's also the possibility that this invocation *will* be exposed to
> hostile input, and the 3.5.2+ behavior gives absolutely no warning
> that this is what's happening. So instead of letting this potential
> error pass silently, I propose that if SipHash fails to acquire real
> randomness at startup, then it should issue a warning. In practice,
> this will almost never happen. But in the rare cases it does, it at
> least gives the user a fighting chance to realize that their system is
> in a potentially dangerous state. And by using the warnings module, we
> automatically get quite a bit of flexibility. If some particular
> invocation (e.g. systemd-cron) has audited their code and decided that
> they don't care about this issue, they can make the message go away:
>
>    PYTHONWARNINGS=ignore::NoEntropyAtStartupWarning
>
> OTOH if some particular invocation knows that they do process
> potentially hostile input early on (e.g. cloud-init, maybe?), then
> they can explicitly promote the warning to an error:
>
>   PYTHONWARNINGS=error::NoEntropyAtStartupWarning
>
> (I guess the way to implement this would be for the SipHash
> initialization code -- which runs very early -- to set some flag, and
> then we expose that flag in sys._something, and later in the startup
> sequence check for it after the warnings module is functional.
> Exposing the flag at the Python level would also make it possible for
> code like cloud-init to do its own explicit check and respond
> appropriately.)
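
To make the warning idea quoted above more concrete, here is a minimal sketch using the ``warnings`` module. Everything in it is an assumption for illustration: ``NoEntropyAtStartupWarning`` does not exist in Python, deriving it from ``RuntimeWarning`` is my choice, and the ``hash_seed_was_random`` argument stands in for the flag that the quoted text calls ``sys._something``.

```python
import warnings


class NoEntropyAtStartupWarning(RuntimeWarning):
    """Hypothetical category; the name comes from the suggestion above."""


def check_startup_entropy(hash_seed_was_random):
    """Warn if the hash secret was not seeded from real randomness.

    ``hash_seed_was_random`` stands in for the flag that the SipHash
    initialization code would set early during startup.
    """
    if not hash_seed_was_random:
        warnings.warn(
            "hash randomization was initialized without OS entropy",
            NoEntropyAtStartupWarning,
            stacklevel=2,
        )


# In-process counterpart of promoting the warning to an error.
with warnings.catch_warnings():
    warnings.simplefilter("error", NoEntropyAtStartupWarning)
    try:
        check_startup_entropy(False)
    except NoEntropyAtStartupWarning as exc:
        print("refusing to continue:", exc)
```

Replacing ``"error"`` with ``"ignore"`` gives the behaviour wanted by callers that have audited their code and do not care about the issue.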

Victor, does your PEP differ from these proposals?  (my apologies for my 
lack of time at the moment).

--
~Ethan~

From barry at python.org  Tue Jun 21 18:40:16 2016
From: barry at python.org (Barry Warsaw)
Date: Tue, 21 Jun 2016 18:40:16 -0400
Subject: [Security-sig] How to document changes related to security in
 Python changelog?
In-Reply-To: <57695492.6060100@stoneleaf.us>
References: <CAMpsgwZtXPGPwVq5S=5p=H8dV06uxeAzudfvTCNQkxK3xBp3gQ@mail.gmail.com>
 <57695492.6060100@stoneleaf.us>
Message-ID: <20160621184016.4ad487a3.barry@wooz.org>

On Jun 21, 2016, at 07:52 AM, Ethan Furman wrote:

>On 06/21/2016 07:07 AM, Victor Stinner wrote:
>> Christian proposed to simply prefix changes with "[Security]".  
>
>Seems good to me -- are there any downsides?

Nothing major IMHO.  The whole point is to make it easy for downstreams to
identify changes.  To that effect, I'd mildly prefer a Misc/NEWS section
because it will be easier to pick out the changes, but OTOH "security" issues
can span multiple sections, so it may just be more accurate to add a
[Security] mark to issues that have a security aspect.

Once downstreams are properly trained on the new mark, it should be just as
easy to search for it.  It *is* a little difficult to search for specific
issues in NEWS that occur after a given release.  I usually search for "What's
new in X.Y" for the baseline X.Y I care about, and then search up for some
reference to the issue I'm looking for.  It wouldn't be much extra work to
also search for [Security].
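
As a rough sketch of that search, the snippet below lists the ``[Security]`` entries that appear above a given baseline heading. The sample text, version numbers and issue numbers are made up for the example, the helper name ``security_entries`` is hypothetical, and the heading format is assumed to be "What's New in Python X.Y?".

```python
import re

# Made-up sample in the style of Misc/NEWS (newest release first).
SAMPLE_NEWS = """\
What's New in Python 9.9.1?
===========================

- [Security] Issue #0002: example entry (made up).
- Issue #0003: another example entry (made up).

What's New in Python 9.9.0?
===========================

- [Security] Issue #0001: older example entry (made up).
"""


def security_entries(news_text, baseline):
    """Return [Security] entries that appear above the baseline heading."""
    heading = "What's New in Python %s?" % baseline
    # NEWS is newest-first, so everything before the baseline heading
    # was added after that release. If the heading is missing, the
    # whole file is searched.
    newer, _, _ = news_text.partition(heading)
    # Only the first line of each tagged entry is captured.
    return re.findall(r"^- \[Security\] (.*)$", newer, flags=re.MULTILINE)


print(security_entries(SAMPLE_NEWS, "9.9.0"))
```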

As an aside, when/if we ever get auto-NEWS file generation (to reduce
conflicts), I would love to get the (git) commit id prepended to the NEWS
item.  Sure, a particular change can span multiple commits, but the one that
changes NEWS should be enough to quickly jump me to the relevant changes.

Cheers,
-Barry

From barry at python.org  Tue Jun 21 18:57:09 2016
From: barry at python.org (Barry Warsaw)
Date: Tue, 21 Jun 2016 18:57:09 -0400
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
Message-ID: <20160621185709.3ab50572.barry@wooz.org>

On Jun 21, 2016, at 04:10 PM, Victor Stinner wrote:

>    PEP: xxx
>    Title: Make os.urandom() blocking on Linux
>    Version: $Revision$
>    Last-Modified: $Date$
>    Author: Victor Stinner <victor.stinner at gmail.com>
>    Status: Draft
>    Type: Standards Track
>    Content-Type: text/x-rst
>    Created: 20-June-2016
>    Python-Version: 3.6
[...]
>
>Alternative
>===========

I would like to ask for some changes to this proto-PEP.

At a minimum, I think it needs a proper treatment of the alternative where
os.urandom() remains (on Linux at least) a thin wrapper around /dev/urandom.
We would add os.getrandom() as the low-level interface to the new C lib
function, and expose any higher level functionality in the secrets module if
necessary.
Then we would also add a strong admonition to the documentation explaining the
trade-offs between os.urandom() and os.getrandom() and point people to the
latter for strong crypto use cases.

Your proto-PEP uses this as a rationale:

    Security experts promote ``os.urandom()`` to generate cryptographic
    keys, even instead of ``ssl.RAND_bytes()``.

and that's been a commonly cited reason for why strengthening os.urandom() is
preferable to adding a more direct mapping to the underlying function that
provides that strengthened randomness.  Even if the assertion is true (and
respectfully, it isn't backed up by any actual citations in the proto-PEP),
that doesn't make it right.  It's also a bad precedent to follow IMHO.  Where
do we draw the line in changing existing APIs to match their use or misuse, as
the case may be?

We can discuss whether your proposal or my[*] alternative is the right one for
Python to follow, and I may lose that argument, but I think it's only proper
and fair to represent this point of view in this proto-PEP.  I do not think a
separate competing PEP is appropriate.

I should also note that my proposed alternative would make the title
incorrect, so I'd like to suggest something like: "Providing a
cryptographically strong source of random bytes."

Cheers,
-Barry

[*] Although labeling it "my" gives me undue credit for points of view also
held and suggested by others; it's just a handy way of referring to it.

From ncoghlan at gmail.com  Tue Jun 21 21:28:05 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 21 Jun 2016 18:28:05 -0700
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security sensitive
 APIs on Linux
Message-ID: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>

Hi folks,

Over the weekend, Nathaniel Smith and I put together a proposal to
allow security sensitive APIs (os.urandom, random.SystemRandom and the
new secrets module) to throw BlockingIOError if the operating system's
random number generator isn't ready.

We think this approach provides all the desired security guarantees,
while being relatively straightforward for affected system integrators
to diagnose and appropriately resolve if they're currently using these
APIs in a context where Linux is currently feeding them potentially
predictable random values.

Rendered: https://www.python.org/dev/peps/pep-0522/
GitHub: https://github.com/python/peps/blob/master/pep-0522.txt

The "Additional Background" section is mainly for the sake of folks
that haven't been following any of the previous discussions, but also
provides the reasoning for why we don't consider retaining consistency
with "man urandom" to be a useful design goal (any more than the
builtin open tries to retain consistency with "man open")

Cheers,
Nick.

=================

PEP: 522
Title: Allow BlockingIOError in security sensitive APIs on Linux
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan at gmail.com>, Nathaniel J. Smith <njs at pobox.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 16 June 2016
Python-Version: 3.6


Abstract
========

A number of APIs in the standard library that return random values nominally
suitable for use in security sensitive operations currently have an obscure
Linux-specific failure mode that allows them to return values that are not,
in fact, suitable for such operations.

This PEP proposes changing such failures in Python 3.6 from the current silent,
hard to detect, and hard to debug, errors to easily detected and debugged errors
by raising ``BlockingIOError`` with a suitable error message, allowing
developers the opportunity to unambiguously specify their preferred approach
for handling the situation.

The APIs affected by this change would be:

* ``os.urandom``
* ``random.SystemRandom``
* the new ``secrets`` module added by PEP 506

The new exception would potentially be encountered in the following situations:

* Python code calling these APIs during Linux system initialization
* Python code running on improperly initialized Linux systems (e.g. embedded
  hardware without adequate sources of entropy to seed the system random number
  generator, or Linux VMs that aren't configured to accept entropy from the
  VM host)

CPython interpreter initialization and ``random`` module initialization would
also be updated to gracefully fall back to alternative seeding options if the
system random number generator is not ready.


Proposal
========

Changing ``os.urandom()`` on Linux
----------------------------------

This PEP proposes that in Python 3.6+, ``os.urandom()`` be updated to call
the new Linux ``getrandom()`` syscall in non-blocking mode if available and
raise ``BlockingIOError: system random number generator is not ready`` if
the kernel reports that the call would block.

This behaviour will then propagate through to higher level standard library
APIs that depend on ``os.urandom`` (specifically ``random.SystemRandom`` and
the new ``secrets`` module introduced by PEP 506).

In all cases, as soon as a call to one of these security sensitive APIs
succeeds, all future calls to these APIs in that process will succeed (once
the operating system random number generator is ready after system boot, it
remains ready).
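
A minimal Python sketch of the proposed behaviour, for illustration only. The real change would be made in CPython's C code; the function name ``proposed_urandom`` is hypothetical, and the sketch assumes ``os.getrandom()`` and ``os.GRND_NONBLOCK`` (Linux, Python 3.6 or newer).

```python
import errno
import os


def proposed_urandom(size):
    """Raise BlockingIOError instead of returning questionable bytes."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # No getrandom() wrapper available: behaviour stays unchanged.
        return os.urandom(size)
    try:
        return getrandom(size, os.GRND_NONBLOCK)
    except BlockingIOError:
        # The kernel reports that the call would block.
        raise BlockingIOError(
            errno.EAGAIN, "system random number generator is not ready"
        ) from None
    except OSError:
        # e.g. ENOSYS on older kernels: behaviour stays unchanged.
        return os.urandom(size)
```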


Related changes
---------------

Currently, SipHash initialization and ``random`` module initialization
both gather random bytes using the same code that underlies
``os.urandom``. This PEP proposes to modify these so that in situations where
``os.urandom`` would raise a ``BlockingIOError``, they automatically
fall back on potentially more predictable sources of randomness (and in the
SipHash case, print a warning message to ``stderr`` indicating that that
particular Python process should not be used to process untrusted data).

To transparently accommodate a potential future where Linux adopts the same
"potentially blocking during system initialization" ``/dev/urandom`` behaviour
used by other \*nix systems, this fallback source of randomness will *not* be
the ``/dev/urandom`` device.


Limitations on scope
--------------------

No changes are proposed for Windows or Mac OS X systems, as neither of those
platforms provides any mechanism to run Python code before the operating
system random number generator has been initialized. Mac OS X goes so far as
to kernel panic and abort the boot process if it can't properly initialize the
random number generator (although Apple's restrictions on the supported
hardware platforms make that exceedingly unlikely in practice).

Similarly, no changes are proposed for other \*nix systems where
``os.urandom()`` will currently block waiting for the system random number
generator to be initialized, rather than returning values that are potentially
unsuitable for use in security sensitive applications.

While other \*nix systems that offer a non-blocking API for requesting random
numbers suitable for use in security sensitive applications could potentially
receive a similar update to the one proposed for Linux in this PEP, such
changes are out of scope for this particular proposal.

Python's behaviour on older Linux systems that do not offer the new
``getrandom()`` syscall will also remain unchanged.


Rationale
=========

Raising ``BlockingIOError`` in ``os.urandom()`` on Linux
--------------------------------------------------------

For several years now, the security community's guidance has been to use
``os.urandom()`` (or the ``random.SystemRandom()`` wrapper) when implementing
security sensitive operations in Python.

To help improve API discoverability and make it clearer that secrecy and
simulation are not the same problem (even though they both involve
random numbers), PEP 506 collected several of the one line recipes based
on the lower level ``os.urandom()`` API into a new ``secrets`` module.

However, this guidance has also come with a longstanding caveat: developers
writing security sensitive software at least for Linux, and potentially for
some \*BSD systems, may need to wait until the operating system's
random number generator is ready before relying on it for security sensitive
operations. This generally only occurs if ``os.urandom()`` is read very
early in the system initialization process, or on systems with few sources of
available entropy (e.g. some kinds of virtualized or embedded systems), but
unfortunately the exact conditions that trigger this are difficult to predict,
and when it occurs there is no direct way for userspace to tell it has
happened without querying operating system specific interfaces.

On \*BSD systems (if the particular \*BSD variant allows the problem to occur
at all), encountering this situation means ``os.urandom()`` will either block
waiting for the system random number generator to be ready (the associated
symptom would be for the affected script to pause unexpectedly on the first
call to ``os.urandom()``) or else will behave the same way as it does on Linux.

On Linux, in Python versions up to and including Python 3.4, and in
Python 3.5 maintenance versions following Python 3.5.2, there's no clear
indicator to developers that their software may not be working as expected
when run early in the Linux boot process, or on hardware without good
sources of entropy to seed the operating system's random number generator: due
to the behaviour of the underlying ``/dev/urandom`` device, ``os.urandom()``
on Linux returns a result either way, and it takes extensive statistical
analysis to show that a security vulnerability exists.

By contrast, if ``BlockingIOError`` is raised in those situations, then
developers using Python 3.6+ can easily choose their desired behaviour:

1. Loop until the call succeeds (security sensitive)
2. Switch to using the random module (non-security sensitive)
3. Switch to reading ``/dev/urandom`` directly (non-security sensitive)


Issuing a warning for potentially predictable internal hash initialization
--------------------------------------------------------------------------

The challenge for internal hash initialization is that it might be very
important to initialize SipHash with a reliably unpredictable random seed
(for processes that are exposed to potentially hostile input) or it might be
totally unimportant (for processes that never have to deal with untrusted data).

The Python runtime has no way to know which case a given invocation involves,
which means that if we allow SipHash initialization to block or error out,
then our intended security enhancement may break code that is already safe
and working fine, which is unacceptable -- especially since we are reasonably
confident that most Python invocations that might run during Linux system
initialization fall into this category (exposure to untrusted input tends to
involve network access, which typically isn't brought up until after the system
random number generator is initialized).

However, at the same time, since Python has no way to know whether any given
invocation needs to handle untrusted data, when the default SipHash
initialization fails this *might* indicate a genuine security problem, which
should not be allowed to pass silently.

Accordingly, if internal hash initialization needs to fall back to a potentially
predictable seed due to the system random number generator not being ready, it
will also emit a warning message on ``stderr`` to say that the system random
number generator is not available and that processing potentially hostile
untrusted data should be avoided.
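
The warn-and-fall-back behaviour described above could be sketched in Python
roughly as follows. This is only an illustration: the real hash initialization
lives in the interpreter's C startup code, and the function name
``hash_seed_or_warn``, the injectable ``source`` parameter and the all-zero
placeholder seed are assumptions made for this example, not part of the
proposal.

```python
import os
import sys


def hash_seed_or_warn(num_bytes=16, source=os.urandom):
    """Return seed bytes, warning on stderr if a predictable fallback is used.

    ``source`` is a hypothetical hook standing in for a system RNG call that
    raises BlockingIOError when the RNG is not ready.
    """
    try:
        return source(num_bytes)
    except BlockingIOError:
        print(
            "warning: system random number generator is not ready; "
            "avoid processing potentially hostile untrusted data",
            file=sys.stderr,
        )
        # Placeholder for "a potentially predictable seed": all zero bytes.
        return bytes(num_bytes)
```

The key property is that the fallback never blocks or raises, but it is also
never silent.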


Allowing potentially predictable ``random`` module initialization
-----------------------------------------------------------------

Other than for ``random.SystemRandom`` (which is a relatively thin
wrapper around ``os.urandom``), the ``random`` module has never made
any guarantees that the numbers it generates are suitable for use in
security sensitive operations, so the use of the system random number
generator to seed the default Mersenne Twister instance is mainly beneficial
as a harm mitigation measure for code that is using the ``random`` module
inappropriately.

Since a single call to ``os.urandom()`` is cheap once the system random
number generator has been initialized, it makes sense to retain that as the
default behaviour. However, there's no need to issue a warning when falling
back to a potentially more predictable alternative: in such cases, a warning
will typically already have been issued as part of interpreter startup, as
the only way for the call when importing the ``random`` module to fail
without the implicit call during interpreter startup also failing is for
the latter to have been skipped by entirely disabling the hash randomization
mechanism.
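
A minimal sketch of that silent fallback, assuming a system RNG call that
raises ``BlockingIOError`` when it is not ready. The helper name
``make_default_rng``, the injectable ``source`` parameter and the choice of
the clock as the alternative seed are illustrative assumptions rather than a
description of the actual ``random`` module implementation.

```python
import os
import random
import time


def make_default_rng(source=os.urandom):
    """Seed a Mersenne Twister from the system RNG, else from the clock."""
    try:
        seed = int.from_bytes(source(32), "little")
    except BlockingIOError:
        # No warning here: the generator is not meant for security
        # sensitive use, so a more predictable seed is acceptable.
        seed = time.time_ns()
    return random.Random(seed)
```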


Backwards Compatibility Impact Assessment
=========================================

Similar to PEP 476, this is a proposal to turn a previously silent security
failure into a noisy exception that requires the application developer to
make an explicit decision regarding the behaviour they desire.

As no changes are proposed for operating systems other than Linux,
``os.urandom()`` retains its existing behaviour as a nominally blocking API
that is non-blocking in practice due to the difficulty of scheduling Python
code to run before the operating system random number generator is ready. We
believe it may be possible to encounter problems akin to those described in
this PEP on at least some \*BSD variants, but nobody has explicitly
demonstrated that. On Mac OS X and Windows, it appears to be straight up
impossible to even try to run a Python interpreter that early in the boot
process.

On Linux, ``os.urandom()`` retains its status as a guaranteed non-blocking API.
However, the means of achieving that status changes in the specific case of
the operating system random number generator not being ready for use in security
sensitive operations: historically it would return potentially predictable
random data, with this PEP it would change to raise ``BlockingIOError``.

Developers of affected applications would then be required to make one of the
following changes to gain forward compatibility with Python 3.6, based on the
kind of application they're developing.


Unaffected Applications
-----------------------

The following kinds of applications would be entirely unaffected by the change,
regardless of whether or not they perform security sensitive operations:

- applications that don't support Linux
- applications that are only run on desktops or conventional servers
- applications that are only run after the system RNG is ready

Applications in this category simply won't encounter the new exception, so it
will be reasonable for developers to wait and see if they receive
Python 3.6 compatibility bugs related to the new runtime behaviour, rather than
attempting to pre-emptively determine whether or not they're affected.


Affected security sensitive applications
----------------------------------------

Security sensitive applications would need to either change their system
configuration so the application is only started after the operating system
random number generator is ready for security sensitive operations, or else
change their code to busy loop until the operating system is ready::

    import os

    def blocking_urandom(num_bytes):
        # Retry until the system RNG is ready
        while True:
            try:
                return os.urandom(num_bytes)
            except BlockingIOError:
                pass
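
Applications that would rather not spin the CPU or wait indefinitely could
use a variant along the following lines. This is a sketch only: the name
``urandom_with_deadline`` and its ``timeout``, ``poll_interval`` and
``source`` parameters are hypothetical, and suitable values depend on the
deployment.

```python
import os
import time


def urandom_with_deadline(num_bytes, timeout=30.0, poll_interval=0.1,
                          source=os.urandom):
    """Retry until the system RNG is ready, giving up after ``timeout`` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return source(num_bytes)
        except BlockingIOError:
            if time.monotonic() >= deadline:
                raise  # let the caller decide what to do next
            time.sleep(poll_interval)
```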


Affected non-security sensitive applications
--------------------------------------------

Non-security sensitive applications that don't want to assume access to
``/dev/urandom`` (or assume a non-blocking implementation of that device)
can be updated to use the ``random`` module as a fallback option::

    import os
    import random

    def pseudorandom_fallback(num_bytes):
        try:
            return os.urandom(num_bytes)
        except BlockingIOError:
            # Not suitable for security sensitive use
            return random.getrandbits(num_bytes*8).to_bytes(num_bytes, "little")

Depending on the application, it may also be appropriate to skip accessing
``os.urandom`` at all, and instead rely solely on the ``random`` module.


Affected Linux specific non-security sensitive applications
-----------------------------------------------------------

Non-security sensitive applications that don't need to worry about cross
platform compatibility and are willing to assume that ``/dev/urandom`` on
Linux will always retain its current behaviour can be updated to access
``/dev/urandom`` directly::

    def dev_urandom(num_bytes):
        with open("/dev/urandom", "rb") as f:
            return f.read(num_bytes)

However, pursuing this option has the downside of contributing to ensuring
that the default behaviour of Linux at the operating system level can never
be changed.


Additional Background
=====================

Why propose this now?
---------------------

The main reason is that the Python 3.5.0 release switched to using the new
Linux ``getrandom()`` syscall when available in order to avoid consuming a
file descriptor [1]_, and this had the side effect of making the following
operations block waiting for the system random number generator to be ready:

* ``os.urandom`` (and APIs that depend on it)
* importing the ``random`` module
* initializing the randomized hash algorithm used by some builtin types

While the first of those behaviours is arguably desirable (and consistent with
``os.urandom``'s existing behaviour on other operating systems), the latter two
behaviours are unnecessary and undesirable, and the last one is now known to
cause a system level deadlock when attempting to run Python scripts during the
Linux init process with Python 3.5.0 or 3.5.1 [2]_, while the second one can
cause problems when using virtual machines without robust entropy sources
configured [3]_.

Since decoupling these behaviours in CPython will involve a number of
implementation changes more appropriate for a feature release than a maintenance
release, the relatively simple resolution applied in Python 3.5.2 was to revert
all three of them to a behaviour similar to that of previous Python versions:
if the new Linux syscall indicates it will block, then Python 3.5.2 will
implicitly fall back on reading ``/dev/urandom`` directly [4]_.

However, this bug report *also* resulted in a range of proposals to add *new*
APIs like ``os.getrandom()`` [5]_, ``os.urandom_block()`` [6]_,
``os.pseudorandom()`` and ``os.cryptorandom()`` [7]_, or adding new optional
parameters to ``os.urandom()`` itself [8]_, and then attempting to educate
users on when they should call those APIs instead of just using a plain
``os.urandom()`` call.

These proposals represent dramatic overreactions, as the question of reliably
obtaining random numbers suitable for security sensitive work on Linux is a
relatively obscure problem of interest mainly to operating system developers
and embedded systems programmers, that in no way justifies cluttering up the
Python standard library's cross-platform APIs with new Linux-specific concerns.
This is especially so with the ``secrets`` module already being added as the
"use this and don't worry about the low level details" option for developers
writing security sensitive software that for some reason can't rely on even
higher level domain specific APIs (like web frameworks) and also don't need to
worry about Python versions prior to Python 3.6.

That said, it's also the case that low cost ARM devices are becoming
increasingly prevalent, with a lot of them running Linux, and a lot of folks
writing Python applications that run on those devices. That creates an
opportunity to take an obscure security problem that currently requires a lot
of knowledge about Linux boot processes and provably unpredictable random
number generation to diagnose and resolve, and instead turn it into a
relatively mundane and easy-to-find-in-an-internet-search runtime exception.


The cross-platform behaviour of ``os.urandom()``
------------------------------------------------

On operating systems other than Linux, ``os.urandom()`` may already block
waiting for the operating system's random number generator to be ready. This
will happen at most once in the lifetime of the process, and the call is
subsequently guaranteed to be non-blocking.

Linux is unique in that, even when the operating system's random number
generator doesn't consider itself ready for use in security sensitive
operations, reading from the ``/dev/urandom`` device will return random values
based on the entropy it has available.

This behaviour is potentially problematic, so Linux 3.17 added a new
``getrandom()`` syscall that (amongst other benefits) allows callers to
either block waiting for the random number generator to be ready, or
else request an error return if the random number generator is not ready.
Notably, the new API does *not* support the old behaviour of returning
data that is not suitable for security sensitive use cases.
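
The "error return" mode can be probed from Python on builds that expose the
syscall as ``os.getrandom()`` together with the ``os.GRND_NONBLOCK`` flag
(later Python releases do so on Linux; such an API was only a proposal when
this draft was written). The helper name ``system_rng_ready`` is an assumption
made for this sketch.

```python
import os


def system_rng_ready():
    """Return True or False for the kernel RNG state, or None if unknown."""
    getrandom = getattr(os, "getrandom", None)
    nonblock = getattr(os, "GRND_NONBLOCK", None)
    if getrandom is None or nonblock is None:
        return None  # platform or Python build without getrandom()
    try:
        getrandom(1, nonblock)
    except BlockingIOError:
        return False  # the syscall would have blocked: RNG not initialized
    except OSError:
        return None  # e.g. the running kernel lacks the syscall
    return True
```

Omitting the flag selects the other mode, in which the call simply blocks
until the random number generator is ready.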

Versions of Python up to and including Python 3.4 access the
Linux ``/dev/urandom`` device directly.

Python 3.5.0 and 3.5.1 called ``getrandom()`` in blocking mode in order to
avoid the use of a file descriptor to access ``/dev/urandom``. While there
were no specific problems reported due to ``os.urandom()`` blocking in user
code, there *were* problems due to CPython implicitly invoking the blocking
behaviour during interpreter startup and when importing the ``random`` module.

Rather than trying to decouple SipHash initialization from the
``os.urandom()`` implementation, Python 3.5.2 switched to calling
``getrandom()`` in non-blocking mode, and falling back to reading from
``/dev/urandom`` if the syscall indicates it will block.

As a result of the above, ``os.urandom()`` in all Python versions up to and
including Python 3.5 propagates the behaviour of the underlying
``/dev/urandom`` device to Python code.
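
Expressed in Python rather than C, the Python 3.5.2 strategy looks roughly
like the sketch below. It assumes a Python build that exposes
``os.getrandom()`` and ``os.GRND_NONBLOCK``, and the function name
``urandom_with_fallback`` is hypothetical; the actual implementation is part
of the interpreter's C code.

```python
import os


def urandom_with_fallback(num_bytes):
    """Try getrandom() in non-blocking mode, else read /dev/urandom."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is not None:
        try:
            # Small requests are returned in full; very large ones may
            # come back short and would need a loop.
            return getrandom(num_bytes, os.GRND_NONBLOCK)
        except OSError:
            # BlockingIOError (the call would block) is a subclass of
            # OSError; an unavailable syscall also ends up here.
            pass
    with open("/dev/urandom", "rb") as f:
        return f.read(num_bytes)
```

Callers of such a function cannot tell which path was taken, which is exactly
the property this PEP seeks to change for security sensitive APIs.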


Problems with the behaviour of ``/dev/urandom`` on Linux
--------------------------------------------------------

The Python ``os`` module has largely co-evolved with Linux APIs, so having
``os`` module functions closely follow the behaviour of their Linux operating
system level counterparts when running on Linux is typically considered to be
a desirable feature.

However, ``/dev/urandom`` represents a case where the current behaviour is
acknowledged to be problematic, but fixing it unilaterally at the kernel level
has been shown to prevent some Linux distributions from booting (at least in
part due to components like Python currently using it for
non-security-sensitive purposes early in the system initialization process).

As an analogy, consider the following two functions::

    def generate_example_password():
        """Generates passwords solely for use in code examples"""
        return generate_unpredictable_password()

    def generate_actual_password():
        """Generates actual passwords for use in real applications"""
        return generate_unpredictable_password()

If you think of an operating system's random number generator as a method for
generating unpredictable, secret passwords, then you can think of Linux's
``/dev/urandom`` as being implemented like::

    # Oversimplified artist's conception of the kernel code
    # implementing /dev/urandom
    def generate_unpredictable_password():
        if system_rng_is_ready:
            return use_system_rng_to_generate_password()
        else:
            # we can't make an unpredictable password; silently return a
            # potentially predictable one instead:
            return "p4ssw0rd"

In this scenario, the author of ``generate_example_password`` is fine - even if
``"p4ssw0rd"`` shows up a bit more often than they expect, it's only used in
examples anyway. However, the author of ``generate_actual_password`` has a
problem - how do they prove that their calls to
``generate_unpredictable_password`` never follow the path that returns a
predictable answer?

In real life it's slightly more complicated than this, because there
might be some level of system entropy available -- so the fallback might
be more like ``return random.choice(["p4ssword", "passw0rd",
"p4ssw0rd"])`` or something even more variable and hence only statistically
predictable with better odds than the author of ``generate_actual_password``
was expecting. This doesn't really make things more provably secure, though;
mostly it just means that if you try to catch the problem in the obvious way --
``if returned_password == "p4ssw0rd": raise UhOh`` -- then it doesn't work,
because ``returned_password`` might instead be ``p4ssword`` or even
``pa55word``, or just an arbitrary 64 bit sequence selected from fewer than
2**64 possibilities. So this rough sketch does give the right general idea of
the consequences of the "more predictable than expected" fallback behaviour,
even though it's thoroughly unfair to the Linux kernel team's efforts to
mitigate the practical consequences of this problem without resorting to
breaking backwards compatibility.

This design is generally agreed to be a bad idea. As far as we can
tell, there are no use cases whatsoever in which this is the behavior
you actually want. It has led to the use of insecure ``ssh`` keys on
real systems, and many \*nix-like systems (including at least Mac OS
X, OpenBSD, and FreeBSD) have modified their ``/dev/urandom``
implementations so that they never return predictable outputs, either
by making reads block in this case, or by simply refusing to run any
userspace programs until the system RNG has been
initialized. Unfortunately, Linux has so far been unable to follow
suit, because it's been empirically determined that enabling the
blocking behavior causes some currently extant distributions to
fail to boot.

Instead, the new ``getrandom()`` syscall was introduced, making
it *possible* for userspace applications to access the system random number
generator safely, without introducing hard to debug deadlock problems into
the system initialization processes of existing Linux distros.


Consequences of ``getrandom()`` availability for Python
-------------------------------------------------------

Prior to the introduction of the ``getrandom()`` syscall, it simply wasn't
feasible to access the Linux system random number generator in a provably
safe way, so we were forced to settle for reading from ``/dev/urandom`` as the
best available option. However, with ``getrandom()`` insisting on raising an
error or blocking rather than returning predictable data, as well as having
other advantages, it is now the recommended method for accessing the kernel
RNG on Linux, with reading ``/dev/urandom`` directly relegated to "legacy"
status. This moves Linux into the same category as other operating systems
like Windows, which doesn't provide a ``/dev/urandom`` device at all: the
best available option for implementing ``os.urandom()`` is no longer simply
reading bytes from the ``/dev/urandom`` device.

This means that what used to be somebody else's problem (the Linux kernel
development team's) is now Python's problem -- given a way to detect that the
system RNG is not initialized, we have to choose how to handle this
situation whenever we try to use the system RNG.

It could simply block, as was somewhat inadvertently implemented in 3.5.0::

    # artist's impression of the CPython 3.5.0-3.5.1 behavior
    def generate_unpredictable_bytes_or_block(num_bytes):
        while not system_rng_is_ready:
            wait
        return unpredictable_bytes(num_bytes)

Or it could raise an error, as this PEP proposes (in *some* cases)::

    # artist's impression of the behavior proposed in this PEP
    def generate_unpredictable_bytes_or_raise(num_bytes):
        if system_rng_is_ready:
            return unpredictable_bytes(num_bytes)
        else:
            raise BlockingIOError

Or it could explicitly emulate the ``/dev/urandom`` fallback behavior,
as was implemented in 3.5.2rc1 and is expected to remain for the rest
of the 3.5.x cycle::

    # artist's impression of the CPython 3.5.2rc1+ behavior
    def generate_unpredictable_bytes_or_maybe_not(num_bytes):
        if system_rng_is_ready:
            return unpredictable_bytes(num_bytes)
        else:
            return (b"p4ssw0rd" * (num_bytes // 8 + 1))[:num_bytes]

(And the same caveats apply to this sketch as applied to the
``generate_unpredictable_password`` sketch of ``/dev/urandom`` above.)

There are five places where CPython and the standard library attempt to use the
operating system's random number generator, and thus five places where this
decision has to be made:

* initializing the SipHash used to protect ``str.__hash__`` and
  friends against DoS attacks (called unconditionally at startup)
* initializing the ``random`` module (called when ``random`` is
  imported)
* servicing user calls to the ``os.urandom`` public API
* the higher level ``random.SystemRandom`` public API
* the new ``secrets`` module public API added by PEP 506

Currently, these five places all use the same underlying code, and
thus make this decision in the same way.

This whole problem was first noticed because 3.5.0 switched that
underlying code to the ``generate_unpredictable_bytes_or_block`` behavior,
and it turns out that there are some rare cases where Linux boot
scripts attempted to run a Python program as part of system initialization, the
Python startup sequence blocked while trying to initialize SipHash,
and then this triggered a deadlock because the system stopped doing
anything -- including gathering new entropy -- until the Python script
was forcibly terminated by an external timer. This is particularly unfortunate
since the scripts in question never processed untrusted input, so there was no
need for SipHash to be initialized with provably unpredictable random data in
the first place. This motivated the change in 3.5.2rc1 to emulate the old
``/dev/urandom`` behavior in all cases (by calling ``getrandom()`` in
non-blocking mode, and then falling back to reading ``/dev/urandom``
if the syscall indicates that the ``/dev/urandom`` pool is not yet
fully initialized).

A similar problem was found due to the ``random`` module calling
``os.urandom`` as a side-effect of import in order to seed the default
global ``random.Random()`` instance.

We have not received any specific complaints regarding direct calls to
``os.urandom()`` or ``random.SystemRandom()`` blocking with 3.5.0 or 3.5.1 -
only problem reports due to the implicit blocking on interpreter startup and
as a side-effect of importing the random module.

Accordingly, this PEP proposes providing consistent shared behaviour for the
latter three cases (ensuring that their behaviour is unequivocally suitable for
all security sensitive operations), while updating the first two cases to
account for that behavioural change.

This approach should mean that the vast majority of Python users never need to
even be aware that this change was made, while those few whom it affects will
receive an exception at runtime that they can look up online and find suitable
guidance on addressing.


References
==========

.. [1] os.urandom() should use Linux 3.17 getrandom() syscall
   (http://bugs.python.org/issue22181)

.. [2] Python 3.5 running on Linux kernel 3.17+ can block at startup or on
   importing the random module on getrandom()
   (http://bugs.python.org/issue26839)

.. [3] "import random" blocks on entropy collection on Linux with low entropy
   (http://bugs.python.org/issue25420)

.. [4] os.urandom() doesn't block on Linux anymore
   (https://hg.python.org/cpython/rev/9de508dc4837)

.. [5] Proposal to add os.getrandom()
   (http://bugs.python.org/issue26839#msg267803)

.. [6] Add os.urandom_block()
   (http://bugs.python.org/issue27250)

.. [7] Add random.cryptorandom() and random.pseudorandom, deprecate os.urandom()
   (http://bugs.python.org/issue27279)

.. [8] Always use getrandom() in os.random() on Linux and add
   block=False parameter to os.urandom()
   (http://bugs.python.org/issue27266)

For additional background details beyond those captured in this PEP, also see
Victor Stinner's summary at http://haypo-notes.readthedocs.io/pep_random.html


Copyright
=========

This document has been placed into the public domain.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From cory at lukasa.co.uk  Wed Jun 22 06:13:28 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Wed, 22 Jun 2016 11:13:28 +0100
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <20160621185709.3ab50572.barry@wooz.org>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
Message-ID: <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>


> On 21 Jun 2016, at 23:57, Barry Warsaw <barry at python.org> wrote:
> 
> At a minimum, I think a proper treatment of the alternative where os.urandom()
> remains (on Linux at least) a thin wrapper around /dev/urandom.  We would add
> os.getrandom() as the low-level interface to the new C lib function, and
> expose any higher level functionality in the secrets module if necessary.
> Then we would also add a strong admonition to the documentation explaining the
> trade-offs between os.urandom() and os.getrandom() and point people to the
> latter for strong crypto use cases.

I'd like to explore this approach further.

In a model like this, os.getrandom() would basically need to have, in its documentation, a recipe for using it in a general-purpose, cross-OS manner. That recipe would be, at minimum, an admonition to use the secrets module.

However, if we're going to implement an entire function in order to say "Do not use this, use secrets instead", why are we bothering? Why add the API surface and a function that needs to be maintained? Why not just make the use of getrandom a private implementation detail of secrets?

Making getrandom() a private detail of secrets has the advantage of freeing us from some backward compatibility concerns, which as we've identified are a real problem here. Given that there's no understandable use case where someone would write anything but "try: os.getrandom(); except AttributeError: os.urandom", it doesn't seem sensible to give people the option to get this wrong.

The other way to approach this is to have os.getrandom() do the appropriate dance, but others have suggested that the os module is intended only to be thin wrappers around things that the OS provides (a confusing argument given that closerange() exists, but that's by the by).

> Your proto-PEP uses this as a rationale:
> 
>    Security experts promotes ``os.urandom()`` to genereate cryptographic
>    keys, even instead of ``ssl.RAND_bytes()``.
> 
> and that's been a commonly cited reason for why strengthening os.urandom() is
> preferable to adding a more direct mapping to the underlying function that
> provides that strengthened randomness.  If if the assertion is true -and
> respectfully, it isn't backed up by any actual citations in the proto-PEP- it
> doesn't make it right.  It's also a bad precedence to follow IMHO.  Where do
> we draw the line in changing existing APIs to their use or misuse as the case
> may be?

Here's some relevant citations:

- https://stackoverflow.com/questions/10341112/whats-more-random-hashlib-or-urandom
- https://cryptography.io/en/latest/random-numbers/
- https://code.google.com/p/googleappengine/issues/detail?id=1055

However, I don't think I agree with your assertion that it's a bad precedent. I think the bad precedent is introducing new functions that do what the old functions should have done. Some examples of this:

- yaml.safe_load, introduced to replace yaml.load which leads to documents like this: https://security.openstack.org/guidelines/dg_avoid-dangerous-input-parsing-libraries.html#incorrect
- PHP's mysql_real_escape_string, introduced to replace mysql_escape_string, which leads to misguided questions like this one: https://security.stackexchange.com/questions/8028/does-mysql-escape-string-have-any-security-vulnerabilities-if-all-tables-using-l

Each of these functions has been a never-ending supply of security vulnerabilities because they encourage users to fall into a pit of failure. Users who are not sufficiently defensive when approaching their code will reach for the most obvious tool in the box, and the Python cryptographic community has spent a long time making os.urandom() the most obvious tool in the box because no other tool was available. The argument, then, is that we should make that tool better, rather than build a new tool and let the old one fester.

> We can discuss whether your proposal or my[*] alternative is the right one for
> Python to follow, and I may lose that argument, but I think it's only proper
> and fair to represent this point of view in this proto-PEP.  I do not think a
> separate competing PEP is appropriate.

I agree with this. The PEP should accurately represent competing views, even if it doesn't agree with them.

Cory



From cory at lukasa.co.uk  Wed Jun 22 06:23:33 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Wed, 22 Jun 2016 11:23:33 +0100
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
Message-ID: <A2CB697F-7D75-421B-AF07-DACC12A2EBB3@lukasa.co.uk>


> On 22 Jun 2016, at 02:28, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> Hi folks,
> 
> Over the weekend, Nathaniel Smith and I put together a proposal to
> allow security sensitive APIs (os.urandom, random.SystemRandom and the
> new secrets module) to throw BlockingIOError if the operating system's
> random number generator isn't ready.

In general I like this approach. One note inline below.

> Limitations on scope
> --------------------
> 
> No changes are proposed for Windows or Mac OS X systems, as neither of those
> platforms provides any mechanism to run Python code before the operating
> system random number generator has been initialized. Mac OS X goes so far as
> to kernel panic and abort the boot process if it can't properly initialize the
> random number generator (although Apple's restrictions on the supported
> hardware platforms make that exceedingly unlikely in practice).
> 
> Similarly, no changes are proposed for other \*nix systems where
> ``os.urandom()`` will currently block waiting for the system random number
> generator to be initialized, rather than returning values that are potentially
> unsuitable for use in security sensitive applications.

You may want to be careful around this point. Solaris provides a getrandom() syscall as well, that Python *does* use. Furthermore, if other *nix OSes provide a getrandom() syscall then the current Python code will favour it over the urandom fallback: care should be taken to clarify what the expected plan is in these cases.

Cory

From victor.stinner at gmail.com  Wed Jun 22 12:52:01 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 22 Jun 2016 18:52:01 +0200
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
Message-ID: <CAMpsgwYxfoOe+Rkbf+i_b-7X62mWrar0=fnjWiNERVE-WMHDHg@mail.gmail.com>

2016-06-22 12:13 GMT+02:00 Cory Benfield <cory at lukasa.co.uk>:
> I agree with this. The PEP should accurately represent competing views, even if it doesn't agree with them.

I have started completing my PEP, but it takes time. I will keep you posted ;-)

Victor

From victor.stinner at gmail.com  Wed Jun 22 12:56:39 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 22 Jun 2016 18:56:39 +0200
Subject: [Security-sig] How to document changes related to security in
 Python changelog?
In-Reply-To: <20160621184016.4ad487a3.barry@wooz.org>
References: <CAMpsgwZtXPGPwVq5S=5p=H8dV06uxeAzudfvTCNQkxK3xBp3gQ@mail.gmail.com>
 <57695492.6060100@stoneleaf.us> <20160621184016.4ad487a3.barry@wooz.org>
Message-ID: <CAMpsgwYNsb4ynvGbeGN-j5EMtMCqZsZMv91ENGuY+oDTqyjXmw@mail.gmail.com>

I don't think that it matters much at this point. We can start with
the [Security] prefix and decide later to move items to a dedicated
section.

I expect that we have 10 security-related changes or fewer. Maybe I'm
wrong and we have way more than that :-)

Victor

2016-06-22 0:40 GMT+02:00 Barry Warsaw <barry at python.org>:
> On Jun 21, 2016, at 07:52 AM, Ethan Furman wrote:
>
>>On 06/21/2016 07:07 AM, Victor Stinner wrote:
>>> Christian proposed to simply prefix changes with "[Security]".
>>
>>Seems good to me -- are there any downsides?
>
> Nothing major IMHO.  The whole point is to make it easy for downstreams to
> identify change.  To that effect, I'd mildly prefer a Misc/NEWS section
> because it will be easier to pick out the changes, but OTOH "security" issues
> can span multiple sections, so it may just be more accurate to add a
> [Security] mark to issues that have a security aspect.
>
> Once downstreams are properly trained on the new mark, it should be just as
> easy to search for it.  It *is* a little difficult to search for specific
> issues in NEWS that occur after a given release.  I usually search for "What's
> new in X.Y" for the baseline X.Y I care about, and then search up for some
> reference to the issue I'm looking for.  It wouldn't be much extra work to
> also search for [Security].
>
> As an aside, when/if we ever get auto-NEWS file generation (to reduce
> conflicts), I would love to get the (git) commit id prepended to the NEWS
> item.  Sure, a particular change can span multiple commits, but the one that
> changes NEWS should be enough to quickly jump me to the relevant changes.
>
> Cheers,
> -Barry
> _______________________________________________
> Security-SIG mailing list
> Security-SIG at python.org
> https://mail.python.org/mailman/listinfo/security-sig

From barry at python.org  Wed Jun 22 20:35:15 2016
From: barry at python.org (Barry Warsaw)
Date: Wed, 22 Jun 2016 20:35:15 -0400
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
Message-ID: <20160622203515.12e20601@anarchist.wooz.org>

On Jun 22, 2016, at 11:13 AM, Cory Benfield wrote:

>In a model like this, os.getrandom() would basically need to have, in its
>documentation, a recipe for using it in a general-purpose, cross-OS
>manner. That recipe would be, at minimum, an admonition to use the secrets
>module.
>
>However, if we're going to implement an entire function in order to say "Do
>not use this, use secrets instead", why are we bothering? Why add the API
>surface and a function that needs to be maintained? Why not just make the use
>of getrandom a private implementation detail of secrets?

Because the os module has traditionally surfaced lower-level operating system
functions, so os.getrandom() would be an extension of this.  That's also why I
advocate simplifying os.urandom() so that it reverts more or less to exposing
/dev/urandom to Python.  With perhaps a few exceptions, os doesn't provide
higher level APIs.

The point here is that, let's say you're an experienced Linux developer and
you know you want to use getrandom(2) in Python.  os.getrandom() is exactly
that.  It's completely analogous to why we provide, e.g. os.chroot() and such.

Now, let's say you just want some guaranteed high quality random bytes, and
you don't really know or care what's being used.  The lower level os functions
are *not* the right APIs to use, but secrets is.  That's why the documentation
points people over there for better, higher-level APIs, and it's there that we
have the freedom to change underlying implementation as needed to deliver on
the promised improved security.
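
To make that split concrete, here is a minimal sketch, assuming Python 3.6+
where the secrets module is part of the standard library. The helper name
make_reset_token is hypothetical and exists only for illustration; the
secrets calls are the documented ones.

```python
import secrets


def make_reset_token(nbytes=32):
    """Hypothetical helper: a URL-safe token from the OS-backed CSPRNG."""
    # secrets chooses the underlying source; the caller never names
    # /dev/urandom or getrandom() directly.
    return secrets.token_urlsafe(nbytes)


raw_key = secrets.token_bytes(16)   # 16 random bytes
token = make_reset_token()          # text token derived from 32 random bytes
```

The calling code states what it needs (unpredictable bytes or a token) and
leaves the choice of operating system interface to the library.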

>Making getrandom() a private detail of secrets has the advantage of freeing
>us from some backward compatibility concerns, which as we've identified are a
>real problem here.

I agree.

>Given that there's no understandable use case where someone would write
>anything but "try: os.getrandom(); except AttributeError: os.urandom", it
>doesn't seem sensible to give people the option to get this wrong.

This doesn't follow though.  Again, it's about providing low-level Python
bindings to underlying operating system functions in os, and higher level APIs
with more cross-platform guarantees in secrets.

>The other way to approach this is to have os.getrandom() do the appropriate
>dance, but others have suggested that the os module is intended only to be
>thin wrappers around things that the OS provides (a confusing argument given
>that closerange() exists, but that's by the by).

As I mentioned, there are exceptions (os.makedirs() is the other one that
comes to mind), but I do think the rule for os should be -and has usually
traditionally been- exactly as you say.  OTOH, neither os.makedirs() nor
os.closerange() are that far removed from their lower level cousins, so that's
a practicality over purity justification.

>However, I don't think I agree with your assertion that it's a bad
>precedent. I think the bad precedent is introducing new functions that do
>what the old functions should have done.

I don't agree that any of this is what os.urandom() should have done.  It's
that people have used it for other purposes and changed what they think it
should have done.  Now we're redefining os.urandom() to fit that new purpose.
That's the bad precedent IMHO.

Cheers,
-Barry

From ethan at stoneleaf.us  Wed Jun 22 21:29:21 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 22 Jun 2016 18:29:21 -0700
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <20160622203515.12e20601@anarchist.wooz.org>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
Message-ID: <576B3B71.3000303@stoneleaf.us>

Barry, Cory, et al:

We all know there are two camps here:

- Those that want "secure by default" behavior, and
- Those that want "thin wrapper" behavior.

We have discussed the reasoning behind those two camps ad nauseam on 
Python Dev, with fairly disastrous results.  I did not create this list 
so we could do it again.

At this point we have two PEPs going.  Let's make sure that whichever 
PEP we take back to Py-Dev includes all the arguments and objections 
noted, and then let Guido or his delegate make the final call.

Please.

--
~Ethan~

From ncoghlan at gmail.com  Wed Jun 22 21:31:07 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 22 Jun 2016 18:31:07 -0700
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <20160622203515.12e20601@anarchist.wooz.org>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
Message-ID: <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>

On 22 June 2016 at 17:35, Barry Warsaw <barry at python.org> wrote:
> On Jun 22, 2016, at 11:13 AM, Cory Benfield wrote:
>
>>In a model like this, os.getrandom() would basically need to have, in its
>>documentation, a recipe for using it in a general-purpose, cross-OS
>>manner. That recipe would be, at minimum, an admonition to use the secrets
>>module.
>>
>>However, if we're going to implement an entire function in order to say "Do
>>not use this, use secrets instead", why are we bothering? Why add the API
>>surface and a function that needs to be maintained? Why not just make the use
>>of getrandom a private implementation detail of secrets?
>
> Because the os module has traditionally surfaced lower-level operating system
> functions, so os.getrandom() would be an extension of this.  That's also why I
> advocate simplifying os.urandom() so that it reverts more or less to exposing
> /dev/urandom to Python.  With perhaps a few exceptions, os doesn't provide
> higher level APIs.
>
> The point here is that, let's say you're an experienced Linux developer and
> you know you want to use getrandom(2) in Python.  os.getrandom() is exactly
> that.  It's completely analogous to why we provide, e.g. os.chroot() and such.

My own objection (as spelled out in PEP 522) is only to leaving
os.urandom() silently broken when we have the ability to improve on
that - it's an "errors pass silently" and "guessing in the face of
ambiguity" scenario that we previously couldn't sensibly do anything
about, but now have additional options to better handle on behalf of
our users.

As long as os.urandom() is fixed to fail cleanly rather than silently,
I don't object to exposing os.getrandom() as well for the sake of
folks writing Linux specific software that want direct access to the
kernel's blocking behaviour rather than a busy loop. I *do* object to
any solution that proposes that all correct cross-platform code that
needs reliably unpredictable random data necessarily end up looking
like:

    try:
        my_random = os.getrandom
    except AttributeError:
        my_random = os.urandom

With the simpler and cleaner "my_random = os.urandom" continuing to
risk silent security failures if the software is used in an
unanticipated context.

Instead, I'm after an outcome for os.urandom() akin to that in PEP
418, where time.time() now looks for several other preferred options
before falling back to _time.time() as a last resort:
https://www.python.org/dev/peps/pep-0418/#time-time

Even if we did add a blocking getrandom() though, I'd still advocate
for secrets and random.SystemRandom to throw BlockingIOError by
default  - with system RNG initialisation being a "once and done"
thing and os.getrandom() exposed, it becomes straightforward to add an
application level "Wait for the random number generator to be ready"
check:

    try:
        wait_for_system_rng = os.getrandom
    except AttributeError:
        pass
    else:
        wait_for_system_rng(1)

The hard part is then knowing that you *need* to wait. If you're
silently getting more-predictable-than-you-expected random data, you
may never realise. If your system hangs, you might eventually figure
it out, but only after a likely frustrating debugging effort. By
contrast, if your application fails with "BlockingIOError: system
random number generator not ready", then you can search for that on
the internet, see the above snippet for "How to wait for the system
random number generator to be ready on Linux" and stick that into your
code.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From guido at python.org  Wed Jun 22 22:15:41 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 22 Jun 2016 19:15:41 -0700
Subject: [Security-sig] Can /dev/urandom ever revert from the "good" to the
 "bad" state?
Message-ID: <CAP7+vJJzBUd=aAd7M_=4iF5EV3WLjOPRhC1OqhXnxO72Bj5zyw@mail.gmail.com>

Before I can possibly start thinking about what to do when the system's
CSPRNG is initialized, I need to understand more about how it works.
Apparently there's a possible transition from the "not ready yet" ("bad")
state to "ready" ("good"), and all it takes is usually waiting for a second
or two. But is this a wait that only gets incurred once, somewhere early
after a boot, or is this something that can happen at any time?

-- 
--Guido van Rossum (python.org/~guido)

From donald at stufft.io  Wed Jun 22 22:18:09 2016
From: donald at stufft.io (Donald Stufft)
Date: Wed, 22 Jun 2016 22:18:09 -0400
Subject: [Security-sig] Can /dev/urandom ever revert from the "good" to
 the "bad" state?
In-Reply-To: <CAP7+vJJzBUd=aAd7M_=4iF5EV3WLjOPRhC1OqhXnxO72Bj5zyw@mail.gmail.com>
References: <CAP7+vJJzBUd=aAd7M_=4iF5EV3WLjOPRhC1OqhXnxO72Bj5zyw@mail.gmail.com>
Message-ID: <99E3006C-31F3-466B-92DA-E1396D1E3995@stufft.io>


> On Jun 22, 2016, at 10:15 PM, Guido van Rossum <guido at python.org> wrote:
> 
> Before I can possibly start thinking about what to do when the system's CSPRNG is initialized, I need to understand more about how it works. Apparently there's a possible transition from the "not ready yet" ("bad") state to "ready" ("good"), and all it takes is usually waiting for a second or two. But is this a wait that only gets incurred once, somewhere early after a boot, or is this something that can happen at any time?


Once, only after boot. On most (all?) modern Linux systems there's even part of the boot process that attempts to seed the CSPRNG using random values stored during a previous boot to shorten the time window between when it's ready and when it's not yet initialized. However, once it is initialized it will never block (or EAGAIN) again.


--
Donald Stufft



From guido at python.org  Wed Jun 22 22:29:33 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 22 Jun 2016 19:29:33 -0700
Subject: [Security-sig] Can /dev/urandom ever revert from the "good" to
 the "bad" state?
In-Reply-To: <99E3006C-31F3-466B-92DA-E1396D1E3995@stufft.io>
References: <CAP7+vJJzBUd=aAd7M_=4iF5EV3WLjOPRhC1OqhXnxO72Bj5zyw@mail.gmail.com>
 <99E3006C-31F3-466B-92DA-E1396D1E3995@stufft.io>
Message-ID: <CAP7+vJJgj3HDPjuOSgJfzJY-8fT9jZxXU2iOoCpLp_6Y4r7ifg@mail.gmail.com>

On Wed, Jun 22, 2016 at 7:18 PM, Donald Stufft <donald at stufft.io> wrote:

>
> On Jun 22, 2016, at 10:15 PM, Guido van Rossum <guido at python.org> wrote:
>
> Before I can possibly start thinking about what to do when the system's
> CSPRNG is initialized, I need to understand more about how it works.
> Apparently there's a possible transition from the "not ready yet" ("bad")
> state to "ready" ("good"), and all it takes is usually waiting for a second
> or two. But is this a wait that only gets incurred once, somewhere early
> after a boot, or is this something that can happen at any time?
>
>
>
> Once, only after boot. On most (all?) modern Linux systems there's even
> part of the boot process that attempts to seed the CSPRNG using random
> values stored during a previous boot to shorten the time window between
> when it's ready and when it's not yet initialized. However, once it is
> initialized it will never block (or EAGAIN) again.
>

Then shouldn't it be the responsibility of the boot sequence rather than of
the Python stdlib to wait for that event? IIUC that's what OS X does (I
think someone described that it even kernel-panics when it can't enter the
"good" state).

-- 
--Guido van Rossum (python.org/~guido)

From donald at stufft.io  Wed Jun 22 22:37:16 2016
From: donald at stufft.io (Donald Stufft)
Date: Wed, 22 Jun 2016 22:37:16 -0400
Subject: [Security-sig] Can /dev/urandom ever revert from the "good" to
 the "bad" state?
In-Reply-To: <CAP7+vJJgj3HDPjuOSgJfzJY-8fT9jZxXU2iOoCpLp_6Y4r7ifg@mail.gmail.com>
References: <CAP7+vJJzBUd=aAd7M_=4iF5EV3WLjOPRhC1OqhXnxO72Bj5zyw@mail.gmail.com>
 <99E3006C-31F3-466B-92DA-E1396D1E3995@stufft.io>
 <CAP7+vJJgj3HDPjuOSgJfzJY-8fT9jZxXU2iOoCpLp_6Y4r7ifg@mail.gmail.com>
Message-ID: <54E16C16-F9A3-4D39-8A36-4E356075A3A2@stufft.io>


> On Jun 22, 2016, at 10:29 PM, Guido van Rossum <guido at python.org> wrote:
> 
> On Wed, Jun 22, 2016 at 7:18 PM, Donald Stufft <donald at stufft.io> wrote:
> 
>> On Jun 22, 2016, at 10:15 PM, Guido van Rossum <guido at python.org> wrote:
>> 
>> Before I can possibly start thinking about what to do when the system's CSPRNG is initialized, I need to understand more about how it works. Apparently there's a possible transition from the "not ready yet" ("bad") state to "ready" ("good"), and all it takes is usually waiting for a second or two. But is this a wait that only gets incurred once, somewhere early after a boot, or is this something that can happen at any time?
> 
> 
> Once, only after boot. On most (all?) modern Linux systems there's even part of the boot process that attempts to seed the CSPRNG using random values stored during a previous boot to shorten the time window between when it's ready and when it's not yet initialized. However, once it is initialized it will never block (or EAGAIN) again.
> 
> Then shouldn't it be the responsibility of the boot sequence rather than of the Python stdlib to wait for that event? IIUC that's what OS X does (I think someone described that it even kernel-panics when it can't enter the "good" state).
> 

In an ideal world? Yes. However, we live in a non-ideal world where Linux doesn't ensure that, so absent Linux deciding to do something like what OS X, FreeBSD, Windows, OpenBSD, etc. do, we have to make a choice: either we pass along the possibility that Linux left us with, and make it so people who attempt to use Python early in the boot sequence can get predictable random numbers (without any way to determine if they're getting "good" or "bad" numbers), or we use the newer API that Linux has given us to make that assurance.

AFAIK Linux (or, well, Ted) has stated that the way for people who care about getting cryptographically secure random out of the kernel is to use getrandom(0) (or getrandom(GRND_NONBLOCK) and fail on an EAGAIN), so the question I think really comes down to whether os.urandom is something we want to provide the best source of (generally) non-blocking CSPRNG or whether we want it to be a narrow wrapper around whatever semantics /dev/urandom specifically has.

--
Donald Stufft



From tim.peters at gmail.com  Wed Jun 22 22:40:02 2016
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 22 Jun 2016 21:40:02 -0500
Subject: [Security-sig] Can /dev/urandom ever revert from the "good" to
 the "bad" state?
In-Reply-To: <99E3006C-31F3-466B-92DA-E1396D1E3995@stufft.io>
References: <CAP7+vJJzBUd=aAd7M_=4iF5EV3WLjOPRhC1OqhXnxO72Bj5zyw@mail.gmail.com>
 <99E3006C-31F3-466B-92DA-E1396D1E3995@stufft.io>
Message-ID: <CAExdVNn-wFaqQ8ahwjQVNRWsMPWdck4LcuwNyFSDTK2p6-09_Q@mail.gmail.com>

[Guido]
> Before I can possibly start thinking about what to do when the system's
> CSPRNG is initialized, I need to understand more about how it works.
> Apparently there's a possible transition from the "not ready yet" ("bad")
> state to "ready" ("good"), and all it takes is usually waiting for a second
> or two. But is this a wait that only gets incurred once, somewhere early
> after a boot, or is this something that can happen at any time?

[Donald Stufft]
> Once, only after boot. On most (all?) modern Linux systems there's even part
> of the boot process that attempts to seed the CSPRNG using random values
> stored during a previous boot to shorten the time window between when it's
> ready and when it's not yet initialized. However, once it is initialized it
> will never block (or EAGAIN) again.

Donald, at the end you're talking about how getrandom() behaves -
/dev/urandom on Linux never blocks, as I understand it (but there's no
advertised way to tell when /dev/urandom enters the "good" state).


[Guido]
> Then shouldn't it be the responsibility of the boot sequence rather than
> of the Python stdlib to wait for that event? IIUC that's what OS X
> does (I think someone described that it even kernel-panics when it can't
> enter the "good" state).

The rub is that sometimes Python is running soooo early in the boot
sequence in these rare Linux cases.  That's said to be impossible on
OS X (or Windows).

From donald at stufft.io  Wed Jun 22 23:02:00 2016
From: donald at stufft.io (Donald Stufft)
Date: Wed, 22 Jun 2016 23:02:00 -0400
Subject: [Security-sig] Can /dev/urandom ever revert from the "good" to
 the "bad" state?
In-Reply-To: <CAExdVNn-wFaqQ8ahwjQVNRWsMPWdck4LcuwNyFSDTK2p6-09_Q@mail.gmail.com>
References: <CAP7+vJJzBUd=aAd7M_=4iF5EV3WLjOPRhC1OqhXnxO72Bj5zyw@mail.gmail.com>
 <99E3006C-31F3-466B-92DA-E1396D1E3995@stufft.io>
 <CAExdVNn-wFaqQ8ahwjQVNRWsMPWdck4LcuwNyFSDTK2p6-09_Q@mail.gmail.com>
Message-ID: <9766EAF5-7E94-43DD-BAFF-076AC6776C20@stufft.io>


> On Jun 22, 2016, at 10:40 PM, Tim Peters <tim.peters at gmail.com> wrote:
> 
> [Guido]
>> Before I can possibly start thinking about what to do when the system's
>> CSPRNG is initialized, I need to understand more about how it works.
>> Apparently there's a possible transition from the "not ready yet" ("bad")
>> state to "ready" ("good"), and all it takes is usually waiting for a second
>> or two. But is this a wait that only gets incurred once, somewhere early
>> after a boot, or is this something that can happen at any time?
> 
> [Donald Stufft]
>> Once, only after boot. On most (all?) modern Linux systems there's even part
>> of the boot process that attempts to seed the CSPRNG using random values
>> stored during a previous boot to shorten the time window between when it's
>> ready and when it's not yet initialized. However, once it is initialized it
>> will never block (or EAGAIN) again.
> 
> Donald, at the end you're talking about how getrandom() behaves -
> /dev/urandom on Linux never blocks, as I understand it (but there's no
> advertised way to tell when /dev/urandom enters the "good" state).

Yes, sorry. Guido asked about the system CSPRNG; in Linux there are three
(previously two) basic interfaces to the same CSPRNG:

/dev/urandom
  - This will never block, but until it gathers enough entropy in the boot
    process it will silently return data that is not cryptographically secure.
    Essentially, predictably random, however to what degree it is predictable
    depends on a lot of factors. As far as I am aware, there is no practical
    way to determine "given a read of /dev/urandom, did I get 'good' or 'bad'
    data out of it".

/dev/random
   - This will randomly block whenever the kernel thinks that the entropy is
     "running low". All security experts I'm aware of with maybe the exception
     of Ted (I don't know how he feels about this) believe that this action of
     counting entropy is pure bollocks and that /dev/random randomly blocking
     because it thinks the entropy is low achieves nothing except to hurt the
     performance of things that need randomness at runtime.

And on newer kernels there is the getrandom() syscall, which has flags that
enable three different modes of operation:

getrandom(0)
   - This will block until the same "pool" of entropy that /dev/urandom uses
     has been initialized once, at boot, and then it will never block again.

getrandom(GRND_NONBLOCK)
   - This will return a -1 and set errno to EAGAIN if the same pool of entropy
     that /dev/urandom uses has not been initialized, and will otherwise always
     return data. This is essentially the same as getrandom(0) except instead
     of blocking it returns an error.

getrandom(GRND_RANDOM)
   - This is basically just a syscall interface to /dev/random and it doesn't
     meaningfully deviate from what /dev/random does, except that it does not
     require a file descriptor to use it.

This getrandom() interface is the newer way to access these two types of random
and I think it is important to notice that this newer interface does *not* have
a way to get "sometimes a CSPRNG, sometimes not" data out of it like /dev/urandom
does. This newer interface promises that you'll always get cryptographically secure
random and it will either block until it can do that or will EAGAIN to let you
take some other action instead of relying on a CSPRNG if that suits your application.
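
The non-blocking mode described above can be sketched from Python, assuming
Python 3.6+ on Linux where os.getrandom() and os.GRND_NONBLOCK are exposed.
The helper name probe_kernel_rng and its status strings are hypothetical,
chosen only for illustration.

```python
import os


def probe_kernel_rng(nbytes=16):
    """Hypothetical helper: report the kernel CSPRNG state without blocking."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # No getrandom() binding on this platform or Python version.
        return "unavailable", None
    try:
        # Same pool as getrandom(0), but an unseeded pool raises
        # instead of making the caller wait.
        data = getrandom(nbytes, os.GRND_NONBLOCK)
    except BlockingIOError:
        # EAGAIN: the pool has not been initialized yet. Calling
        # getrandom(nbytes, 0) here would block until it has been.
        return "not-ready", None
    except OSError:
        # For example ENOSYS on a kernel that predates the syscall.
        return "unavailable", None
    return "ready", data


status, data = probe_kernel_rng()
```

On a system that has finished booting, the expected status is "ready"; the
"not-ready" branch should only be reachable very early in the boot sequence.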


> 
> 
> [Guido]
>> Then shouldn't it be the responsibility of the boot sequence rather than
>> of the Python stdlib to wait for that event? IIUC that's what OS X
>> does (I think someone described that it even kernel-panics when it can't
>> enter the "good" state).
> 
> The rub is that sometimes Python is running soooo early in the boot
> sequence in these rare Linux cases.  That's said to be impossible on
> OS X (or Windows).

Yes, once the system has booted and initialized then all forms of accessing the
/dev/urandom pool (/dev/urandom, getrandom(0), getrandom(GRND_NONBLOCK)) function
basically the same (plus or minus a file descriptor). The problem comes in a few
flavors but really they all boil down to the same thing: Code that is calling
os.urandom() prior to the /dev/urandom CSPRNG being initialized.

The primary case this will happen is code that is called early on in the boot
sequence prior to pid 0 initializing the urandom CSPRNG from random data saved
in the previous boot [1]. There are other cases this could happen though, like
embedded Linux systems or Raspberry Pis or the like that don't have great sources
of hardware entropy, which makes the initialization of the CSPRNG
take longer. This is particularly true on systems that don't
(currently) have an active network connection, since networking is one of the better
sources of randomness that the kernel can use to seed these values with.

[1] This is basically what caused the initial report, systemd-cron was a Python
    script and the SipHash for the dictionary hash randomization was calling
    os.urandom to seed itself. However, this particular thing isn't being asked
    to be made blocking (or an error). As far as I know, most everyone agrees
    that for SipHash's purpose it's reasonably fine to fall back to an insecure
    source of random if a secure source isn't available at the moment. What the
    security side wants is for people explicitly calling os.urandom (directly or
    indirectly) as part of the execution of their Python program to always get
    secure random if the platform we are on provides a reasonable interface to
    get access to it (e.g. /dev/random is not a reasonable interface, but
    getrandom() is).

--
Donald Stufft


From tim.peters at gmail.com  Wed Jun 22 23:33:58 2016
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 22 Jun 2016 22:33:58 -0500
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
Message-ID: <CAExdVN=QGn-TN448LwbvDTtHXtuGriLBtzV6Gi25x7XZ9dfFBQ@mail.gmail.com>

[Nick Coghlan]
> PEP: 522
> Title: Allow BlockingIOError in security sensitive APIs on Linux
> ...
> Other than for ``random.SystemRandom`` (which is a relatively thin
> wrapper around ``os.urandom``), the ``random`` module has never made
> any guarantees that the numbers it generates are suitable for use in
> security sensitive operations,

To the contrary, it explicitly says it "should not be used for
security purposes".


> so the use of the system random number generator to seed the default
> Mersenne Twister instance is mainly beneficial as a harm mitigation
> measure for code that is using the ``random`` module inappropriately.

Except that's largely accidental.  It so happens that using urandom()
left Python immune to the "poor seeding" attacks in the PHP paper
widely discussed when `secrets` was gestating, and it's entirely
accidental that Python 3 (but not Python 2) happens to implement
random.choice(), .randrange(), etc in such a way as to leave it
resistant even to the PHP paper's "deduce MT state from partial
outputs" attacks.  Even wholly naive "generate a password" snippets
using small alphabets with random.choice() are highly resistant to
state-deducing attacks in Python 3.

Those continue to be worth something.

But the _real_ reason MT uses urandom() is that MT has massive
internal state, and initialization wants the best chance it can get at
picking any of the 2**19937-1 possible initial states.  For example,
seeding with time.time() and/or pid can't possibly get at more than an
infinitesimal fraction of those.

This has nothing to do with "security" - it has to do with best
practice for simulations.  Seeding the Twister (any PRNG with massive
state) "fairly" is a puzzle, and seeding from urandom() was the best
that could be done.  Quite possible that, e.g., the system CSPRNG has
only 512 bits of state, but that's still far better than brewing
pseudo-nonsense out of a comparative handful of time.time() (etc)
bits.
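
The state-size argument can be put in numbers with a small sketch. The
624-word figure is the standard MT19937 state layout; the 96-bit bound for
time-plus-pid seeding is an assumed, deliberately generous upper limit, and
the explicit seeding below is an illustration rather than a description of
what CPython does internally.

```python
import os
import random

MT_WORDS = 624                        # 32-bit words in the MT19937 state
MT_STATE_BITS = MT_WORDS * 32 - 31    # 19937 bits of effective state

# Assumed upper bound: a 64-bit timestamp plus a 32-bit pid.
TIME_PID_BITS = 64 + 32

# Bits of state that such a seed can never select between.
unreachable_bits = MT_STATE_BITS - TIME_PID_BITS

# Seed material as wide as the state array, taken from the OS RNG.
seed_material = os.urandom(MT_WORDS * 4)
rng = random.Random(seed_material)
sample = rng.random()
```

Even with the generous bound, a time-and-pid seed can reach at most 2**96 of
the 2**19937-1 initial states, which is the "infinitesimal fraction" point.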


> Since a single call to ``os.urandom()`` is cheap once the system random
> number generator has been initialized it makes sense to retain that as the
> default behaviour, but there's no need to issue a warning when falling back to
> a potentially more predictable alternative when necessary (in such cases,
> a warning will typically already have been issued as part of interpreter
> startup, as the only way for the call when importing the random module to
> fail without the implicit call during interpreter startup also failing is for
> the latter to have been skipped by entirely disabling the hash randomization
> mechanism).

Since the set of people who start simulations very early in the boot
sequence is empty, I have no objection to any change here - so long as
MT initialization continues using the OS RNG when possible.

From barry at python.org  Thu Jun 23 08:42:38 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 23 Jun 2016 08:42:38 -0400
Subject: [Security-sig] Can /dev/urandom ever revert from the "good" to
 the "bad" state?
In-Reply-To: <54E16C16-F9A3-4D39-8A36-4E356075A3A2@stufft.io>
References: <CAP7+vJJzBUd=aAd7M_=4iF5EV3WLjOPRhC1OqhXnxO72Bj5zyw@mail.gmail.com>
 <99E3006C-31F3-466B-92DA-E1396D1E3995@stufft.io>
 <CAP7+vJJgj3HDPjuOSgJfzJY-8fT9jZxXU2iOoCpLp_6Y4r7ifg@mail.gmail.com>
 <54E16C16-F9A3-4D39-8A36-4E356075A3A2@stufft.io>
Message-ID: <20160623084238.636353f6.barry@wooz.org>

On Jun 22, 2016, at 10:37 PM, Donald Stufft wrote:

>so the question I think really comes down to whether os.urandom is something
>we want to provide the best source of (generally) non blocking CSPRNG or
>whether we want it to be a narrow wrapper around whatever semantics
>/dev/urandom specifically has.

... with os.getrandom() exposed on platforms that provide it.

Cheers,
-Barry

From barry at python.org  Thu Jun 23 08:45:01 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 23 Jun 2016 08:45:01 -0400
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <576B3B71.3000303@stoneleaf.us>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <576B3B71.3000303@stoneleaf.us>
Message-ID: <20160623084501.44da9f60.barry@wooz.org>

On Jun 22, 2016, at 06:29 PM, Ethan Furman wrote:

>We have discussed the reasoning behind those two camps ad nauseam on Python
>Dev, with fairly disastrous results.  I did not create this list so we could
>do it again.
>
>At this point we have two PEPs going.  Let's make sure that whichever PEP we
>take back to Py-Dev includes all the arguments and objections noted, and then
>let Guido or his delegate make the final call.

Yes, agreed.  Since this is a new list with a new proposed PEP, I want to be
sure that my view is accurately represented.  I won't continue to push it and
don't plan on responding unless my position isn't accurately represented.
Once Guido or his delegate makes the call, it's over.

Cheers,
-Barry

From barry at python.org  Thu Jun 23 08:48:43 2016
From: barry at python.org (Barry Warsaw)
Date: Thu, 23 Jun 2016 08:48:43 -0400
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
Message-ID: <20160623084843.5bbfe3bf@anarchist.wooz.org>

On Jun 22, 2016, at 06:31 PM, Nick Coghlan wrote:

>    try:
>        my_random = os.getrandom
>    except AttributeError:
>        my_random = os.urandom

Once Python 3.6 is widely available, and/or secrets is backported and
available on PyPI, why would you ever do that rather than just get the best
source of randomness out of the secrets module?

Cheers,
-Barry

From donald at stufft.io  Thu Jun 23 09:54:43 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 23 Jun 2016 09:54:43 -0400
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <20160623084843.5bbfe3bf@anarchist.wooz.org>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
Message-ID: <4282382E-473E-4C83-89D5-BF36B2D8F8D5@stufft.io>


> On Jun 23, 2016, at 8:48 AM, Barry Warsaw <barry at python.org> wrote:
> 
> On Jun 22, 2016, at 06:31 PM, Nick Coghlan wrote:
> 
>>   try:
>>       my_random = os.getrandom
>>   except AttributeError:
>>       my_random = os.urandom
> 
> Once Python 3.6 is widely available, and/or secrets is backported and
> available on PyPI, why would you ever do that rather than just get the best
> source of randomness out of the secrets module?

Because projects are likely going to be supporting things other than 3.6 for
a very long time. The "typical" support matrix for a project on PyPI currently
looks roughly like 2.6, 2.7, and 3.3+. We're seeing some projects dropping 2.6
finally on PyPI but it's still a major source of downloads and 2.7 itself is
still ~86% of downloads initiated by pip across all of PyPI. There is the idea
of a secrets module back port on PyPI, but without adding C code to that it's
going to basically just do the same thing as that try ... except and if the secrets
backport requires C I think you won't get a very large uptick since os.urandom
exists already and the issues are subtle enough that I don't think most people
are going to grok them immediately and will just automatically avoid a C
dependency where they don't immediately see the need for one.

Even if we pretend that 3.6+ only is something that's going to happen in anything
approaching a short timeline, we're still going to be fighting against the tide
for what the vast bulk of documentation out there states to do. So not only do we
need to wait it out for pre 3.6 to die out, but we also need to wait it out for
the copious amounts of third party documentation out there telling people to just
use os.urandom to die out.

And even in the future, once we get to a 3.6+ only world, os.urandom and the
try .. except shim will still "work" for all anyone can tell (since the failure
mode on os.urandom itself is practically silent in every way imaginable) so unless
they already know about this issue and go out of their way to switch over to the
secrets module, they're likely to continue using something in the os module for
a long time.

IOW, I think secrets is great, but I think it mostly helps new code written
targeting 3.6+ only, rather than being a solution for the vast bulk of software
already out there or which doesn't yet exist but is going to support older things
than 3.6.
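
A minimal sketch of what such a pure-Python shim looks like, assuming only
that the secrets module is importable on 3.6+. The fallback function below is
illustrative, not an existing backport; on older versions it ends up calling
os.urandom() directly, which is the point made above:

```python
import os

try:
    # Python 3.6+: the stdlib picks the best available source.
    from secrets import token_bytes
except ImportError:
    # Older Pythons: a pure-Python fallback can only defer to os.urandom(),
    # so it inherits whatever behaviour os.urandom() has on that platform.
    def token_bytes(nbytes=32):
        return os.urandom(nbytes)


key = token_bytes(16)
```

Callers only ever see token_bytes(), so the shim hides the version check but
cannot add any guarantee that os.urandom() itself does not provide.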


--
Donald Stufft


From guido at python.org  Thu Jun 23 11:27:07 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 23 Jun 2016 08:27:07 -0700
Subject: [Security-sig] Can /dev/urandom ever revert from the "good" to
 the "bad" state?
In-Reply-To: <20160623084238.636353f6.barry@wooz.org>
References: <CAP7+vJJzBUd=aAd7M_=4iF5EV3WLjOPRhC1OqhXnxO72Bj5zyw@mail.gmail.com>
 <99E3006C-31F3-466B-92DA-E1396D1E3995@stufft.io>
 <CAP7+vJJgj3HDPjuOSgJfzJY-8fT9jZxXU2iOoCpLp_6Y4r7ifg@mail.gmail.com>
 <54E16C16-F9A3-4D39-8A36-4E356075A3A2@stufft.io>
 <20160623084238.636353f6.barry@wooz.org>
Message-ID: <CAP7+vJ+stNNH87pyQxzMKZwez-J9vxXmBmQx=iCt8-m-c5-KnQ@mail.gmail.com>

On Thu, Jun 23, 2016 at 5:42 AM, Barry Warsaw <barry at python.org> wrote:

> On Jun 22, 2016, at 10:37 PM, Donald Stufft wrote:
>
> >so the question I think really comes down to whether os.urandom is
> something
> >we want to provide the best source of (generally) non blocking CSPRNG or
> >whether we want it to be a narrow wrapper around whatever semantics
> >/dev/urandom specifically has.
>
> ... with os.getrandom() exposed on platforms that provide it.
>

Personally I think it's better to have one API than two, even if it is
named after a platform-specific API.

FWIW I don't really buy the philosophy that the os module should only
provide thin wrappers over what the platform offers. E.g. in the case of
Windows most of what's in the os module is part of Microsoft's libc
emulation, and the platform APIs have a totally different shape.
os.urandom()'s past is already another example. So I don't see a reason to
offer two different APIs and force users of those APIs to either commit to
a platform or use an ugly try/except. Especially since in Python <= 3.5
they'll only have os.urandom().

-- 
--Guido van Rossum (python.org/~guido)

From donald at stufft.io  Thu Jun 23 11:41:08 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 23 Jun 2016 11:41:08 -0400
Subject: [Security-sig] Can /dev/urandom ever revert from the "good" to
 the "bad" state?
In-Reply-To: <CAP7+vJ+stNNH87pyQxzMKZwez-J9vxXmBmQx=iCt8-m-c5-KnQ@mail.gmail.com>
References: <CAP7+vJJzBUd=aAd7M_=4iF5EV3WLjOPRhC1OqhXnxO72Bj5zyw@mail.gmail.com>
 <99E3006C-31F3-466B-92DA-E1396D1E3995@stufft.io>
 <CAP7+vJJgj3HDPjuOSgJfzJY-8fT9jZxXU2iOoCpLp_6Y4r7ifg@mail.gmail.com>
 <54E16C16-F9A3-4D39-8A36-4E356075A3A2@stufft.io>
 <20160623084238.636353f6.barry@wooz.org>
 <CAP7+vJ+stNNH87pyQxzMKZwez-J9vxXmBmQx=iCt8-m-c5-KnQ@mail.gmail.com>
Message-ID: <33B064DF-1270-4EEB-849D-D4B3F49ED9F9@stufft.io>


> On Jun 23, 2016, at 11:27 AM, Guido van Rossum <guido at python.org> wrote:
> 
> On Thu, Jun 23, 2016 at 5:42 AM, Barry Warsaw <barry at python.org> wrote:
> On Jun 22, 2016, at 10:37 PM, Donald Stufft wrote:
> 
> >so the question I think really comes down to whether os.urandom is something
> >we want to provide the best source of (generally) non blocking CSPRNG or
> >whether we want it to be a narrow wrapper around whatever semantics
> >/dev/urandom specifically has.
> 
> ... with os.getrandom() exposed on platforms that provide it.
> 
> Personally I think it's better to have one API than two, even if it is named after a platform-specific API.
> 
> FWIW I don't really buy the philosophy that the os module should only provide thin wrappers over what the platform offers. E.g. in the case of Windows most of what's in the os module is part of Microsoft's libc emulation, and the platform APIs have a totally different shape. os.urandom()'s past is already another example. So I don't see a reason to offer two different APIs and force users of those APIs to either commit to a platform or use an ugly try/except. Especially since in Python <= 3.5 they'll only have os.urandom().
> 

For what it's worth, I agree with this sentiment, though I think calling getrandom() and either blocking or erroring is still a pretty thin wrapper over what the OS provides: it's just using a different interface to the same underlying functionality with only two real differences, (1) lack of a file descriptor and (2) inability to get insecure values out of the API, both of which I think are good things. As far as I know, nobody has argued that os.urandom should *not* use getrandom(); they just want it to fall back to the same behavior as /dev/urandom in the (2) case, which is actually a thicker wrapper around what the OS provides than just using getrandom(), since that fallback logic needs to be added ;)

--
Donald Stufft



From ncoghlan at gmail.com  Thu Jun 23 13:31:17 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 23 Jun 2016 10:31:17 -0700
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <4282382E-473E-4C83-89D5-BF36B2D8F8D5@stufft.io>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
 <4282382E-473E-4C83-89D5-BF36B2D8F8D5@stufft.io>
Message-ID: <CADiSq7e38cn7pvYNgsH=VxHAQpznHrn5ATOx4sDWvQwhArO1gQ@mail.gmail.com>

On 23 June 2016 at 06:54, Donald Stufft <donald at stufft.io> wrote:
>
>> On Jun 23, 2016, at 8:48 AM, Barry Warsaw <barry at python.org> wrote:
>>
>> On Jun 22, 2016, at 06:31 PM, Nick Coghlan wrote:
>>
>>>   try:
>>>       my_random = os.getrandom
>>>   except AttributeError:
>>>       my_random = os.urandom
>>
>> Once Python 3.6 is widely available, and/or secrets is backported and
>> available on PyPI, why would you ever do that rather than just get the best
>> source of randomness out of the secrets module?
>
> Because projects are likely going to be supporting things other than 3.6 for
> a very long time. The "typical" support matrix for a project on PyPI currently
> looks roughly like 2.6, 2.7, and 3.3+. We're seeing some projects dropping 2.6
> finally on PyPI but it's still a major source of downloads and 2.7 itself is
> still ~86% of downloads initiated by pip across all of PyPI.

Right, the missing qualifier on my statement is that one of the key
aspects I'm specifically interested in is the guidance we give to folks
writing single source compatible Python 2/3 code who *also* want to
use the best available initialization option given the vagaries of
build platform, deployment platform, and the precise versions of
those.

Reasonable developer experience:

* just keep using os.urandom(), Python will transparently upgrade your
code to the best non-blocking-in-practice system interface the OS has
to offer
* if os.urandom() throws BlockingIOError, you may need to add
application startup code to wait until the system random number
generator is ready

Dubious developer experience:

* if os.getrandom() is available use that, otherwise use os.urandom()

Dubious developer experience:

* if the secrets module is available use that, otherwise use os.urandom()

Dubious developer experience:

* add a dependency on a third party library which implements one of
the above dubious options

For folks that don't need to worry about compatibility with old
versions, the guidance will be "just use the secrets module"
regardless of what we do with os.urandom(), and that's fine.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Thu Jun 23 13:54:53 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 23 Jun 2016 10:54:53 -0700
Subject: [Security-sig] Can /dev/urandom ever revert from the "good" to
 the "bad" state?
In-Reply-To: <CAP7+vJJgj3HDPjuOSgJfzJY-8fT9jZxXU2iOoCpLp_6Y4r7ifg@mail.gmail.com>
References: <CAP7+vJJzBUd=aAd7M_=4iF5EV3WLjOPRhC1OqhXnxO72Bj5zyw@mail.gmail.com>
 <99E3006C-31F3-466B-92DA-E1396D1E3995@stufft.io>
 <CAP7+vJJgj3HDPjuOSgJfzJY-8fT9jZxXU2iOoCpLp_6Y4r7ifg@mail.gmail.com>
Message-ID: <CADiSq7cB5EKzYEwdnNtfL1LDkXY9K1Ajo7nx_fgieQXoWjxiOA@mail.gmail.com>

On 22 June 2016 at 19:29, Guido van Rossum <guido at python.org> wrote:
> On Wed, Jun 22, 2016 at 7:18 PM, Donald Stufft <donald at stufft.io> wrote:
>> Once, only after boot. On most (all?) modern Linux systems there's even
>> part of the boot process that attempts to seed the CSPRNG using random
>> values stored during a previous boot to shorten the time window between when
>> it's ready and when it's not yet initialized. However, once it is
>> initialized it will never block (or EAGAIN) again.
>
> Then shouldn't it be the responsibility of the boot sequence rather than of
> the Python stdlib to wait for that event? IIUC that's what OS X does (I
> think someone described that it even kernel-panics when it can't enter the
> "good" state).

I spent some time browsing the (mostly-but-not-all public) results of
https://bugzilla.redhat.com/buglist.cgi?quicksearch=getrandom today,
and unfortunately that backed up the results of Ted Ts'o's "what if
/dev/urandom blocked on Linux startup?" experiments [1]. That is,
Linux has the same problem at the distro level that we do at the
language runtime level: the historically permissive behaviour means
that Linux has existing use cases where it's legitimate to start the
init process without waiting for the kernel CSPRNG to be seeded, so
distros can't currently unilaterally prevent the entire OS from
starting just because that subsystem isn't ready yet.

We have a significant advantage that the kernel and distro devs don't
enjoy though, which is a *much* nicer mechanism for runtime error
reporting (in the form of exceptions and tracebacks) - by taking
advantage of that, I believe we can significantly improve the default
behaviour, while also writing a fairly straightforward "if you get
this exception when running on Python 3.6, assess your application's
needs, then apply one of these remedies" note for the Python 3.6
porting guide.

Regards,
Nick.

[1] https://mail.python.org/pipermail/python-dev/2016-June/145146.html

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Thu Jun 23 14:10:33 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 23 Jun 2016 11:10:33 -0700
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <CADiSq7e38cn7pvYNgsH=VxHAQpznHrn5ATOx4sDWvQwhArO1gQ@mail.gmail.com>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
 <4282382E-473E-4C83-89D5-BF36B2D8F8D5@stufft.io>
 <CADiSq7e38cn7pvYNgsH=VxHAQpznHrn5ATOx4sDWvQwhArO1gQ@mail.gmail.com>
Message-ID: <CADiSq7fpqy-vJJv7VKWut2c-vybgNg3zWK0pNA_rXu7XW9v46w@mail.gmail.com>

On 23 June 2016 at 10:31, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Reasonable developer experience:
>
> * just keep using os.urandom(), Python will transparently upgrade your
> code to the best non-blocking-in-practice system interface the OS has
> to offer
> * if os.urandom() throws BlockingIOError, you may need to add
> application startup code to wait until the system random number
> generator is ready

Thinking about this some more, I realised applications can implement
the "Wait for system RNG" behaviour even without os.getrandom:

    import os

    # Busy loop, given PEP 522's BlockingIOError
    def wait_for_system_rng():
        while True:
            try:
                os.urandom(1)
                break
            except BlockingIOError:
                continue

    # An actual use case for reading /dev/random!
    def wait_for_system_rng():
        try:
            block_on_system_rng = open("/dev/random", "rb")
        except FileNotFoundError:
            return
        with block_on_system_rng:
            block_on_system_rng.read(1)

That second one has the added bonus of doing the right thing even on
older Linux kernels that don't provide the new getrandom() syscall,
creating the following virtuous feedback loop:

1. Start running an existing application/script on Python 3.6 and a
Linux kernel with getrandom()
2. Start getting "BlockingIOError: system random number generator not ready"
3. Add the /dev/random snippet to wait for the system RNG
4. Your code now does the right thing even on older Pythons and Linux versions

Given that realisation, I'm back to thinking "We don't need it" when
it comes to exposing os.getrandom() directly.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From donald at stufft.io  Thu Jun 23 14:13:05 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 23 Jun 2016 14:13:05 -0400
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <CADiSq7fpqy-vJJv7VKWut2c-vybgNg3zWK0pNA_rXu7XW9v46w@mail.gmail.com>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
 <4282382E-473E-4C83-89D5-BF36B2D8F8D5@stufft.io>
 <CADiSq7e38cn7pvYNgsH=VxHAQpznHrn5ATOx4sDWvQwhArO1gQ@mail.gmail.com>
 <CADiSq7fpqy-vJJv7VKWut2c-vybgNg3zWK0pNA_rXu7XW9v46w@mail.gmail.com>
Message-ID: <FD538521-BF4C-43D6-8475-B9587957D0B3@stufft.io>


> On Jun 23, 2016, at 2:10 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> That second one has the added bonus of doing the right thing even on
> older Linux kernels that don't provide the new getrandom() syscall,
> creating the following virtuous feedback loop:


The second one also is not a good idea to use in the general case since it will also block randomly throughout the application. It's OK to use if you know you're only going to access it once on boot, but you wouldn't want it to be a common idiom that software itself does. If I recall, there was major downtime on healthcare.gov because they used /dev/random in production.

--
Donald Stufft


From ncoghlan at gmail.com  Thu Jun 23 14:38:27 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 23 Jun 2016 11:38:27 -0700
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <FD538521-BF4C-43D6-8475-B9587957D0B3@stufft.io>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
 <4282382E-473E-4C83-89D5-BF36B2D8F8D5@stufft.io>
 <CADiSq7e38cn7pvYNgsH=VxHAQpznHrn5ATOx4sDWvQwhArO1gQ@mail.gmail.com>
 <CADiSq7fpqy-vJJv7VKWut2c-vybgNg3zWK0pNA_rXu7XW9v46w@mail.gmail.com>
 <FD538521-BF4C-43D6-8475-B9587957D0B3@stufft.io>
Message-ID: <CADiSq7fgoW9Ycp5rX6wSSDra=-FuhgUkR_sQ1AA8v7gB1uZ2gQ@mail.gmail.com>

On 23 June 2016 at 11:13, Donald Stufft <donald at stufft.io> wrote:
>
>> On Jun 23, 2016, at 2:10 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> That second one has the added bonus of doing the right thing even on
>> older Linux kernels that don't provide the new getrandom() syscall,
>> creating the following virtuous feedback loop:
>
>
> The second one also is not a good idea to use in the general case since it will also block randomly throughout the application. It's OK to use if you know you're only going to access it once on boot, but you wouldn't want it to be a common idiom that software itself does. If I recall, there was major downtime on healthcare.gov because they used /dev/random in production.

Right, the idiom I'd be recommending in PEP 522 is a "Do this once in
__main__ to categorically prevent BlockingIOError from os.urandom,
random.SystemRandom and the secrets module" application level
approach, while the guidance for libraries would be to just keep using
os.urandom() and let affected application developers worry about
whether to catch the BlockingIOError at point of use, or block the
application at startup to wait for the system RNG.

Although now I'm wondering whether it might be worth proposing a
"secrets.wait_for_system_rng()" API as part of PEP 522, with the
following implementation:

    def wait_for_system_rng():
        # Avoid the below busy loop if possible
        try:
            block_on_system_rng = open("/dev/random", "rb")
        except FileNotFoundError:
            pass
        else:
            with block_on_system_rng:
                block_on_system_rng.read(1)
        # Busy loop until the system RNG is ready
        while True:
            try:
                os.urandom(1)
                break
            except BlockingIOError:
                pass

Since this is an "at most once at application startup" kind of
problem, I like the way that having a separate function for waiting
helps to divide responsibilities between library API developers
("complain if you need the system RNG and it isn't ready") and
application developers ("ensure the system RNG is ready before calling
APIs that need it").
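
A rough sketch of that division of responsibilities, assuming the PEP 522
behaviour where os.urandom() raises BlockingIOError while the system RNG is
not ready (no released Python does this today). The function names and the
poll interval are illustrative, and this variant sleeps between polls
instead of busy looping:

```python
import os
import time


def wait_for_system_rng(poll_interval=0.1):
    """Application-level helper: return once os.urandom() is usable."""
    while True:
        try:
            os.urandom(1)
            return
        except BlockingIOError:
            # Only reachable under the PEP 522 proposal.
            time.sleep(poll_interval)


def make_session_token():
    """Library-level code: just call os.urandom() and let errors propagate."""
    return os.urandom(16).hex()


if __name__ == "__main__":
    # Done once, at application startup, before any library needs the RNG.
    wait_for_system_rng()
    print(make_session_token())
```

The library function never waits or retries; only the application entry
point decides whether blocking at startup is acceptable.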

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at gmail.com  Thu Jun 23 17:27:54 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 23 Jun 2016 23:27:54 +0200
Subject: [Security-sig] PEP: Make os.urandom() blocking on Linux (version 2)
Message-ID: <CAMpsgwbHRqFD3M13_EKncQA-LJporzXZAAXGUPQQsOF4=PvsMg@mail.gmail.com>

Hi,

I completed my PEP; here is the second version. Changes:

* I added new sections:

   - The bug
   - Use Cases
   - Fix system urandom
   - Denial-of-service when reading random

* I added alternatives:

  - Leave os.urandom() unchanged, add os.getrandom()
  - Raise BlockingIOError in os.urandom()
  - Add an optional block parameter to os.urandom()

I added 3 sections to try to describe the context of "the bug". For
example, I think that it's important to mention that all operating
systems load entropy from disk at boot.

For me, the last tricky question is use case 2 (run a web server)
on a VM or embedded device when system urandom is not initialized yet
and there is no entropy on disk yet (e.g. first boot, or maybe second
boot, of a VM).

From a quick read, a VM connected to a network should be able to
initialize the system urandom quickly. So I'm not sure that use
case 2 (web server) is really an issue in practice.

Victor


HTML version:
https://haypo-notes.readthedocs.io/pep_random.html


++++++++++++++++++++++++++++++++++++++++
PEP: Make os.urandom() blocking on Linux
++++++++++++++++++++++++++++++++++++++++

Headers::

    PEP: xxx
    Title: Make os.urandom() blocking on Linux
    Version: $Revision$
    Last-Modified: $Date$
    Author: Victor Stinner <victor.stinner at gmail.com>
    Status: Draft
    Type: Standards Track
    Content-Type: text/x-rst
    Created: 20-June-2016
    Python-Version: 3.6


Abstract
========

Modify ``os.urandom()`` to block on Linux 3.17 and newer until the OS
urandom is initialized.


The bug
=======

Python 3.5.0 was enhanced to use the new ``getrandom()`` syscall
introduced in Linux 3.17 and Solaris 11.3. The problem is that users
started to complain that Python 3.5 blocks at startup on Linux in
virtual machines and embedded devices: see issues `#25420
<http://bugs.python.org/issue25420>`_ and `#26839
<http://bugs.python.org/issue26839>`_.

On Linux, ``getrandom(0)`` blocks until the kernel has initialized
urandom with 128 bits of entropy. Issue #25420 describes a Linux build
platform blocking at ``import random``. Issue #26839 describes a short
Python script from systemd-cron, used to compute an MD5 hash and called
very early in the init process. The system initialization blocks on this
script, which in turn blocks on ``getrandom(0)`` while initializing
Python.

Python initialization requires random bytes to implement a
countermeasure against hash denial-of-service (hash DoS) attacks, see:

* `Issue #13703: Hash collision security issue
  <http://bugs.python.org/issue13703>`_
* `PEP 456: Secure and interchangeable hash algorithm
  <https://www.python.org/dev/peps/pep-0456/>`_

Importing the ``random`` module creates an instance of
``random.Random``: ``random._inst``. On Python 3.5, the
``random.Random`` constructor reads 2500 bytes from ``os.urandom()`` to
seed a Mersenne Twister RNG (random number generator).
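
As an illustration only (a sketch of the idea, not the CPython source), the
seeding described above amounts to roughly the following. The 2500 bytes are
slightly more than the 624 32-bit words (2496 bytes) of Mersenne Twister
state:

```python
import os
import random

# Read 2500 bytes from the system RNG and turn them into one big integer.
seed = int.from_bytes(os.urandom(2500), "big")

# Seed a Mersenne Twister instance with it, as ``import random`` does
# implicitly for the module-level ``random._inst``.
rng = random.Random(seed)
value = rng.random()
```

It is this ``os.urandom()`` call, made at import time, that could block
early in the boot sequence.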

Other platforms may be affected by this bug, but in practice, only Linux
systems use Python scripts to initialize the system.


Use Cases
=========

The following use cases help to choose the right compromise between
security and practicability.


Use Case 1: init script
-----------------------

Use a Python 3 script to initialize the system, like systemd-cron. If
the script blocks, the system initialization is stuck too.

The issue #26839 is a good example of this use case.


Use Case 2: web server
----------------------

Run a Python 3 web server serving web pages using HTTP and HTTPS
protocols. The server is started as soon as possible.

The first targets of the hash DoS attack were web servers: it's
important that the hash secret cannot be easily guessed by an attacker.

If serving a web page needs a secret to create a cookie, create an
encryption key, ..., the secret must be created with good entropy:
again, it must be hard to guess the secret.

A web server requires security. If a choice must be made between
security and running the server with weak entropy, security is more
important. If there is no good entropy, the server must block or fail
with an error.

The question is whether it makes sense to start a web server on a host
before system urandom is initialized.

Issues #25420 and #26839 are restricted to Python startup; they are not
about generating a secret before the system urandom is initialized.


Fix system urandom
==================

Load entropy from disk at boot
-------------------------------

Collecting entropy can take several minutes. To accelerate the system
initialization, operating systems store entropy on disk at shutdown, and
then reload entropy from disk at boot.

If a system collects enough entropy at least once, the system urandom
will be initialized quickly, as soon as the entropy is reloaded from
disk.


Virtual machines
----------------

Virtual machines don't have direct access to the hardware and so have
fewer sources of entropy than bare metal. A solution is to add a
`virtio-rng device
<https://fedoraproject.org/wiki/Features/Virtio_RNG>`_ to pass entropy
from the host to the virtual machine.


Embedded devices
----------------

A solution for embedded devices is to plug in a hardware RNG.

For example, the Raspberry Pi has a hardware RNG but it's not used by
default. See: `Hardware RNG on Raspberry Pi
<http://fios.sector16.net/hardware-rng-on-raspberry-pi/>`_.


Denial-of-service when reading random
=====================================

The ``/dev/random`` device should only be used for very specific use
cases. Reading from ``/dev/random`` on Linux is likely to block. Users
don't like it when an application blocks longer than 5 seconds to
generate a secret. It is only expected for specific cases like
explicitly generating an encryption key.

When the system has no available entropy, choosing between blocking
until entropy is available or falling back on lower quality entropy is a
matter of compromise between security and practicability. The choice
depends on the use case.

On Linux, ``/dev/urandom`` is secure, it should be used instead of
``/dev/random``:

* `Myths about /dev/urandom <http://www.2uo.de/myths-about-urandom/>`_
  by Thomas Hühn: "Fact: /dev/urandom is the preferred source of
  cryptographic randomness on UNIX-like systems"


Rationale
=========

On Linux, reading ``/dev/urandom`` can return "weak" entropy before
urandom is fully initialized, that is, before the kernel has collected
128 bits of entropy. Linux 3.17 adds a new ``getrandom()`` syscall which
makes it possible to block until urandom is initialized.

On Python 3.5.2, os.urandom() uses ``getrandom(GRND_NONBLOCK)``, but
falls back on reading the non-blocking ``/dev/urandom`` if
``getrandom(GRND_NONBLOCK)`` fails with ``EAGAIN``.
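
The fallback strategy described above can be sketched in Python as follows.
This is an illustration, not the C implementation: it assumes a Unix-like
system with ``/dev/urandom``, relies on ``os.getrandom()`` (only available
from Python 3.6 on Linux), and ``urandom_nonblocking`` is a hypothetical
name:

```python
import os


def urandom_nonblocking(n):
    """Never block; may return weak bytes very early in the boot sequence."""
    if hasattr(os, "getrandom"):
        try:
            # Fails with EAGAIN (BlockingIOError) while the kernel pool is
            # not initialized. For small requests the result is complete.
            return os.getrandom(n, os.GRND_NONBLOCK)
        except OSError:
            # EAGAIN, or ENOSYS on kernels without the syscall: fall back.
            pass
    with open("/dev/urandom", "rb") as urandom:
        return urandom.read(n)


data = urandom_nonblocking(16)
```

The fallback branch is exactly where weak entropy can leak out, since
``/dev/urandom`` on Linux never blocks.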

Security experts promote ``os.urandom()`` to generate cryptographic
keys. By the way, ``os.urandom()`` is preferred over
``ssl.RAND_bytes()`` for various reasons.

This PEP proposes to modify os.urandom() to use ``getrandom()`` in
blocking mode so that it does not return weak entropy, while also
ensuring that Python will not block at startup.


Changes
=======

All changes described in this section are specific to the Linux
platform.

* Initialize hash secret from non-blocking system urandom
* Initialize ``random._inst`` with non-blocking system urandom
* Modify os.urandom() to block (until system urandom is initialized)

A new ``_PyOS_URandom_Nonblocking()`` private function is added: it
tries to call ``getrandom(GRND_NONBLOCK)``, but falls back on reading
``/dev/urandom`` if the call fails with ``EAGAIN``.

``_PyRandom_Init()`` is modified to call
``_PyOS_URandom_Nonblocking()``.  Moreover, a new ``random_inst_seed``
field is added to the ``_Py_HashSecret_t`` structure.

``random._inst`` (an instance of ``random.Random``) is initialized with
the new ``random_inst_seed`` secret. A ("fuse") flag is used to ensure
that this secret is only used once.

If a second instance of random.Random is created, blocking
``os.urandom()`` is used.

``os.urandom()`` (C function ``_PyOS_URandom()``) is modified to always
call ``getrandom(0)`` (blocking mode).
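
The "fuse" logic can be sketched in pure Python as follows. This is only a
model of the proposed behaviour under stated assumptions: the real change
would live in C, and ``StartupSeed`` and ``new_random`` are hypothetical
names used for illustration:

```python
import os
import random


class StartupSeed:
    """One-shot holder for a seed prepared at interpreter startup."""

    def __init__(self, seed_bytes):
        self._seed = seed_bytes

    def take(self):
        # Return the seed the first time, None afterwards (the fuse blows).
        seed, self._seed = self._seed, None
        return seed


def new_random(startup_seed):
    raw = startup_seed.take()
    if raw is None:
        # Second and later instances: use the (blocking) os.urandom().
        raw = os.urandom(2500)
    return random.Random(int.from_bytes(raw, "big"))
```

Only the first instance, which corresponds to ``random._inst``, consumes
the startup seed; every later instance goes through ``os.urandom()``.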


Alternatives
============

Never use blocking urandom in the random module
-----------------------------------------------

The random module can use ``random_inst_seed`` as a seed, but add other
sources of entropy like the process identifier (``os.getpid()``), the
current time (``time.time()``), memory addresses, etc.
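
A minimal sketch of such a mix, assuming a hash function is used to combine
the sources (the PEP does not say how they would be combined, and
``mixed_seed`` is a hypothetical name). None of the extra inputs is secret,
so this is not suitable for security purposes:

```python
import hashlib
import os
import random
import time


def mixed_seed(base_seed):
    """Combine a startup seed with low-quality, guessable extra inputs."""
    h = hashlib.sha512()
    h.update(base_seed)
    h.update(str(os.getpid()).encode())   # process identifier
    h.update(repr(time.time()).encode())  # current time
    h.update(str(id(h)).encode())         # a memory address on CPython
    return int.from_bytes(h.digest(), "big")


rng = random.Random(mixed_seed(b"startup seed bytes"))
```

The result is at most 512 bits, far less than the 2500 bytes read from
``os.urandom()`` in the regular case.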

Reading 2500 bytes from os.urandom() to initialize the Mersenne Twister
RNG in random.Random is a deliberate choice to get access to the full
range of the RNG. This PEP is a compromise between "security" and
"feature". Python should not block at startup before the OS has
collected enough entropy. But in the regular use case (system urandom
initialized), the random module should continue to use its current code
to initialize the seed.

Python 3.5.0 was blocked on ``import random``, not on building a second
instance of ``random.Random``.


Leave os.urandom() unchanged, add os.getrandom()
------------------------------------------------

os.urandom() remains unchanged: never block, but it can return weak
entropy if system urandom is not initialized yet.

A new ``os.getrandom()`` function is added: thin wrapper to the
``getrandom()`` syscall.

Expected usage to write portable code::

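    import os

    def my_random(n):
        # Prefer the blocking getrandom() syscall where it is exposed,
        # otherwise use the platform's default system RNG.
        if hasattr(os, 'getrandom'):
            return os.getrandom(n, 0)
        return os.urandom(n)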

The problem with this change is that it expects users to understand
security well and to know each platform well. Python has a tradition of
hiding "implementation details". For example, ``os.urandom()`` is not a
thin wrapper to the ``/dev/urandom`` device: it uses
``CryptGenRandom()`` on Windows, it uses ``getentropy()`` on OpenBSD, it
tries ``getrandom()`` on Linux and Solaris or falls back on reading
``/dev/urandom``. Python already uses the best available system RNG
depending on the platform.

This PEP does not change the API, which has not changed since the
creation of Python:

* ``os.urandom()``, ``random.SystemRandom`` and ``secrets`` for security
* ``random`` module (except ``random.SystemRandom``) for all other usages


Raise BlockingIOError in os.urandom()
-------------------------------------

This idea was proposed as a compromise to let developers decide
themselves how to handle the case:

* catch the exception and use another, weaker entropy source: read
  ``/dev/urandom`` on Linux, the Python ``random`` module (which is not
  secure at all), time, process identifier, etc.
* don't catch the error, so the whole program fails with this fatal
  exception

First of all, no user has complained yet that ``os.urandom()`` blocks.
This point is currently theoretical. The Python issues #25420 and #26839 were
restricted to the Python startup: users complained that Python was
blocked at startup.

Even if reading /dev/urandom blocks on OpenBSD, FreeBSD, Mac OS X, etc.
until urandom is initialized, no user has complained yet because Python
is not used in the process initializing the system and /dev/urandom is
quickly initialized.  It looks like only Linux users hit the problem on
virtual machines or embedded devices, and only in some short Python
scripts used to initialize the system. Again, ``os.urandom()`` is
not used in such scripts (at least, not yet).

As with `Leave os.urandom() unchanged, add os.getrandom()`_, the problem
is that it makes the API more complex and so more error-prone.


Add an optional block parameter to os.urandom()
-----------------------------------------------

Add an optional block parameter to os.urandom(). The default value may
be ``True`` (block by default) or ``False`` (non-blocking).

The first technical issue is to implement ``os.urandom(block=False)`` on
all platforms. Only Linux 3.17 and newer has a well-defined non-blocking
API.
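
To make the idea concrete, here is a rough Python sketch of what such a
parameter could look like on Linux. It is not a proposed implementation:
the function name ``urandom_with_block`` is hypothetical, the sketch
assumes that ``os.getrandom()`` and ``os.GRND_NONBLOCK`` are available,
and it silently ignores ``block`` on platforms without them, which is
exactly the portability problem described above.

```python
import os


def urandom_with_block(n, block=True):
    """Hypothetical os.urandom(n, block=...) built on os.getrandom()."""
    if not hasattr(os, "getrandom"):
        # No known non-blocking API here: the flag cannot be honoured.
        return os.urandom(n)
    flags = 0 if block else os.GRND_NONBLOCK
    data = b""
    while len(data) < n:
        # With block=False this raises BlockingIOError while the
        # system urandom is not initialized yet.
        data += os.getrandom(n - len(data), flags)
    return data
```

A caller passing ``block=False`` would have to handle ``BlockingIOError``
itself, so the complexity moves into every application.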

See the `issue #27250: Add os.urandom_block()
<http://bugs.python.org/issue27250>`_.

As with `Raise BlockingIOError in os.urandom()`_, it doesn't seem worth
it to make the API more complex for a theoretical (or at least very rare)
use case.

As with `Leave os.urandom() unchanged, add os.getrandom()`_, the problem
is that it makes the API more complex and so more error-prone.


Annexes
=======

Operating system random functions
---------------------------------

``os.urandom()`` uses the following functions:

* OpenBSD: `getentropy()
  <http://man.openbsd.org/OpenBSD-current/man2/getentropy.2>`_
  (OpenBSD 5.6)
* Linux: `getrandom()
  <http://man7.org/linux/man-pages/man2/getrandom.2.html>`_ (Linux 3.17)
  -- see also `A system call for random numbers: getrandom()
  <https://lwn.net/Articles/606141/>`_
* Solaris: `getentropy()
  <https://docs.oracle.com/cd/E53394_01/html/E54765/getentropy-2.html#scrolltoc>`_,
  `getrandom()
  <https://docs.oracle.com/cd/E53394_01/html/E54765/getrandom-2.html>`_
  (both need Solaris 11.3)
* Windows: `CryptGenRandom()
  <https://msdn.microsoft.com/en-us/library/windows/desktop/aa379942%28v=vs.85%29.aspx>`_
  (Windows XP)
* UNIX, BSD: /dev/urandom, /dev/random
* OpenBSD: /dev/srandom

On Linux, commands to get the status of ``/dev/random`` (results are
numbers of bits)::

    $ cat /proc/sys/kernel/random/entropy_avail
    2850
    $ cat /proc/sys/kernel/random/poolsize
    4096
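
The same two files can be read from Python. The helper below is only a
convenience sketch: the name ``read_entropy_status`` is made up for this
example, and it returns ``None`` for a value when the file is missing
(for example on a non-Linux system).

```python
from pathlib import Path

# Directory exposing the kernel RNG status on Linux.
RANDOM_DIR = Path("/proc/sys/kernel/random")


def read_entropy_status(base=RANDOM_DIR):
    """Return entropy_avail and poolsize as integers (None if unreadable)."""
    status = {}
    for name in ("entropy_avail", "poolsize"):
        try:
            status[name] = int((base / name).read_text().strip())
        except (OSError, ValueError):
            # File missing, unreadable, or not a number.
            status[name] = None
    return status
```

For example, ``read_entropy_status()["entropy_avail"]`` gives the same
number as the first ``cat`` command above.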

Why use os.urandom()?
---------------------

Since ``os.urandom()`` is implemented in the kernel, it doesn't have
some of the issues of user-space RNGs. For example, it is much harder to
get its state. It is usually built on a CSPRNG, so even if its state is
obtained, it is hard to compute previously generated numbers. The kernel
has good knowledge of entropy sources and regularly feeds the entropy
pool.


Links
=====

* `Cryptographically secure pseudo-random number generator (CSPRNG)
  <https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator>`_


Copyright
=========

This document has been placed in the public domain.

From victor.stinner at gmail.com  Thu Jun 23 17:37:56 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 23 Jun 2016 23:37:56 +0200
Subject: [Security-sig] PEP: Make os.urandom() blocking on Linux
 (version 2)
In-Reply-To: <CAMpsgwbHRqFD3M13_EKncQA-LJporzXZAAXGUPQQsOF4=PvsMg@mail.gmail.com>
References: <CAMpsgwbHRqFD3M13_EKncQA-LJporzXZAAXGUPQQsOF4=PvsMg@mail.gmail.com>
Message-ID: <CAMpsgwZnZGH9-ufbeOitqBfskvyVBfgLdBC8_sWhsmaObcLiBw@mail.gmail.com>

2016-06-23 23:27 GMT+02:00 Victor Stinner <victor.stinner at gmail.com>:
> Use Case 1: init script
> -----------------------
>
> Use a Python 3 script to initialize the system, like systemd-cron. If
> the script blocks, the system initialize is stuck too.
>
> The issue #26839 is a good example of this use case.

For me, such a script must not require a secure secret.

An application which needs to generate a secure secret must run
later, when the system is fully initialized.

What do you think?


> Use Case 2: web server
> ----------------------
>
> Run a Python 3 web server serving web pages using HTTP and HTTPS
> protocols. The server is started as soon as possible.
>
> The first target of the hash DoS attack was web server: it's important
> that the hash secret cannot be easily guessed by an attacker.

Maybe I should elaborate on this point to explain that the specific case
of the hash secret is more on the practicality side than on the security
side.

*IMO* reading the non-blocking /dev/urandom is enough for the hash
secret. From what I read, even if the system urandom is not considered
as initialized, urandom is able to generate "good enough" entropy. So
the hash secret is not easily predictable.

Maybe I should read Ted Tso's emails to elaborate this point ;-)


> Embedded devices
> ----------------
>
> A solution for embedded devices is to plug an hardware RNG.

Honestly, I'm not fully convinced by my own solution :-) I'm not sure
that all embedded devices are "extensible".

Victor

From victor.stinner at gmail.com  Thu Jun 23 17:51:06 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 23 Jun 2016 23:51:06 +0200
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <20160621185709.3ab50572.barry@wooz.org>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
Message-ID: <CAMpsgwbXe12Ukhi3Oz2UySnQ1YfTaROr2WuFj1EOyuD4yPqxdw@mail.gmail.com>

2016-06-22 0:57 GMT+02:00 Barry Warsaw <barry at python.org>:
> I would like to ask for some changes to this proto-PEP.
>
> At a minimum, I think a proper treatment of the alternative where os.urandom()
> remains (on Linux at least) a thin wrapper around /dev/urandom.  We would add
> os.getrandom() as the low-level interface to the new C lib function,

OK, done in version 2 of my PEP.


> and expose any higher level functionality in the secrets module if necessary.

I didn't add this point to the PEP. Tell me if it should be added.
Which kind of function do you imagine?

I wrote an example of a helper function which uses os.getrandom() or
falls back on os.urandom():
https://haypo-notes.readthedocs.io/pep_random.html#leave-os-urandom-unchanged-add-os-getrandom

You may reply on my PEPv2 directly ;-)

Victor

From victor.stinner at gmail.com  Thu Jun 23 18:03:56 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 24 Jun 2016 00:03:56 +0200
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <20160622203515.12e20601@anarchist.wooz.org>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
Message-ID: <CAMpsgwb0k+F-BtP1c7bXRGi-z=xfdVo1HmSdJTuOUnQFoGk-5A@mail.gmail.com>

2016-06-23 2:35 GMT+02:00 Barry Warsaw <barry at python.org>:
> Because the os module has traditionally surfaced lower-level operating system
> functions,

Well, that's not true. os.urandom() is a bad example since it has many
implementations depending on the platform.

https://haypo-notes.readthedocs.io/pep_random.html#leave-os-urandom-unchanged-add-os-getrandom
or
https://haypo-notes.readthedocs.io/pep_random.html#operating-system-random-functions


>  That's also why I
> advocate simplifying os.urandom() so that it reverts more or less to exposing
> /dev/urandom to Python.  With perhaps a few exceptions, os doesn't provide
> higher level APIs.

Hum, I modified os.urandom() to use getrandom() to avoid using the
private file descriptor and to not require the /dev/urandom device. Using
a file descriptor has many issues. Tell me if you need more details on
these issues.

In Python 3.5.2, os.urandom() uses getrandom() on Linux, and only
falls back on reading /dev/urandom if getrandom(GRND_NONBLOCK) fails
with EAGAIN.

I'm not sure that I understand you. Do you want to stop using getrandom()?

What about getrandom() on Solaris? And getentropy() on OpenBSD?

(And Windows uses CryptGenRandom() ;-))


> The point here is that, let's say you're an experienced Linux developer and
> you know you want to use getrandom(2) in Python.  os.getrandom() is exactly
> that.  It's completely analogous to why we provide, e.g. os.chroot() and such.

Even if we modify os.urandom() to make it blocking, adding
os.getrandom() makes sense.

getrandom() also allows reading /dev/random (not /dev/urandom) without
using a FD, and getrandom(GRND_NONBLOCK) gives access to the
non-blocking mode.
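
For example, assuming os.getrandom() is exposed together with the
GRND_RANDOM and GRND_NONBLOCK constants, a sketch could look like this
(read_random_pool is a made-up name, and returning None on failure is
just one possible choice):

```python
import os


def read_random_pool(n):
    """Try to read up to n bytes from the /dev/random pool, no FD needed.

    Returns None if os.getrandom() is missing or the call fails.
    """
    if not hasattr(os, "getrandom"):
        return None
    try:
        # GRND_RANDOM selects the /dev/random pool, GRND_NONBLOCK asks
        # the kernel to fail with EAGAIN instead of waiting.
        return os.getrandom(n, os.GRND_RANDOM | os.GRND_NONBLOCK)
    except OSError:
        # BlockingIOError (EAGAIN) or another error such as ENOSYS.
        return None
```

Note that the call may return fewer bytes than requested, so a real
caller would have to check the length of the result.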

Victor

From victor.stinner at gmail.com  Thu Jun 23 18:11:38 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 24 Jun 2016 00:11:38 +0200
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <20160623084843.5bbfe3bf@anarchist.wooz.org>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
Message-ID: <CAMpsgwZzBD8nq4cuiq2wnpYQZvU=cPPEFDCxFveJD1KZZZgESg@mail.gmail.com>

2016-06-23 14:48 GMT+02:00 Barry Warsaw <barry at python.org>:
> Once Python 3.6 is widely available, and/or secrets is backported and
> available on PyPI, why would you ever do that rather than just get the best
> source of randomness out of the secrets module?

Once we have modified Python 3.6 to correctly handle "the bug" and we
consider that the implementation is tested enough, I suggest
backporting it to Python 2.7 as well. Moreover, I would also suggest
backporting the change to Python 3.5; I would be sad if Python 2 were
more secure than the latest Python 3 release :-)

Victor

From victor.stinner at gmail.com  Thu Jun 23 18:54:38 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 24 Jun 2016 00:54:38 +0200
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
Message-ID: <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>

> The new exception would potentially be encountered in the following situations:
>
> * Python code calling these APIs during Linux system initialization

I'm not sure that there is such a use case in practice.

Can you please try to describe a use case where you would need
blocking system urandom *during the Python initialization*?

It looks like my use case 1, but I consider that os.urandom() is *not*
called in such a use case:
https://haypo-notes.readthedocs.io/pep_random.html#use-case-1-init-script


> * Python code running on improperly initialized Linux systems (e.g. embedded
>   hardware without adequate sources of entropy to seed the system random number
>   generator, or Linux VMs that aren't configured to accept entropy from the
>   VM host)

If the program doesn't use os.urandom(), well, we don't care, there is
no issue :-)

IMO the interesting use case is when the application really requires
a secure secret. That's my use case 2, a web server:
https://haypo-notes.readthedocs.io/pep_random.html#use-case-2-web-server

I chose not to give the choice to the developer and to block in such a
case. IMO it's acceptable because the application should not have to
wait forever for urandom.

> Changing ``os.urandom()`` on Linux
> ----------------------------------
>
> This PEP proposes that in Python 3.6+, ``os.urandom()`` be updated to call
> the new Linux ``getrandom()`` syscall in non-blocking mode if available and
> raise ``BlockingIOError: system random number generator is not ready`` if
> the kernel reports that the call would block.

To be clear, the behaviour is unchanged on other platforms, right?

I'm just trying to understand the scope of the PEP. It looks like, as
with mine, it is written for Linux. (Even if other platforms may
implement the same behaviour later, if needed.)

If it's deliberate to restrict to Linux, you may be more explicit at
least in the abstract.

--

By the way, are you aware of other programming languages or
applications using an exception when random would block? (It's not a
requirement, I'm just curious.)


> By contrast, if ``BlockingIOError`` is raised in those situations, then
> developers using Python 3.6+ can easily choose their desired behaviour:
>
> 1. Loop until the call succeeds (security sensitive)

Is this case different from a blocking os.urandom()?


> 2. Switch to using the random module (non-security sensitive)

Hum, I disagree on this point. I don't think that you should start
with os.urandom() and then fall back on random.

In fact, I only know *one* use case for this: create the random.Random
instance when the random module is imported.

In my PEP, I proposed to have a special case for random.Random
constructor, implemented in C (to not have to expose anything at the
Python level).


> 3. Switch to reading ``/dev/urandom`` directly (non-security sensitive)

It is what I propose for the random.Random constructor when the random
module is imported.

Again, the question is whether there is a real use case for it. And if
yes, is the use case common enough to justify the change?


The extreme case is that all applications using os.urandom() would
need to be modified to add a try/except BlockingIOError. I only
exaggerate to try to understand the impact of your PEP. I think that
only a few applications will use such try/except in practice.

As I tried to explain in my PEP, with Python 3.5.2, "the bug" (block
on random) became very unlikely.


> Issuing a warning for potentially predictable internal hash initialization

I don't recall Python logging warnings for similar issues. But I don't
recall similar issues either :-)


> The challenge for internal hash initialization is that it might be very
> important to initialize SipHash with a reliably unpredictable random seed
> (for processes that are exposed to potentially hostile input) or it might be
> totally unimportant (for processes that never have to deal with untrusted data).

From what I read, /dev/urandom is good even before it is considered as
initialized, because the kernel collects various data, but doesn't
increase the entropy estimator.

I'm not completely convinced that a warning is needed. I'm not against
it either. I am doubtful. :-)

Well, let's say that we have a warning. What should the user do in
such a case? Is it advice to dig into the urandom issue and try to get
more entropy?

The warning is for users, no? I imagine that an application can work
perfectly for the developer, but only emit the warning for some users
depending on how they deploy their application.


> However, at the same time, since Python has no way to know whether any given
> invocation needs to handle untrusted data, when the default SipHash
> initialization fails this *might* indicate a genuine security problem, which
> should not be allowed to pass silently.

An alternative would be to provide a read-only flag which would
indicate if the hash secret is considered as "secure" or not.

Applications concerned by security would check the flag and decide
themselves whether to emit a warning or not.


> Accordingly, if internal hash initialization needs to fall back to a potentially
> predictable seed due to the system random number generator not being ready, it
> will also emit a warning message on ``stderr`` to say that the system random
> number generator is not available and that processing potentially hostile
> untrusted data should be avoided.

I know that many of you disagree with me, but I'm not sure that the
hash DoS is an important issue.

We should not overestimate the importance of this vulnerability.


> Affected security sensitive applications
> ----------------------------------------
>
> Security sensitive applications would need to either change their system
> configuration so the application is only started after the operating system
> random number generator is ready for security sensitive operations, or else
> change their code to busy loop until the operating system is ready::
>
>     def blocking_urandom(num_bytes):
>         while True:
>             try:
>                 return os.urandom(num_bytes)
>             except BlockingIOError:
>                 pass

Such a busy loop may use a lot of CPU :-/ You need a time.sleep() or
something like that, no?

A blocking os.urandom() doesn't have such an issue ;-)
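
Just to illustrate what I mean by adding a sleep, here is a rough sketch.
Nothing here is part of either PEP: the name wait_for_urandom, the
"read" parameter (only there to make the function easy to test) and the
delay values are all made up for the example.

```python
import os
import time


def wait_for_urandom(num_bytes, read=os.urandom, delay=0.1, max_delay=5.0):
    """Retry read() until it stops raising BlockingIOError.

    Sleeps between attempts instead of spinning, doubling the delay
    each time up to max_delay seconds.
    """
    while True:
        try:
            return read(num_bytes)
        except BlockingIOError:
            time.sleep(delay)
            delay = min(delay * 2, max_delay)
```

It still polls, so a blocking call in the kernel remains simpler, but at
least it doesn't burn a whole CPU while waiting.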

Is it possible that os.urandom() works, but the following os.urandom()
call raises a BlockingIOError? If yes, there is an issue with "partial
read", we should use a dedicated exception to return partial data.

Fortunately, as I understand it, the issue doesn't occur in practice.
os.urandom() starts with BlockingIOError. But once it "works", it will
work forever. Well, at least on Linux.

I don't know how Solaris behaves. I hope that it behaves like Linux
(once it works, it always works). At least, I see that Solaris
getrandom() can also fail with EAGAIN.


> Affected non-security sensitive applications
> --------------------------------------------
>
> Non-security sensitive applications that don't want to assume access to
> ``/dev/urandom`` (or assume a non-blocking implementation of that device)
> can be updated to use the ``random`` module as a fallback option::
>
>     def pseudorandom_fallback(num_bytes):
>         try:
>             return os.urandom(num_bytes)
>         except BlockingIOError:
>             return random.getrandbits(num_bytes*8).to_bytes(num_bytes, "little")
>
> Depending on the application, it may also be appropriate to skip accessing
> ``os.urandom`` at all, and instead rely solely on the ``random`` module.

Hum, I dislike such change. It overcomplicates applications for a corner-case.

If you use os.urandom(), you already expect security. I prefer to
simplify use cases to two cases: (1) you really need security (2) you
really don't care about security. If you don't care, use the
random module directly. Don't bother with os.urandom() or with adding
try/except BlockingIOError. No?

I *hope* that a regular application will never see BlockingIOError on
os.urandom() in the wild.


> Affected Linux specific non-security sensitive applications
> -----------------------------------------------------------
>
> Non-security sensitive applications that don't need to worry about cross
> platform compatibility and are willing to assume that ``/dev/urandom`` on
> Linux will always retain its current behaviour can be updated to access
> ``/dev/urandom`` directly::
>
>     def dev_urandom(num_bytes):
>         with open("/dev/urandom", "rb") as f:
>             return f.read(num_bytes)

Again, I'm against adding such complexity for a corner case. Just use
os.urandom().


> For additional background details beyond those captured in this PEP, also see
> Victor Stinner's summary at http://haypo-notes.readthedocs.io/pep_random.html

Oh, I didn't expect to have references to my document :-) I moved it to:
https://haypo-notes.readthedocs.io/summary_python_random_issue.html

http://haypo-notes.readthedocs.io/pep_random.html is now really a PEP ;-)

Victor

From ncoghlan at gmail.com  Thu Jun 23 20:33:22 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 23 Jun 2016 17:33:22 -0700
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
 <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
Message-ID: <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>

On 23 June 2016 at 15:54, Victor Stinner <victor.stinner at gmail.com> wrote:
>> The new exception would potentially be encountered in the following situations:
>>
>> * Python code calling these APIs during Linux system initialization
>
> I'm not sure that there is such use case in practice.
>
> Can you please try to describe an use case where you would need
> blocking system urandom *during the Python initialization*?
>
> It looks like my use case 1, but I consider that os.urandom() is *not*
> called on such use case:
> https://haypo-notes.readthedocs.io/pep_random.html#use-case-1-init-script

My preference for an exception comes from the fact that we can never
prove the non-existence of proprietary software that does certain
things, but we *can* ensure that such code gets an easy to debug
exception rather than a potential deadlock if it does exist.

The argument chain runs:

- if such software doesn't exist, it doesn't matter which behaviour we choose
- if we're wrong and it does exist, we can choose how it fails:
  - blocking (with associated potential for init system deadlock)
  - throwing an exception

Given the choice between debugging an apparent system hang and an
unexpected exception when testing against a new version of a platform,
I'll choose the exception every time.

>> * Python code running on improperly initialized Linux systems (e.g. embedded
>>   hardware without adequate sources of entropy to seed the system random number
>>   generator, or Linux VMs that aren't configured to accept entropy from the
>>   VM host)
>
> If the program doesn't use os.urandom(), well, we don't care, there is
> no issue :-)
>
> IMO the interesting use case is when the application really requires
> secure secret. That's my use case 2, a web server:
> https://haypo-notes.readthedocs.io/pep_random.html#use-case-2-web-server
>
> I chose to not give the choice to the developer and block on such
> case. IMO it's accepable because the application should not have to
> wait forever for urandom.

Should not, but actually can, depending on the characteristics of the
underlying system and its runtime environment.

>> Changing ``os.urandom()`` on Linux
>> ----------------------------------
>>
>> This PEP proposes that in Python 3.6+, ``os.urandom()`` be updated to call
>> the new Linux ``getrandom()`` syscall in non-blocking mode if available and
>> raise ``BlockingIOError: system random number generator is not ready`` if
>> the kernel reports that the call would block.
>
> To be clear, the behaviour is unchanged on other platforms, right?

Cory Benfield pointed out that the proposal as currently written isn't
clear as to whether or not it applies to recent versions of Solaris
and Illumos, as they also provide a getrandom() syscall.

> I'm just trying to understand the scope of the PEP. It looks like as
> mine, it is written for Linux. (Even if other platforms may implement
> the same behaviour later, if needed.)
>
> If it's deliberate to restrict to Linux, you may be more explicit at
> least in the abstract.

It's in the PEP title: "Allow BlockingIOError in security sensitive
APIs on Linux"

However, I need to update it to indicate it applies to any system that
provides a non-blocking getrandom() syscall.

> --
>
> By the way, are you aware of other programming languages or
> applications using an exception when random would block? (It's not a
> requirement, I'm just curious.)

No, but I haven't really gone looking either. It's also worth keeping
in mind that it's only in the last 12 months folks have even had the
*option* of doing better than just reading from /dev/urandom and
hoping it's been initialised properly.

>> By contrast, if ``BlockingIOError`` is raised in those situations, then
>> developers using Python 3.6+ can easily choose their desired behaviour:
>>
>> 1. Loop until the call succeeds (security sensitive)
>
> Is this case different from a blocking os.urandom()?

Yes, as it's up to the application to decide when it wants to check
for the system RNG being ready, and how it wants to report that to the
user. For example, it may decide to emit a runtime warning before it
enters the busy loop (I'm actually having a discussion with Donald in
another thread regarding a possible design for a
"secrets.wait_for_system_rng()" API that meshes well with the other
changes proposed in PEP 522).

>> 2. Switch to using the random module (non-security sensitive)
>
> Hum, I disagree on this point. I don't think that you should start
> with os.urandom() to fallback on random.
>
> In fact, I only know *one* use case for this: create the random.Random
> instance when the random module is imported.
>
> In my PEP, I proposed to have a special case for random.Random
> constructor, implemented in C (to not have to expose anything at the
> Python level).

We have two use cases for a fallback just in the standard library
(SipHash initialisation and random module initialisation). Rather than
assuming no other use cases for the feature exist, we can expose the
fallback mechanism we use ourselves and let people decide for
themselves whether or not they want to do something similar.

>> 3. Switch to reading ``/dev/urandom`` directly (non-security sensitive)
>
> It is what I propose for the random.Random constructor when the random
> module is imported.
>
> Again, the question is if there is a real use case for it. And if yes,
> if the use case common enough to justify the change?
>
> The extreme case is that all applications using os.urandom() would
> need to be modifiy to add a try/except BlockingIOError. I only
> exagerate to try to understand the impact of your PEP. I only that
> only a few applications will use such try/except in practice.

That's where the idea of also adding secrets.wait_for_system_rng()
comes in, rather than having to wrap every library call in a try/except
block (or risk having those APIs become blocking ones such that async
developers feel obliged to call them in a separate thread).

> As I tried to explain in my PEP, with Python 3.5.2, "the bug" (block
> on random) became very unlikely.

Aye, I agree with that (hence the references to this being an obscure,
Linux-specific problem in PEP 522). However, I think it makes sense to
stipulate that someone porting to Python 3.6 *has* unexpectedly
encountered the new behaviour, and is trying to debug what has gone
wrong with their application/system when comparing the two designs for
usability.

>> Issuing a warning for potentially predictable internal hash initialization
>
> I don't recall Python logging warnings for similar issues. But I don't
> recall similar issues neither :-)

It's a pretty unique problem, and not one we've been able to detect
in the past.

>> The challenge for internal hash initialization is that it might be very
>> important to initialize SipHash with a reliably unpredictable random seed
>> (for processes that are exposed to potentially hostile input) or it might be
>> totally unimportant (for processes that never have to deal with untrusted data).
>
> From what I read, /dev/urandom is good even before it is considered as
> initialized, because the kernel collects various data, but don't
> increase the entropy estimator.
>
> I'm not completely convinced that a warning is needed. I'm not against
> it neither. I am doubtful. :-)
>
> Well, let's say that we have a warning. What should the user do in
> such case? Is it an advice to dig the urandom issue and try to get
> more entropy?
>
> The warning is for users, no? I imagine that an application can work
> perfectly for the developer, but only emit the warning for some users
> depending how the deploy their application.

It's a warning primarily for system integrators (i.e. the folks
developing a distro, designing an embedded device or configuring a VM)
that they need to either:

- reconfigure the application to start later in the boot process (e.g.
after the network comes up)
- write a systemd ExecStartPre snippet that waits for the system RNG to be
initialised (that will be particularly easy if it can be written as
"python3 -c 'import secrets; secrets.wait_for_system_rng()'")
- add a better entropy source to their system

The kind of wording I'm thinking of is along the lines of:

"Python hash initialization: using potentially predictable fallback
hash seed; avoid handling untrusted potentially hostile data in this
process"

>> However, at the same time, since Python has no way to know whether any given
>> invocation needs to handle untrusted data, when the default SipHash
>> initialization fails this *might* indicate a genuine security problem, which
>> should not be allowed to pass silently.
>
> An alternative would be to provide a read-only flag which would
> indicate if the hash secret is considered as "secure" or not.
>
> Applications considered by security would check the flag and decide
> themself to emit a warning or not.

I really don't want to add any more knobs and dials that need to be
documented and learned if we can possibly avoid it (and I think we
can).

In this case, turning off hash randomisation entirely will suppress
the warning along with hash randomisation itself.

>> Accordingly, if internal hash initialization needs to fall back to a potentially
>> predictable seed due to the system random number generator not being ready, it
>> will also emit a warning message on ``stderr`` to say that the system random
>> number generator is not available and that processing potentially hostile
>> untrusted data should be avoided.
>
> I know that many of you disagree with me, but I'm not sure that the
> hash DoS is an important issue.
>
> We should not overestimate the importance of this vulnerability.

It was never particularly important (the payload multiplier on the
Denial-of-Service isn't that big), but it was high profile and
splashy, and it's relatively cheap to take into account (since folks
that know it doesn't apply to them can still turn randomization off
entirely)

>> Affected security sensitive applications
>> ----------------------------------------
>>
>> Security sensitive applications would need to either change their system
>> configuration so the application is only started after the operating system
>> random number generator is ready for security sensitive operations, or else
>> change their code to busy loop until the operating system is ready::
>>
>>     def blocking_urandom(num_bytes):
>>         while True:
>>             try:
>>                 return os.urandom(num_bytes)
>>             except BlockingIOError:
>>                 pass
>
> Such busy-loop may use a lot of CPU :-/ You need a time.sleep() or
> something like that, no?

Maybe - we can work out the exact details once I've added the
secrets.wait_for_system_rng() proposal to the PEP.
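
For reference, a minimal sketch of the quoted helper with a sleep added, so
the loop polls instead of spinning. It assumes the PEP 522 behaviour under
discussion (the read raising BlockingIOError while the system RNG is not
ready); the poll_interval and read parameters are illustrative additions,
not part of any proposal.

```python
import os
import time

def blocking_urandom(num_bytes, poll_interval=0.001, read=os.urandom):
    """Retry until the system RNG is ready, sleeping between attempts.

    `read` defaults to os.urandom; it is a parameter only so the retry
    logic can be exercised with a stand-in that raises BlockingIOError.
    """
    while True:
        try:
            return read(num_bytes)
        except BlockingIOError:
            # Yield the CPU instead of busy-looping.
            time.sleep(poll_interval)
```

On a system whose RNG is already seeded, the first attempt succeeds and the
sleep is never reached.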

> A blocking os.urandom() doesn't have such issue ;-)

It also doesn't let an app fail gracefully if it opts not to support
running without a pre-initialised system RNG :)

> Is it possible that os.urandom() works, but the following os.urandom()
> call raises a BlockingIOError? If yes, there is an issue with "partial
> read": we should use a dedicated exception to return partial data.

No, it's not possible with os.urandom(). (It *can* happen with
/dev/random and with getentropy() on OpenBSD and Solaris, which is why
folks say "don't use those for anything")

> Hopefully, I understood that the issue doesn't occur in practice.
> os.urandom() starts with BlockingIOError. But once it "works", it will
> work forever. Well, at least on Linux.
>
> I don't know how Solaris behaves. I hope that it behaves as Linux
> (once it works, it always works). At least, I see that Solaris
> getrandom() can also fail with EAGAIN.

It's the same logic as Linux (once a CSPRNG is properly seeded it can
never run out of entropy, but seeding it in the first place does
require entropy collection)

>> Affected non-security sensitive applications
>> --------------------------------------------
>>
>> Non-security sensitive applications that don't want to assume access to
>> ``/dev/urandom`` (or assume a non-blocking implementation of that device)
>> can be updated to use the ``random`` module as a fallback option::
>>
>>     def pseudorandom_fallback(num_bytes):
>>         try:
>>             return os.urandom(num_bytes)
>>         except BlockingIOError:
>>             return random.getrandbits(num_bytes*8).to_bytes(num_bytes, "little")
>>
>> Depending on the application, it may also be appropriate to skip accessing
>> ``os.urandom`` at all, and instead rely solely on the ``random`` module.
>
> Hum, I dislike such change. It overcomplicates applications for a corner-case.
>
> If you use os.urandom(), you already expect security. I prefer to
> simplify use cases to two cases: (1) you really need security (2) you
> really don't care of security. If you don't care, use directly the
> random module. Don't bother with os.urandom() nor having to add
> try/except BlockingIOError. No?
>
> I *hope* that a regular application will never see BlockingIOError on
> os.urandom() in the wild.

Yeah, hence why I'm shifting more in favour of the
secrets.wait_for_system_rng() idea (which folks can then use as
inspiration to write their own "wait for the system RNG" helpers for
earlier Python and operating system versions)

>> Affected Linux specific non-security sensitive applications
>> -----------------------------------------------------------
>>
>> Non-security sensitive applications that don't need to worry about cross
>> platform compatibility and are willing to assume that ``/dev/urandom`` on
>> Linux will always retain its current behaviour can be updated to access
>> ``/dev/urandom`` directly::
>>
>>     def dev_urandom(num_bytes):
>>         with open("/dev/urandom", "rb") as f:
>>             return f.read(num_bytes)
>
> Again, I'm against adding such complexity for a corner case. Just use
> os.urandom().

All of this would be triggered by *application* developers actually
hitting the BlockingIOError and deciding it was the appropriate course
of action for *their* application. The point of this part of the
PEP is to highlight that there are some really simple 3-5 line functions
that let developers get a wide variety of behaviours in ways that are
compatible with single-source Python 2/3 code.

>> For additional background details beyond those captured in this PEP, also see
>> Victor Stinner's summary at http://haypo-notes.readthedocs.io/pep_random.html
>
> Oh, I didn't expect to have references to my document :-) I moved it to:
> https://haypo-notes.readthedocs.io/summary_python_random_issue.html
>
> http://haypo-notes.readthedocs.io/pep_random.html is now really a PEP ;-)

Cool, I'll update the first reference and also add a reference to your
draft PEP.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From donald at stufft.io  Thu Jun 23 20:46:46 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 23 Jun 2016 20:46:46 -0400
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
 <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
 <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
Message-ID: <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>


> On Jun 23, 2016, at 8:33 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> The argument chain runs:
> 
> - if such software doesn't exist, it doesn't matter which behaviour we choose
> - if we're wrong and it does exist, we can choose how it fails:
>  - blocking (with associated potential for init system deadlock)
>  - throwing an exception
> 
> Given the choice between debugging an apparent system hang and an
> unexpected exception when testing against a new version of a platform,
> I'll choose the exception every time.


I think the biggest argument to blocking is that there really exist two sorts of situations that blocking can happen in:

* It blocks for a tiny amount (maybe <1s) and nobody ever notices and people feel like things "just work".
* It blocks for a long amount of time (possibly forever depending on where in the boot sequence Python is being used) and it hangs for a long time (or forever).

In the second case I think it's pretty obvious that an exception is better than hanging forever, but in the first case an exception might actually cause people to go out of their way to do something bad to "stop the pain". My personal preference is waffling back and forth between them based on which of the two above I feel are more likely to occur in practice.

--
Donald Stufft


From ncoghlan at gmail.com  Thu Jun 23 20:56:21 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 23 Jun 2016 17:56:21 -0700
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <CADiSq7fgoW9Ycp5rX6wSSDra=-FuhgUkR_sQ1AA8v7gB1uZ2gQ@mail.gmail.com>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
 <4282382E-473E-4C83-89D5-BF36B2D8F8D5@stufft.io>
 <CADiSq7e38cn7pvYNgsH=VxHAQpznHrn5ATOx4sDWvQwhArO1gQ@mail.gmail.com>
 <CADiSq7fpqy-vJJv7VKWut2c-vybgNg3zWK0pNA_rXu7XW9v46w@mail.gmail.com>
 <FD538521-BF4C-43D6-8475-B9587957D0B3@stufft.io>
 <CADiSq7fgoW9Ycp5rX6wSSDra=-FuhgUkR_sQ1AA8v7gB1uZ2gQ@mail.gmail.com>
Message-ID: <CADiSq7ehS0d003t3iQ9eoozSAZY=H5PzE1artY30ZONX9+3C5Q@mail.gmail.com>

On 23 June 2016 at 11:38, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Although now I'm wondering whether it might be worth proposing a
> "secrets.wait_for_system_rng()" API as part of PEP 522, with the
> following implementation:
>
>     def wait_for_system_rng():
>         # Avoid the below busy loop if possible
>         try:
>             block_on_system_rng = open("/dev/random", "rb")
>         except FileNotFoundError:
>             pass
>         else:
>             with block_on_system_rng:
>                 block_on_system_rng.read(1)
>         # Busy loop until the system RNG is ready
>         while True:
>             try:
>                 os.urandom(1)
>                 break
>             except BlockingIOError:
>                 pass

I realised even this more complex variant still has a subtle bug: due
to the way /dev/random works, it can block inappropriately if Python
is started after the system RNG has already been seeded. That means a
completely correct implementation (assuming the rest of PEP 522 was in
place) would look more like this:

    def wait_for_system_rng():
        # If the system RNG is already seeded, don't wait at all
        try:
            os.urandom(1)
            return
        except BlockingIOError:
            pass
        # Avoid the below busy loop if possible
        try:
            block_on_system_rng = open("/dev/random", "rb")
        except FileNotFoundError:
            pass
        else:
            with block_on_system_rng:
                block_on_system_rng.read(1)
        # Busy loop until the system RNG is ready
        while True:
            try:
                os.urandom(1)
                break
            except BlockingIOError:
                # Only check once per millisecond
                time.sleep(0.001)

So I'll update PEP 522 to include this as part of the proposal - it's
trickier to get right than I thought, and it provides an additional
hook to help explain that the system RNG is something that once
initialized, stays initialized, so waiting for it is best handled as
an application level and system configuration concern rather than on
each call to os.urandom().

It also enables a pretty neat ExecStartPre [1] trick in systemd unit files:

    ExecStartPre=/usr/bin/python3 -c "import secrets; secrets.wait_for_system_rng()"

to make an arbitrary service wait until the system RNG is ready before it runs.

Cheers,
Nick.

[1] https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStartPre=

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ethan at stoneleaf.us  Thu Jun 23 20:58:23 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 23 Jun 2016 17:58:23 -0700
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
 <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
 <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
 <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>
Message-ID: <576C85AF.7010506@stoneleaf.us>

On 06/23/2016 05:46 PM, Donald Stufft wrote:
>> On Jun 23, 2016, at 8:33 PM, Nick Coghlan wrote:
>>
>> The argument chain runs:
>>
>> - if such software doesn't exist, it doesn't matter which behaviour we choose
>> - if we're wrong and it does exist, we can choose how it fails:
>>   - blocking (with associated potential for init system deadlock)
>>   - throwing an exception
>>
>> Given the choice between debugging an apparent system hang and an
>> unexpected exception when testing against a new version of a platform,
>> I'll choose the exception every time.
>
> I think the biggest argument to blocking is that there really exist two sorts of situations that blocking can happen in:
>
> * It blocks for a tiny amount (maybe <1s) and nobody ever notices and people feel like things "just work".
> * It blocks for a long amount of time (possibly forever depending on where in the boot sequence Python is being used) and it hangs for a long time (or forever).
>
> In the second case I think it's pretty obvious that an exception is better than hanging forever, but in the first case an exception might actually cause people to go out of their way to do something bad to "stop the pain". My personal preference is waffling back and forth between them based on which of the two above I feel are more likely to occur in practice.

Can we build in a small wait?  As in, check every second for ten seconds 
and if we still don't have entropy then raise?
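
A sketch of what that bounded wait could look like, assuming the PEP 522
behaviour where the read raises BlockingIOError until the system RNG is
ready. The function name and the attempts, interval and read parameters are
all hypothetical.

```python
import os
import time

def urandom_with_timeout(num_bytes, attempts=10, interval=1.0, read=os.urandom):
    """Try `read` up to `attempts` times, sleeping `interval` seconds between tries.

    Re-raises BlockingIOError if the last attempt still fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return read(num_bytes)
        except BlockingIOError:
            if attempt == attempts - 1:
                raise  # Out of retries: let the caller see the failure
            time.sleep(interval)
```

The defaults give roughly the "every second for ten seconds" behaviour: ten
attempts with nine one-second sleeps between them.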

--
~Ethan~


From ncoghlan at gmail.com  Thu Jun 23 21:40:06 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 23 Jun 2016 18:40:06 -0700
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
 <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
 <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
 <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>
Message-ID: <CADiSq7ei42jNfGtOKNCoaKc-BoZNscTQLK41+8E9EkvYsGONvA@mail.gmail.com>

On 23 June 2016 at 17:46, Donald Stufft <donald at stufft.io> wrote:
>
>> On Jun 23, 2016, at 8:33 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> The argument chain runs:
>>
>> - if such software doesn't exist, it doesn't matter which behaviour we choose
>> - if we're wrong and it does exist, we can choose how it fails:
>>  - blocking (with associated potential for init system deadlock)
>>  - throwing an exception
>>
>> Given the choice between debugging an apparent system hang and an
>> unexpected exception when testing against a new version of a platform,
>> I'll choose the exception every time.
>
>
> I think the biggest argument to blocking is that there really exist two sorts of situations that blocking can happen in:
>
> * It blocks for a tiny amount (maybe <1s) and nobody ever notices and people feel like things "just work".
> * It blocks for a long amount of time (possibly forever depending on where in the boot sequence Python is being used) and it hangs for a long time (or forever).
>
> In the second case I think it's pretty obvious that an exception is better than hanging forever, but in the first case an exception might actually cause people to go out of their way to do something bad to "stop the pain". My personal preference is waffling back and forth between them based on which of the two above I feel are more likely to occur in practice.

That's fair, and it's a large part of why I realised PEP 522 needed a
standard library answer for "just wait until the system RNG is ready,
please".

I'll also note that I'm open to being convinced that it's OK for
"import secrets" to be that answer - my main argument against it is
just a general principle that imports shouldn't have side effects, and
blocking waiting for an external state change is a side effect.

Standing against that is the argument that we wouldn't want the
recommended idiom for using the secrets module to become the
boilerplatish:

    import secrets
    secrets.wait_for_system_rng()

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From donald at stufft.io  Thu Jun 23 21:47:27 2016
From: donald at stufft.io (Donald Stufft)
Date: Thu, 23 Jun 2016 21:47:27 -0400
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <CADiSq7ei42jNfGtOKNCoaKc-BoZNscTQLK41+8E9EkvYsGONvA@mail.gmail.com>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
 <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
 <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
 <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>
 <CADiSq7ei42jNfGtOKNCoaKc-BoZNscTQLK41+8E9EkvYsGONvA@mail.gmail.com>
Message-ID: <E827FA58-E60C-46FA-81F1-4E308BC41FD8@stufft.io>


> On Jun 23, 2016, at 9:40 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> On 23 June 2016 at 17:46, Donald Stufft <donald at stufft.io> wrote:
>> 
>>> On Jun 23, 2016, at 8:33 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>> 
>>> The argument chain runs:
>>> 
>>> - if such software doesn't exist, it doesn't matter which behaviour we choose
>>> - if we're wrong and it does exist, we can choose how it fails:
>>> - blocking (with associated potential for init system deadlock)
>>> - throwing an exception
>>> 
>>> Given the choice between debugging an apparent system hang and an
>>> unexpected exception when testing against a new version of a platform,
>>> I'll choose the exception every time.
>> 
>> 
>> I think the biggest argument to blocking is that there really exist two sorts of situations that blocking can happen in:
>> 
>> * It blocks for a tiny amount (maybe <1s) and nobody ever notices and people feel like things "just work".
>> * It blocks for a long amount of time (possibly forever depending on where in the boot sequence Python is being used) and it hangs for a long time (or forever).
>> 
>> In the second case I think it's pretty obvious that an exception is better than hanging forever, but in the first case an exception might actually cause people to go out of their way to do something bad to "stop the pain". My personal preference is waffling back and forth between them based on which of the two above I feel are more likely to occur in practice.
> 
> That's fair, and it's a large part of why I realised PEP 522 needed a
> standard library answer for "just wait until the system RNG is ready,
> please".
> 
> I'll also note that I'm open to being convinced that it's OK for
> "import secrets" to be that answer - my main argument against it is
> just a general principle that imports shouldn't have side effects, and
> blocking waiting for an external state change is a side effect.
> 
> Standing against that is the argument that we wouldn't want the
> recommended idiom for using the secrets module to become the
> boilerplatish:
> 
>    import secrets
>    secrets.wait_for_system_rng()
> 

An alternative here is to just make every function in secrets ensure it waits for the system RNG, possibly by calling said wait_for_system_rng() function if we still think it's worth it to make it a public API, with a global that gets set once it's been recorded once.

The fallback to /dev/random may be a bad idea though: even if it's only done once per process, I can imagine a case where someone is using ephemeral processes so they end up hitting /dev/random regularly. Using getrandom() for this is fine because that state is per machine not per process, but the Python level "has RNG been initialized" is per process so that could end up with an unintended side effect of hitting /dev/random a lot.
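
To make the "global that gets set once" idea concrete, here is a small
sketch. It assumes the PEP 522 behaviour where os.urandom() raises
BlockingIOError until the system RNG is ready; the names _system_rng_ready
and the read parameter are hypothetical, and this is not how the secrets
module is actually implemented.

```python
import os
import time

_system_rng_ready = False  # Per-process cache (hypothetical name)

def wait_for_system_rng(poll_interval=0.001, read=os.urandom):
    """Block until one probe read succeeds, then remember that it did."""
    global _system_rng_ready
    if _system_rng_ready:
        return  # Already confirmed in this process: no further probing
    while True:
        try:
            read(1)
        except BlockingIOError:
            time.sleep(poll_interval)
        else:
            _system_rng_ready = True
            return

def token_bytes(num_bytes=32, read=os.urandom):
    """Return random bytes, waiting for the system RNG first if needed."""
    wait_for_system_rng(read=read)
    return read(num_bytes)
```

Because the flag lives in the process, each new process pays for one probe
read; this sketch deliberately probes via the urandom-style read rather
than /dev/random.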


--
Donald Stufft


From ncoghlan at gmail.com  Thu Jun 23 22:01:49 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 23 Jun 2016 19:01:49 -0700
Subject: [Security-sig] PEP: Make os.urandom() blocking on Linux
 (version 2)
In-Reply-To: <CAMpsgwbHRqFD3M13_EKncQA-LJporzXZAAXGUPQQsOF4=PvsMg@mail.gmail.com>
References: <CAMpsgwbHRqFD3M13_EKncQA-LJporzXZAAXGUPQQsOF4=PvsMg@mail.gmail.com>
Message-ID: <CADiSq7fnwJOCyVAZwRDx4aT+QtnaGOFqDw2Dnuu9jhF_paNN4A@mail.gmail.com>

On 23 June 2016 at 14:27, Victor Stinner <victor.stinner at gmail.com> wrote:
> Raise BlockingIOError in os.urandom()
> -------------------------------------
>
> This idea was proposed as a compromise to let developers decide themselves
> how to handle the case:
>
> * catch the exception and use another, weaker entropy source: read
>   ``/dev/urandom`` on Linux, the Python ``random`` module (which is not
>   secure at all), time, process identifier, etc.
> * don't catch the error, the whole program fails with this fatal
>   exception
>
> First of all, no user has complained yet that ``os.urandom()`` blocks. This
> point is currently theoretical. The Python issues #25420 and #26839 were
> restricted to the Python startup: users complained that Python was
> blocked at startup.
>
> Even if reading /dev/urandom blocks on OpenBSD, FreeBSD, Mac OS X, etc.
> until urandom is initialized, no user has complained yet because Python is
> not used in the process initializing the system and /dev/urandom is
> quickly initialized.  It looks like only Linux users hit the problem on
> virtual machines or embedded devices, and only in some short Python
> scripts used to initialize the system. Again, ``os.urandom()`` is
> not used in such scripts (at least, not yet).
>
> As with `Leave os.urandom() unchanged, add os.getrandom()`_, the problem is
> that it makes the API more complex and so more error-prone.

I have to admit, this is a pretty solid argument, especially if you
supplement it with Donald's point that affected scripts and
applications will likely split into "doesn't even notice that implicit
delay" and "hangs the world after switching to Python 3.6, but the
developer/integrator sees 'calling os.urandom() may hang the world on
Linux system boot' in the Python 3.6 porting notes".

I'll still keep iterating on PEP 522, but I'm to the point of being +0
on this approach if Guido decides he prefers it :)

Cheers,
Nick.

P.S. DevNation/Red Hat Summit are on next week, so I'll try to get one
more version of PEP 522 done before I leave, but will likely be busy
for most of that time.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Thu Jun 23 22:08:02 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 23 Jun 2016 19:08:02 -0700
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <E827FA58-E60C-46FA-81F1-4E308BC41FD8@stufft.io>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
 <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
 <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
 <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>
 <CADiSq7ei42jNfGtOKNCoaKc-BoZNscTQLK41+8E9EkvYsGONvA@mail.gmail.com>
 <E827FA58-E60C-46FA-81F1-4E308BC41FD8@stufft.io>
Message-ID: <CADiSq7ey=qnC8bwZmT0y3NHAveUpskZ1qCLN34CH+o-ioyO7jw@mail.gmail.com>

On 23 June 2016 at 18:47, Donald Stufft <donald at stufft.io> wrote:
>> Standing against that is the argument that we wouldn't want the
>> recommended idiom for using the secrets module to become the
>> boilerplatish:
>>
>>    import secrets
>>    secrets.wait_for_system_rng()
>>
>
> An alternative here is to just make every function in secrets ensure it waits for the system RNG, possibly by calling said wait_for_system_rng() function if we still think it's worth it to make it a public API, with a global that gets set once it's been recorded once.

While we could definitely do that, I think the complexity of it would
push me towards Victor's "just make os.urandom potentially blocking at
system startup" proposal. If 522 is going to make sense, I think it
needs to be framed in a way that makes blocking for the system RNG
clearly an at-most-once-per-process activity.

> The fallback to /dev/random may be a bad idea though: even if it's only done once per process, I can imagine a case where someone is using ephemeral processes so they end up hitting /dev/random regularly. Using getrandom() for this is fine because that state is per machine not per process, but the Python level "has RNG been initialized" is per process so that could end up with an unintended side effect of hitting /dev/random a lot.

That's the bug that led to me changing the suggested code to try
os.urandom() once first, before falling back to blocking on
/dev/random. Once the system RNG is ready, that first call will always
succeed, no matter how many new processes you start.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at gmail.com  Fri Jun 24 07:25:28 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 24 Jun 2016 13:25:28 +0200
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
 <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
 <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
Message-ID: <CAMpsgwY+=H_Ppfo1KQ9G+_OA4Z5jMj97UpdFRXrPfkirHUWpPA@mail.gmail.com>

2016-06-24 2:33 GMT+02:00 Nick Coghlan <ncoghlan at gmail.com>:
>>> 3. Switch to reading ``/dev/urandom`` directly (non-security sensitive)
>>
>> It is what I propose for the random.Random constructor when the random
>> module is imported.
>>
>> Again, the question is if there is a real use case for it. And if yes,
>> is the use case common enough to justify the change?
>>
>> The extreme case is that all applications using os.urandom() would
>> need to be modified to add a try/except BlockingIOError. I only
>> exaggerate to try to understand the impact of your PEP. I expect that
>> only a few applications will use such try/except in practice.
>
> That's where the idea of also adding secrets.wait_for_system_rng()
> comes in, rather than having to wrap every library call in a try/except
> block (or risk having those APIs become blocking ones such that async
> developers feel obliged to call them in a separate thread)

I expect that secrets.wait_for_system_rng() will be implemented as
consuming at least 1 byte of entropy, to check if urandom is
initialized, right?

I'm not a big fan of this API: an os.urandom() that never blocks, plus a
secrets.wait_for_system_rng() helper.

If you say that some users need to call secrets.wait_for_system_rng()
first, for me there is a use case for blocking urandom. So I would
expect a blocking urandom function in the os module directly.

By the way, it would avoid "wasting" 1 random byte of entropy.

Victor

From victor.stinner at gmail.com  Fri Jun 24 07:34:15 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 24 Jun 2016 13:34:15 +0200
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
 <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
 <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
 <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>
Message-ID: <CAMpsgwa6CTiEWLuhN8fKuX8n21wDJJ1=04WaH0RouQDG2gvgqg@mail.gmail.com>

2016-06-24 2:46 GMT+02:00 Donald Stufft <donald at stufft.io>:
> I think the biggest argument to blocking is that there really exist two sorts of situations that blocking can happen in:
>
> * It blocks for a tiny amount (maybe <1s) and nobody ever notices and people feel like things "just work".
> * It blocks for a long amount of time (possibly forever depending on where in the boot sequence Python is being used) and it hangs for a long time (or forever).
>
> In the second case I think it's pretty obvious that an exception is better than hanging forever, but in the first case an exception might actually cause people to go out of their way to do something bad to "stop the pain". My personal preference is waffling back and forth between them based on which of the two above I feel are more likely to occur in practice.

Maybe I'm wrong, but *starting* to raise BlockingIOError looks like
the opposite direction taken by Python with EINTR (PEP 475).

We had to add try/except InterruptedError in many modules (asyncio,
io, multiprocessing, selectors, socket, socketserver,
subprocess), but it was decided to fix the root issue: retry the
syscall if it failed with EINTR directly in the C code, so you never
have to handle InterruptedError at the Python level anymore.
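
For readers who have not seen it, the kind of hand-written retry loop that
PEP 475 made unnecessary looked roughly like this. It is a sketch of the
general pattern, not code taken from any of the modules listed above, and
the function name is made up.

```python
import os

def read_retrying_on_eintr(fd, num_bytes):
    """Pre-PEP 475 style: retry os.read() by hand when a signal interrupts it."""
    while True:
        try:
            return os.read(fd, num_bytes)
        except InterruptedError:
            # The syscall failed with EINTR: simply try again.
            continue
```

Since Python 3.5 the interpreter does this retry itself, so the except
branch only triggers on older versions.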

For EINTR, it was decided that the common case is to automatically
restart the syscall. The rare case is when the user expects that
the program is really interrupted, and this case requires raising an
exception in the signal handler.

FYI PEP 475 has a minor incompatible change: programs relying on
EINTR with a signal handler not raising a Python exception were
broken by this change. They had to modify their signal handler to
raise an exception. I recall having to fix *one* library and
then..... nothing, nobody complained. I was surprised, I expected that
the "rare" case was more common than that :-)

To come back to urandom: the common case is to wait for random, the
exception is to want to be notified and run special code.

Maybe it's not worth modifying all libraries and applications
for the exceptional case; instead, we could add a special function for it.

In a different thread, I proposed to expose os.getrandom() even if my
PEP (blocking os.urandom) is accepted, because getrandom() provides
features that are not available using os.urandom() alone.

What do you think of making os.urandom() blocking on Linux but also
add os.getrandom() to handle the exceptional case?
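
A sketch of how the exceptional case might be handled with such a function.
It assumes a Python build that provides os.getrandom() and
os.GRND_NONBLOCK (only a proposal at the time of writing; they were later
added in Python 3.6 on Linux). The helper name try_getrandom is made up.

```python
import os

def try_getrandom(num_bytes):
    """Return random bytes, or None if the kernel RNG is not yet seeded."""
    if not hasattr(os, "getrandom"):
        # No getrandom() wrapper on this platform: use os.urandom() instead.
        return os.urandom(num_bytes)
    try:
        # GRND_NONBLOCK asks the kernel to fail instead of waiting.
        return os.getrandom(num_bytes, os.GRND_NONBLOCK)
    except BlockingIOError:
        return None

data = try_getrandom(16)
if data is None:
    print("system RNG not ready yet; run special code here")
```

Ordinary code would keep calling os.urandom() and never see the None case.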

Victor

From barry at python.org  Fri Jun 24 09:38:05 2016
From: barry at python.org (Barry Warsaw)
Date: Fri, 24 Jun 2016 09:38:05 -0400
Subject: [Security-sig] PEP: Make os.urandom() blocking on Linux
 (version 2)
In-Reply-To: <CAMpsgwbHRqFD3M13_EKncQA-LJporzXZAAXGUPQQsOF4=PvsMg@mail.gmail.com>
References: <CAMpsgwbHRqFD3M13_EKncQA-LJporzXZAAXGUPQQsOF4=PvsMg@mail.gmail.com>
Message-ID: <20160624093805.5d1f1893.barry@wooz.org>

On Jun 23, 2016, at 11:27 PM, Victor Stinner wrote:

>Alternative
>===========

>Leave os.urandom() unchanged, add os.getrandom()
>------------------------------------------------
>
>os.urandom() remains unchanged: never block, but it can return weak
>entropy if system urandom is not initialized yet.
>
>A new ``os.getrandom()`` function is added: thin wrapper to the
>``getrandom()`` syscall.
>
>Expected usage to write portable code::
>
>    def my_random(n):
>        if hasattr(os, 'getrandom'):
>            return os.getrandom(n, 0)
>        return os.urandom(n)

I would actually expect that this would be handled in the secrets module, so
the recommendation would be that most users wouldn't use os.urandom() or
os.getrandom() unless they specifically wanted the low-level functions and
knew what they were doing.  Thus, "expected usage to write portable code"
would be to use secrets.token_bytes().
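
In code, that recommendation is a one-liner. The sketch below assumes
Python 3.6+, where the secrets module is available; the function name
new_session_key is purely illustrative.

```python
import secrets

def new_session_key(num_bytes=32):
    """Return num_bytes of random data suitable for use as a secret."""
    return secrets.token_bytes(num_bytes)

key = new_session_key()
```

Which low-level source backs token_bytes() is then a detail of the secrets
module rather than something each application has to choose.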

Other than that, thanks for adding this alternative.

Cheers,
-Barry

From barry at python.org  Fri Jun 24 09:48:00 2016
From: barry at python.org (Barry Warsaw)
Date: Fri, 24 Jun 2016 09:48:00 -0400
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <4282382E-473E-4C83-89D5-BF36B2D8F8D5@stufft.io>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
 <4282382E-473E-4C83-89D5-BF36B2D8F8D5@stufft.io>
Message-ID: <20160624094800.21e5d3ac@subdivisions.wooz.org>

On Jun 23, 2016, at 09:54 AM, Donald Stufft wrote:

>Because projects are likely going to be supporting things other than 3.6 for
>a very long time. The "typical" support matrix for a project on PyPI
>currently looks roughly like 2.6, 2.7, and 3.3+. We're seeing some projects
>dropping 2.6 finally on PyPI but it's still a major source of downloads and
>2.7 itself is still ~86% of downloads initiated by pip across all of
>PyPI. There is the idea of a secrets module back port on PyPI, but without
>adding C code to that it's going to basically just do the same thing as that
>try ... except and if the secrets backport requires C I think you won't get a
>very large uptick since os.urandom exists already and the issues are subtle
>enough that I don't think most people are going to grok them immediately and
>will just automatically avoid a C dependency where they don't immediately see
>the need for one.
>
>Even if we pretend that 3.6+ only is something that's going to happen in
>anything approaching a short timeline, we're still going to be fighting
>against the tide for what the vast bulk of documentation out there states to
>do. So not only do we need to wait it out for pre 3.6 to die out, but we also
>need to wait it out for the copious amounts of third party documentation out
>there telling people to just use os.urandom to die.
>
>And even in the future, once we get to a 3.6+ only world, os.urandom and the
>try .. except shim will still ?work? for all anyone can tell (since the
>failure mode on os.urandom itself is practically silent in every way
>imaginable) so unless they already know about this issue and go out of their
>way to switch over to the secrets module, they?re likely to continue using
>something in the os module for a long time.
>
>IOW, I think secrets is great, but I think it mostly helps new code written
>targeting 3.6+ only, rather than being a solution for the vast bulk of
>software already out there or which doesn?t yet exist but is going to support
>older things than 3.6.

The proposed os.urandom() change is only going into Python 3.6, so older
Python users will still be "vulnerable" to the problem until they upgrade.
And without a backported secrets module, they won't have any way to benefit
from the entropy guarantees until they upgrade.

If secrets is backported and available in PyPI, then we can start immediately
changing the os.urandom() meme to something more secure.  Sure it takes a long
time to change minds, but I still think it's better to give users a blessed,
near universally agreed upon, secure alternative immediately.

Cheers,
-Barry

From barry at python.org  Fri Jun 24 10:01:27 2016
From: barry at python.org (Barry Warsaw)
Date: Fri, 24 Jun 2016 10:01:27 -0400
Subject: [Security-sig] Policy PEP (was Re: RFC: PEP: Make os.urandom()
 blocking on Linux)
In-Reply-To: <CAMpsgwZzBD8nq4cuiq2wnpYQZvU=cPPEFDCxFveJD1KZZZgESg@mail.gmail.com>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
 <CAMpsgwZzBD8nq4cuiq2wnpYQZvU=cPPEFDCxFveJD1KZZZgESg@mail.gmail.com>
Message-ID: <20160624100127.3d3cdd75@subdivisions.wooz.org>

On Jun 24, 2016, at 12:11 AM, Victor Stinner wrote:

>Once we have modified Python 3.6 to correctly handle "the bug" and we
>consider that the implementation is tested enough, I suggest
>backporting it to Python 2.7 as well. Moreover, I would also suggest
>backporting the change to Python 3.5; I would be sad if Python 2 were more
>secure than the latest Python 3 release :-)

This is the fundamental point of disagreement, and I think it points again to
a deficiency in our process.  Regardless of outcome of this specific case, I
think we should try to tighten up our definitions and codify our policy in an
informational PEP.

What criteria do we use to classify an issue as a security bug requiring a
fix, with backports, overriding any backward compatibility breaks?

I think we've been largely ad-hoc about this question.

One thing I think such an informational PEP must require is a rationale as to
why the issue is being classified as a security bug, a backporting rationale
and plan, and a "Backwards Compatibility Impact Assessment", which I'm very
glad to see in PEP 522.

Cheers,
-Barry

From ncoghlan at gmail.com  Fri Jun 24 16:05:40 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 24 Jun 2016 13:05:40 -0700
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <CAMpsgwa6CTiEWLuhN8fKuX8n21wDJJ1=04WaH0RouQDG2gvgqg@mail.gmail.com>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
 <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
 <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
 <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>
 <CAMpsgwa6CTiEWLuhN8fKuX8n21wDJJ1=04WaH0RouQDG2gvgqg@mail.gmail.com>
Message-ID: <CADiSq7d1t=XYuvbNj+rFAyKccciBU2PL3jZZne6LdCn6K5q=5A@mail.gmail.com>

On 24 June 2016 at 04:34, Victor Stinner <victor.stinner at gmail.com> wrote:
> 2016-06-24 2:46 GMT+02:00 Donald Stufft <donald at stufft.io>:
>> I think the biggest argument to blocking is that there really exist two sorts of situations that blocking can happen in:
>>
>> * It blocks for a tiny amount (maybe <1s) and nobody ever notices and people feel like things "just work".
>> * It blocks for a long amount of time (possibly forever depending on where in the boot sequence Python is being used) and it hangs for a long time (or forever).
>>
>> In the second case I think it's pretty obvious that an exception is better than hanging forever, but in the first case an exception might actually cause people to go out of their way to do something bad to "stop the pain". My personal preference is waffling back and forth between them based on which of the two above I feel are more likely to occur in practice.
>
> Maybe I'm wrong, but *starting* to raise BlockingIOError looks like
> the opposite direction taken by Python with EINTR (PEP 475).

The difference I see here is that EINTR really can happen at any time,
while the transition from "system RNG is not ready" to "system RNG is
ready" is a once-per-boot deal (and in most cases, the operating
system itself handles making sure the RNG is initialised before it
starts running userspace processes).

As such, the idioms I currently have in PEP 522 are wrong - the "wait
for the system RNG or not" decision wouldn't be one to be made on a
per-call basis, but rather on a per-__main__ execution basis, with
developers choosing which user experience they want to support on
systems with a non-blocking /dev/urandom:

* this application will fail if you run it before the system RNG is
ready (so you may need to add "ExecStartPre=python3 -c 'import
secrets; secrets.wait_for_system_rng()'" in your systemd unit file)
* this application implicitly calls "secrets.wait_for_system_rng()"
and hence may block waiting for the system RNG if you run it before
the system RNG is ready

The default state of Python 3.6+ applications would be the first one,
and I think that's an entirely reasonable default - if you're writing
userspace code that runs before the system RNG is ready, you're out of
the world of normal software development and into the world of
operating system developers, system integrators and embedded system
designers.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at gmail.com  Fri Jun 24 18:48:08 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 25 Jun 2016 00:48:08 +0200
Subject: [Security-sig] PEP: Make os.urandom() blocking on Linux
 (version 2)
In-Reply-To: <20160624093805.5d1f1893.barry@wooz.org>
References: <CAMpsgwbHRqFD3M13_EKncQA-LJporzXZAAXGUPQQsOF4=PvsMg@mail.gmail.com>
 <20160624093805.5d1f1893.barry@wooz.org>
Message-ID: <CAMpsgwb3e2GYaubaVw7fL+ZS4o-kg_DTiZjV5DrQOYMcXyutyQ@mail.gmail.com>

2016-06-24 15:38 GMT+02:00 Barry Warsaw <barry at python.org>:
>>Expected usage to write portable code::
>>
>>    def my_random(n):
>>        if hasattr(os, 'getrandom'):
>>            return os.getrandom(n, 0)
>>        return os.urandom(n)
>
> I would actually expect that this would be handled in the secrets module, so
> the recommendation would be that most users wouldn't use os.urandom() or
> os.getrandom() unless they specifically wanted the low-level functions and
> knew what they were doing.  Thus, "expected usage to write portable code"
> would be to use secrets.token_bytes().

Oh ok. I will update this section.
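
A minimal sketch of what that portable recommendation could look like,
assuming the secrets module is importable on Python 3.6+ and that older
versions fall back to os.urandom() (the "try ... except" shim mentioned
earlier in the thread). The helper name portable_token_bytes is only
illustrative and not part of either PEP, and the fallback inherits
whatever os.urandom() does on the running Python:

```python
import os

try:
    # Python 3.6+: the high-level API recommended in this thread.
    from secrets import token_bytes as _token_bytes
except ImportError:
    # Older Pythons: fall back to os.urandom(), with no extra guarantees.
    _token_bytes = os.urandom


def portable_token_bytes(nbytes=32):
    """Return nbytes random bytes from the best source available.

    Illustrative helper only; the name is an assumption of this sketch.
    """
    return _token_bytes(nbytes)
```

Callers would then use portable_token_bytes(16) instead of calling
os.urandom(16) or os.getrandom(16, 0) directly.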

Victor

From victor.stinner at gmail.com  Fri Jun 24 19:21:36 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 25 Jun 2016 01:21:36 +0200
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <CADiSq7d1t=XYuvbNj+rFAyKccciBU2PL3jZZne6LdCn6K5q=5A@mail.gmail.com>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
 <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
 <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
 <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>
 <CAMpsgwa6CTiEWLuhN8fKuX8n21wDJJ1=04WaH0RouQDG2gvgqg@mail.gmail.com>
 <CADiSq7d1t=XYuvbNj+rFAyKccciBU2PL3jZZne6LdCn6K5q=5A@mail.gmail.com>
Message-ID: <CAMpsgwai6sfRtzzF-mNhH3ODOv-vQD+zL8LyjwRN=UwH1pgAOg@mail.gmail.com>

2016-06-24 22:05 GMT+02:00 Nick Coghlan <ncoghlan at gmail.com>:
> As such, the idioms I currently have in PEP 522 are wrong - the "wait
> for the system RNG or not" decision wouldn't be one to be made on a
> per-call basis, but rather on a per-__main__ execution basis, with
> developers choosing which user experience they want to support on
> systems with a non-blocking /dev/urandom:
>
> * this application will fail if you run it before the system RNG is
> ready (so you may need to add "ExecStartPre=python3 -c 'import
> secrets; secrets.wait_for_system_rng()'" in your systemd unit file)

In short, if an application is not run using systemd but directly on
the command line, it *can* fail with a fatal BlockingIOError?

Wait, I don't think that this is acceptable behaviour from the user's
point of view.

Compared to Python 2.7, Python 3.4 and Python 3.5.2, where os.urandom()
never blocks nor raises an exception on Linux, such a behaviour change
can be seen as a major regression.


> * this application implicitly calls "secrets.wait_for_system_rng()"
> and hence may block waiting for the system RNG if you run it before
> the system RNG is ready

It's hard to guess if os.urandom() is used in a third-party library.
Maybe it's not. What if a new library version starts to use
os.urandom()? Should you start to call secrets.wait_for_system_rng()?

To be safe, I expect that *all* applications should start with
secrets.wait_for_system_rng()... It doesn't make sense to have to put
such code in *all* applications.

The main advantage of PEP 522 is to control how the "system
urandom not initialized yet" case is handled. But you are more and
more saying that secrets.wait_for_system_rng() should be used to not
get BlockingIOError in most cases. Am I wrong?

I expect that some libraries will start to use
secrets.wait_for_system_rng() in their own code.

... In the end, it looks like you basically reimplemented a blocking
os.urandom(), no?

--

Why do we have to bother *all* users with
secrets.wait_for_system_rng(), while only a very few will really care
about the exceptional case?

Why not add something for users who want to handle the exceptional
case, but make os.urandom() blocking?

Sorry, I'm repeating myself, but as I wrote, I don't know yet what is
the best option, so I'm "testing" each option.

Victor

From victor.stinner at gmail.com  Fri Jun 24 19:26:19 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 25 Jun 2016 01:26:19 +0200
Subject: [Security-sig] Policy PEP (was Re: RFC: PEP: Make os.urandom()
 blocking on Linux)
In-Reply-To: <20160624100127.3d3cdd75@subdivisions.wooz.org>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
 <CAMpsgwZzBD8nq4cuiq2wnpYQZvU=cPPEFDCxFveJD1KZZZgESg@mail.gmail.com>
 <20160624100127.3d3cdd75@subdivisions.wooz.org>
Message-ID: <CAMpsgwbPs7X4hdtY17S1VaL5ofPMcG8YH1bfxzdQTGzMQtdXfA@mail.gmail.com>

2016-06-24 16:01 GMT+02:00 Barry Warsaw <barry at python.org>:
> One thing I think such an informational PEP must require is a rationale as to
> why the issue is being classified as a security bug, a backporting rationale
> and plan, and a "Backwards Compatibility Impact Assessment", which I'm very
> glad to see in PEP 522.

Sorry, I haven't had time yet to think about Python 2.7 and Python
3.5. But it looks like my PEP (make os.urandom() blocking) and Nick's
PEP 522 (os.urandom() can raise BlockingIOError) both introduce a
backward incompatible change. Applications which worked well on Python
3.5 may block/fail with these changes.

I'm not sure that it's worth it to enhance Python 2.7 or 3.5. IMO
the discussed changes make Python more secure, but they don't really fix a
critical vulnerability.

I don't think that it's a security vulnerability. I prefer to qualify
it as an enhancement, security "hardening" if you prefer.

Victor

From ncoghlan at gmail.com  Fri Jun 24 19:30:40 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 24 Jun 2016 16:30:40 -0700
Subject: [Security-sig] RFC: PEP: Make os.urandom() blocking on Linux
In-Reply-To: <20160624094800.21e5d3ac@subdivisions.wooz.org>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
 <4282382E-473E-4C83-89D5-BF36B2D8F8D5@stufft.io>
 <20160624094800.21e5d3ac@subdivisions.wooz.org>
Message-ID: <CADiSq7d2u1Q===hbLPFLE5GgyRvfm0+vykubnS0GmYtcq5izHw@mail.gmail.com>

On 24 June 2016 at 06:48, Barry Warsaw <barry at python.org> wrote:
>
> If secrets is backported and available in PyPI, then we can start immediately
> changing the os.urandom() meme to something more secure.  Sure it takes a long
> time to change minds, but I still think it's better to give users a blessed,
> near universally agreed upon, secure alternative immediately.

It's not that simple, as secrets relies on the os module to provide
access to the getrandom() syscall (by way of an upgraded os.urandom).
Nothing changes from a security perspective without that additional
level of access to the underlying operating system capabilities. You
could potentially go the ctypes route in a PyPI module, but the
performance would be abysmal, so nobody would use it.

Going for a custom C extension doesn't really work either - you can't
use manylinux1 for it (as the baseline glibc ABI is way too old to
include getrandom), and nobody's going to want to introduce an install
time compiler dependency just to address this relatively obscure
concern.

Even if those problems could be resolved, it isn't really a problem
where I'd advocate for a "standard library only" project to add an
external dependency to address it - if they're going to do that, I'd
instead advocate for them to stop reinventing the wheel, and instead
reach for a third party library that solves their *actual* problem
(like cryptography, passlib, or one of the web frameworks).

This is why I think it makes sense to focus the immediate discussion
on "Given getrandom() as an operating system API, can we improve the
semantics of Python's os.urandom()?". The wider discussion around "How
do we educate Python developers on the difference between simulated
uncertainty and sensitive secrets?" that motivated the introduction of
the secrets module isn't really applicable.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Fri Jun 24 20:07:37 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 24 Jun 2016 17:07:37 -0700
Subject: [Security-sig] PEP 522: Allow BlockingIOError in security
 sensitive APIs on Linux
In-Reply-To: <CAMpsgwai6sfRtzzF-mNhH3ODOv-vQD+zL8LyjwRN=UwH1pgAOg@mail.gmail.com>
References: <CADiSq7dCVmHDo9KY+0iMgOKdsVQp6ZjQ3D=YbXZT2GHG81t2TA@mail.gmail.com>
 <CAMpsgwbRSgBON5-fEQem8tqg71ByEnO5_U7ACS1xSPY=AUuYFQ@mail.gmail.com>
 <CADiSq7fOp6h55TyUNcj2CCeSnp60Jztt7ui=ZPqPD6iJMiq97w@mail.gmail.com>
 <866815ED-76B5-4E0C-95CB-265936D97C2A@stufft.io>
 <CAMpsgwa6CTiEWLuhN8fKuX8n21wDJJ1=04WaH0RouQDG2gvgqg@mail.gmail.com>
 <CADiSq7d1t=XYuvbNj+rFAyKccciBU2PL3jZZne6LdCn6K5q=5A@mail.gmail.com>
 <CAMpsgwai6sfRtzzF-mNhH3ODOv-vQD+zL8LyjwRN=UwH1pgAOg@mail.gmail.com>
Message-ID: <CADiSq7e+XtH+nGEtzGWVVAwq6dpTmF8y2Ve_r8BMFPdazpohng@mail.gmail.com>

On 24 June 2016 at 16:21, Victor Stinner <victor.stinner at gmail.com> wrote:
> 2016-06-24 22:05 GMT+02:00 Nick Coghlan <ncoghlan at gmail.com>:
>> As such, the idioms I currently have in PEP 522 are wrong - the "wait
>> for the system RNG or not" decision wouldn't be one to be made on a
>> per-call basis, but rather on a per-__main__ execution basis, with
>> developers choosing which user experience they want to support on
>> systems with a non-blocking /dev/urandom:
>>
>> * this application will fail if you run it before the system RNG is
>> ready (so you may need to add "ExecStartPre=python3 -c 'import
>> secrets; secrets.wait_for_system_rng()'" in your systemd unit file)
>
> In short, if an application is not run using systemd but directly on
> the command line, it *can* fail with a fatal BlockingIOError?

From the command line, the answer is equally simple: just run "python3
-c 'import secrets; secrets.wait_for_system_rng()'" before the command
you actually care about.

As an added bonus, that will work even if the command you care about
isn't written in Python 3, and even if it reads from /dev/urandom
rather than using the new syscall.

> Wait, I don't think that this is acceptable behaviour from the user's
> point of view.
>
> Compared to Python 2.7, Python 3.4 and Python 3.5.2, where os.urandom()
> never blocks nor raises an exception on Linux, such a behaviour change
> can be seen as a major regression.

The *only* way to get it to block (your PEP) or raise an exception
(PEP 522) is to call os.urandom() (directly or indirectly) when the
kernel RNG isn't ready - I consider the relevant analogy to be to PEP
476, where we turned the silent security failure of accepting an
invalid or untrusted certificate (or one that didn't cover the named
host) into the noisy error of failing to make the connection.

>> * this application implicitly calls "secrets.wait_for_system_rng()"
>> and hence may block waiting for the system RNG if you run it before
>> the system RNG is ready
>
> It's hard to guess if os.urandom() is used in a third-party library.
> Maybe it's not. What if a new library version starts to use
> os.urandom()? Should you start to call secrets.wait_for_system_rng()?
>
> To be safe, I expect that *all* applications should start with
> secrets.wait_for_system_rng()... It doesn't make sense to have to put
> such code in *all* applications.

Application developers porting to Python 3.6 can wait and see what
their own testing reports and what their users report - they don't
need to guess.

> The main advantage of PEP 522 is to control how the "system
> urandom not initialized yet" case is handled. But you are more and
> more saying that secrets.wait_for_system_rng() should be used to not
> get BlockingIOError in most cases. Am I wrong?

I'm saying I think it's an application level decision, not a library
level decision.

> I expect that some libraries will start to use
> secrets.wait_for_system_rng() in their own code.
>
> ... In the end, it looks like you basically reimplemented a blocking
> os.urandom(), no?

Potentially, but one of the important aspects of PEP 522 is that we're
not imposing that outcome by fiat - we're letting developers choose
the behaviour they want on a case by case basis, and seeing what the
emergent consensus on correct behaviour turns out to be.

It's equally possible that the outcome will be that both Python and
Linux developers conclude that this is an operating system integration
issue, so systemd ends up adding a standard "kernelrng" target that
components can wait for, and that then gets included as a requirement
for getting to the singleuser state on most distros.

If we *do* reach a point where "always call
secrets.wait_for_system_rng() before using secrets,
random.SystemRandom or os.urandom" is the idiomatic advice for
Pythonistas, *then* we can make os.urandom() blocking, and
secrets.wait_for_system_rng() would be reduced to:

    def wait_for_system_rng():
        os.urandom(1)

> --
>
> Why do we have to bother *all* users with
> secrets.wait_for_system_rng(), while only a very few will really care
> about the exceptional case?

We don't - only the ones that actually get the exception, since
they're necessarily the ones the problem is relevant to. Runtime
system configuration related exceptions aren't something to be avoided
at all costs - if they were, we'd never have made the changes we did
to the way Unicode handling works.

A good example of this at the library level is Armin Ronacher's click
command line helper - when you run that in the C locale under Python
3, it just fails immediately, since the actual problem is that
something has gone wrong and your system locale isn't configured
properly. The right answer is almost always to fix the locale
configuration settings, not to change anything in the Python code.

> Why not add something for users who want to handle the exceptional
> case, but make os.urandom() blocking?

The main problem I have with the blocking solution is that if someone
hits it unexpectedly, they're left staring at a blinking cursor (at
best), and no helpful hints to get started on debugging the problem.
If it's a component they didn't write, they also can't really give a
good bug report beyond "It hangs when I try to run it".

By contrast, PEP 522 gives them an immediate exception and error
message: "BlockingIOError: system random number generator is not
ready".

If they're a developer themselves, they can plug that into Google and
hopefully find a relevant answer (which we can virtually guarantee by
preseeding Stack Overflow with a suitable response).

If they're *not* the application developer, they can paste the
traceback into a bug report or support ticket and say "Hey, what's
going on here?". At which point, the developer or support tech
handling the ticket can do the appropriate Google search and respond
accordingly.

Now, we could gain most of those debuggability benefits for a blocking
solution by trying in non-blocking mode first, then falling back to
blocking only if we get EAGAIN - that would let us print a
Google-friendly warning message before we implicitly block.

That's where the argument of adopting a consistent approach of "try
non-blocking first, then maybe fall back to something else if it
doesn't work" comes into play - if os.urandom() (and hence indirectly
the secrets module) is trying in non-blocking mode and falling back to
an alternative, *and* SipHash initialisation is doing that, *and*
importing the random module is doing that, it sends a strong message
to me that the base primitive here is actually "try to read the system
RNG, and maybe fail to do so", rather than "read the system RNG and
only return when the requested data is available"
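
A rough sketch of that "try non-blocking first, warn, then block" idea,
assuming an os.getrandom() wrapper with an os.GRND_NONBLOCK flag is
available (as on Python 3.6+ on Linux). The function name
read_system_rng and the warning text are assumptions of this sketch,
not something either PEP specifies:

```python
import os
import warnings


def read_system_rng(nbytes):
    """Read nbytes from the system RNG, warning before any implicit block.

    Intended for small requests (up to 256 bytes), where getrandom()
    does not return short reads once the RNG is initialised.
    """
    if not hasattr(os, "getrandom"):
        # No getrandom() wrapper on this platform or Python version.
        return os.urandom(nbytes)
    try:
        # Non-blocking attempt: EAGAIN surfaces as BlockingIOError.
        return os.getrandom(nbytes, os.GRND_NONBLOCK)
    except BlockingIOError:
        # The kernel RNG is not initialised yet: say so, then block.
        warnings.warn(
            "system random number generator is not ready; "
            "blocking until it is initialised",
            RuntimeWarning,
        )
        return os.getrandom(nbytes)  # flags=0 blocks until ready
    except OSError:
        # e.g. ENOSYS when the running kernel lacks the syscall.
        return os.urandom(nbytes)
```

The warning gives users something searchable before the process appears
to hang, which is the debuggability benefit described above.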

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sat Jun 25 14:17:13 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 25 Jun 2016 11:17:13 -0700
Subject: [Security-sig] Policy PEP (was Re: RFC: PEP: Make os.urandom()
 blocking on Linux)
In-Reply-To: <20160624100127.3d3cdd75@subdivisions.wooz.org>
References: <CAMpsgwa2v54pjPtLohCyGUucNBHVMds8VSTsE_VpBvgw5FMh8w@mail.gmail.com>
 <20160621185709.3ab50572.barry@wooz.org>
 <5A6C6F76-6196-4AF8-9971-702794D4908F@lukasa.co.uk>
 <20160622203515.12e20601@anarchist.wooz.org>
 <CADiSq7esOnsZjCVTUj6jG+R=X0r_SnXp2aZT-EscjS=e13xo_A@mail.gmail.com>
 <20160623084843.5bbfe3bf@anarchist.wooz.org>
 <CAMpsgwZzBD8nq4cuiq2wnpYQZvU=cPPEFDCxFveJD1KZZZgESg@mail.gmail.com>
 <20160624100127.3d3cdd75@subdivisions.wooz.org>
Message-ID: <CADiSq7favUc6-smUnesq=1E69tZvf2jVQdhMPL5s59A0GiUiww@mail.gmail.com>

On 24 June 2016 at 07:01, Barry Warsaw <barry at python.org> wrote:
> On Jun 24, 2016, at 12:11 AM, Victor Stinner wrote:
>
>>Once we have modified Python 3.6 to correctly handle "the bug" and we
>>consider that the implementation is tested enough, I suggest
>>backporting it to Python 2.7 as well. Moreover, I would also suggest
>>backporting the change to Python 3.5; I would be sad if Python 2 were more
>>secure than the latest Python 3 release :-)
>
> This is the fundamental point of disagreement, and I think it points again to
> a deficiency in our process.  Regardless of outcome of this specific case, I
> think we should try to tighten up our definitions and codify our policy in an
> informational PEP.
>
> What criteria do we use to classify an issue as a security bug requiring a
> fix, with backports, overriding any backward compatibility breaks?
>
> I think we've been largely ad-hoc about this question.

PEP 466 aimed to answer it:
https://www.python.org/dev/peps/pep-0466/#why-these-particular-changes

The most significant sentence in that section is this one: "The key
requirement for a feature to be considered for inclusion in this
proposal was that it must have security implications beyond the
specific application that is written in Python and the system that
application is running on."

Earlier drafts of the PEP did aim to define that as a standard policy,
but Guido nixed that idea, instead requesting that every such security
related backport proposal receive its own dedicated PEP.

For PEP 466, the limitations of the Python 2.7 standard library were
holding back the evolution of network security in general (e.g. by
acting as a brake on the adoption of Server-Name-Indication and on
servers forcing TLS-only secure connections). For PEP 476, the
mismatch between how people assumed the standard library handled HTTPS
connections and how it actually did handle them was causing real
security vulnerabilities in networked applications.

For the PEPs currently under consideration, I don't think the
situation is as critical as that - we're talking about a rare
situation specific to secret generation on Linux with poorly
configured entropy sources, not the core handling of SSL/TLS and HTTPS
used by a large proportion of networked applications. That means I
believe "Folks that genuinely care about secure secret generation
should upgrade to Python 3.6 and a Linux kernel with getrandom()
support" is an entirely reasonable position for us to take in this
case.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sat Jun 25 16:21:53 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 25 Jun 2016 13:21:53 -0700
Subject: [Security-sig] getrandom() syscalls and Python runtime binary
 portability
Message-ID: <CADiSq7douAXNQAbbPTmXmAxNu8K9=LdhvSkjRGsxAMGUKt=ETg@mail.gmail.com>

Hi folks,

Working on an update to PEP 522, I realised while poking around in
sysconfig for the HAVE_GETRANDOM_SYSCALL flag that only checking for
whether or not the syscall had been available at buildtime would be
potentially problematic - it means that a Python built against a newer
Linux kernel (e.g. Ubuntu 16.04, Fedora 24) may do the wrong thing
when run on an older kernel that hasn't had the new syscall backported
(e.g. Ubuntu 14.04, RHEL 7.2, CentOS 7.1511). That's something that
can easily happen with containers, or any other case of bundling the
language runtime with the application executable.

The actual code behind os.urandom already deals with this case
correctly (see the ENOSYS reference in py_getrandom at [1]), but it
means there really is no way for pure Python code running against an
older kernel to tell whether a successful os.urandom() call was
because the system RNG was ready or because the kernel is old.

So regardless of whether we go with the blocking-by-default or
raise-BlockingIOError strategy, we should also define what we want the
interpreter to do in the ENOSYS case (for PEP 522, I wanted to warn
about it in the new secrets.wait_for_system_rng() function, but at
least for now I'm going to settle for letting the SipHash
initialisation warn about it, the same way it would for a lack of
entropy)

This is actually the best argument I've seen so far for exposing
os.getrandom() directly: unlike os.urandom(), we could allow a new
os.getrandom() API to raise NotImplementedError if the running kernel
doesn't provide the getrandom() syscall.

Having such an API available would then let
secrets.wait_for_system_rng() more reliably check whether or not the
system RNG was ready before falling back on a potentially blocking
read of /dev/random.
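
As a sketch of how that check could look: the function below is a
hypothetical stand-in for the proposed secrets.wait_for_system_rng(),
not its actual implementation. It assumes an os.getrandom() wrapper
exists and that a missing syscall surfaces as OSError with errno ENOSYS
(rather than the NotImplementedError suggested above); the /dev/random
fallback is the potentially blocking read mentioned in the previous
paragraph:

```python
import errno
import os


def wait_for_system_rng():
    """Block until the system RNG is ready (hypothetical sketch)."""
    if hasattr(os, "getrandom"):
        try:
            # flags=0: blocks until the kernel RNG is initialised.
            os.getrandom(1)
            return
        except OSError as exc:
            if exc.errno != errno.ENOSYS:
                raise
            # Running kernel lacks the syscall: fall through.
    # Potentially blocking read; unbuffered so only one byte is consumed.
    with open("/dev/random", "rb", buffering=0) as dev_random:
        dev_random.read(1)
```

The runtime ENOSYS check is what distinguishes "the RNG is ready" from
"the kernel is too old to tell", which a build-time flag alone cannot do.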

Cheers,
Nick.

[1] https://hg.python.org/cpython/file/default/Python/random.c#l119

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sat Jun 25 16:36:55 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 25 Jun 2016 13:36:55 -0700
Subject: [Security-sig] Does the buildtime HAVE_GETRANDOM_SYSCALL check
 actually make sense?
Message-ID: <CADiSq7fwhvofS6ss=adVxr8E=NgwOB1yh0RRARbkmZstbtoR9Q@mail.gmail.com>

I think I've found a better explanation than mere timing or
coincidence for why we haven't had any problem reports regarding
Python 3.5.1 blocking in Fedora or in the Software Collections builds
for RHEL and CentOS: those builds currently aren't even trying to use
the new syscall, and are instead always using the older non-blocking
/dev/urandom behaviour.

My assumption in https://bugzilla.redhat.com/show_bug.cgi?id=1350123
is that this is due to those binaries being built against a version of
the kernel that doesn't have that syscall defined, which means the
config script doesn't define HAVE_GETRANDOM_SYSCALL, which means we
compile out the code that tries calling it at runtime.

Does skipping trying the new syscall at runtime just because the build
server is running an older kernel actually make sense? Or would we be
better off defining a different TRY_GETRANDOM_SYSCALL that looks for
some other indicator that this is a build for a platform where
getrandom() might be available at runtime, even if it's not available
at build time?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sat Jun 25 19:20:00 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 25 Jun 2016 16:20:00 -0700
Subject: [Security-sig] PEP 522 (v2): Allow BlockingIOError in security
 sensitive APIs
Message-ID: <CADiSq7dsFj1EPJebU3n_DTW_axBqRjxbPU_Gtg04jO5X4YiACw@mail.gmail.com>

I've posted a new version of PEP 522 that adds
secrets.wait_for_system_rng(). Full version inline below for ease of
quoting, but if you just want to see the diff:
https://github.com/python/peps/commit/ace74fca2a9a0cf9a01d38247b15d37b7c0e76a0

The major flow-on change from that is simplifying the "What to do?"
recommendations, which are now simply:

- security sensitive applications should either let the exception fly
or call secrets.wait_for_system_rng()
- non-security sensitive applications should switch to using the random module
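
For the second recommendation, a minimal sketch of what "switch to the
random module" might look like for code that currently calls
os.urandom() without needing secrecy. The helper name simulation_bytes
is an assumption of this sketch; its output is predictable and must not
be used for keys or tokens:

```python
import random


def simulation_bytes(nbytes, seed=None):
    """Return nbytes (>= 1) pseudo-random bytes; NOT suitable for secrets."""
    rng = random.Random(seed)
    # getrandbits() yields an int with 8 * nbytes random bits.
    return rng.getrandbits(8 * nbytes).to_bytes(nbytes, "little")
```

Passing a seed makes the output reproducible, which is often what
simulations and tests want anyway.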

I've also tried to document the impact on other operating systems that
also offer the getrandom() syscall, and explicitly referenced the fact
that the likely reason we haven't heard anything from the
Fedora/RHEL/CentOS worlds is because those builds haven't enabled the
new behaviour due to the way their respective build systems work.

Cheers,
Nick.

======================================
PEP: 522
Title: Allow BlockingIOError in security sensitive APIs
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan at gmail.com>, Nathaniel J. Smith <njs at pobox.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Requires: 506
Created: 16 June 2016
Python-Version: 3.6


Abstract
========

A number of APIs in the standard library that return random values nominally
suitable for use in security sensitive operations currently have an obscure
operating system dependent failure mode that allows them to return values that
are not, in fact, suitable for such operations.

This is due to some operating system kernels (most notably the Linux kernel)
permitting reads from ``/dev/urandom`` before the system random number
generator is fully initialized, whereas most other operating systems will
implicitly block on such reads until the random number generator is ready.

This PEP proposes changing such failures in Python 3.6 from the current silent,
hard to detect, and hard to debug, errors to easily detected and debugged errors
by raising ``BlockingIOError`` with a suitable error message, allowing
developers the opportunity to unambiguously specify their preferred approach
for handling the situation.

This change will impact any operating system that offers the ``getrandom()``
system call, regardless of whether the default behaviour of the
``/dev/urandom`` device is to return potentially predictable results when the
system random number generator is not ready (e.g. Linux, NetBSD) or to block
(e.g. FreeBSD, Solaris, Illumos). Operating systems that prevent execution of
userspace code prior to the initialization of the system random number
generator, or do not offer the ``getrandom()`` syscall, will be entirely
unaffected by the proposed change (e.g. Windows, Mac OS X, OpenBSD).

The APIs affected by this change would be:

* ``os.urandom``
* ``random.SystemRandom``
* the new ``secrets`` module added by PEP 506

A new ``secrets.wait_for_system_rng()`` API would be added to allow affected
applications, frameworks and scripts that encounter the new exception to
readily opt-in to blocking behaviour if they so choose, either by modifying
the affected application directly, or by running a preceding
``python3 -c "import secrets; secrets.wait_for_system_rng()"`` command.

The new exception would potentially be encountered in the following situations:

* Python code calling these APIs during Linux system initialization
* Python code running on improperly initialized Linux systems (e.g. embedded
  hardware without adequate sources of entropy to seed the system random number
  generator, or Linux VMs that aren't configured to accept entropy from the
  VM host)

CPython interpreter initialization and ``random`` module initialization would
also be updated to gracefully fall back to alternative seeding options if the
system random number generator is not ready.


Relationship with other PEPs
============================

This PEP depends on the Accepted PEP 506, which adds the ``secrets`` module.

This PEP competes with Victor Stinner's `currently unnumbered proposal
<http://haypo-notes.readthedocs.io/pep_random.html>`_ to make
``os.urandom`` implicitly block when the system RNG is not ready.


Proposal
========

Changing ``os.urandom()`` on platforms with the getrandom() system call
-----------------------------------------------------------------------

This PEP proposes that in Python 3.6+, ``os.urandom()`` be updated to call
the ``getrandom()`` syscall in non-blocking mode if available and
raise ``BlockingIOError: system random number generator is not ready`` if
the kernel reports that the call would block.

This behaviour will then propagate through to higher level standard library
APIs that depend on ``os.urandom`` (specifically ``random.SystemRandom`` and
the new ``secrets`` module introduced by PEP 506).

In all cases, as soon as a call to one of these security sensitive APIs
succeeds, all future calls to these APIs in that process will succeed (once
the operating system random number generator is ready after system boot, it
remains ready).

On Linux and NetBSD, this will replace the previous behaviour of returning
potentially predictable results read from ``/dev/urandom``.

On FreeBSD, Solaris, and Illumos, this will replace the previous behaviour of
implicitly blocking until the system random number generator is ready. However,
it is not clear if these operating systems actually allow userspace code (and
hence Python) to run before the system random number generator is ready.

Note that in all cases, if calling the ``getrandom()`` API reports ``ENOSYS``
rather than returning a successful response or reporting ``EAGAIN``, CPython
will continue to fall back to reading from ``/dev/urandom`` directly.
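
The behaviour described in this section can be approximated in pure Python
as a rough sketch. It assumes a POSIX system and an interpreter that exposes
``os.getrandom()`` and ``os.GRND_NONBLOCK``; the name ``urandom_or_raise`` is
hypothetical and is not part of the proposal, which would change the C
implementation of ``os.urandom()`` itself.

```python
import errno
import os


def urandom_or_raise(num_bytes):
    """Illustrative stand-in for the proposed os.urandom() behaviour."""
    # os.getrandom is absent on platforms without the syscall wrapper
    getrandom = getattr(os, "getrandom", None)
    if getrandom is not None:
        try:
            data = b""
            while len(data) < num_bytes:
                # In non-blocking mode the EAGAIN case surfaces as
                # BlockingIOError, which is allowed to propagate.
                data += getrandom(num_bytes - len(data), os.GRND_NONBLOCK)
            return data
        except OSError as exc:
            # Only ENOSYS (kernel lacks the syscall) triggers the fallback
            if exc.errno != errno.ENOSYS:
                raise
    # No usable getrandom(): read the device directly, as before
    with open("/dev/urandom", "rb") as dev:
        return dev.read(num_bytes)
```

A caller that wants the blocking behaviour would catch ``BlockingIOError``
around the first call and wait before retrying; every later call in the same
process then succeeds.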


Adding ``secrets.wait_for_system_rng()``
----------------------------------------

A new exception shouldn't be added without a straightforward recommendation
for how to resolve that error when encountered (however rare encountering
the new error is expected to be in practice). For security sensitive code that
actually does need to use the system random number generator, and does receive
live bug reports indicating this is a real problem for the userbase of that
particular application rather than a theoretical one, this PEP's recommendation
will be to add the following snippet (directly or indirectly) to the
``__main__`` module::

    import secrets
    secrets.wait_for_system_rng()

Or, if compatibility with versions prior to Python 3.6 is needed::

    try:
        import secrets
    except ImportError:
        pass
    else:
        secrets.wait_for_system_rng()

Application frameworks covering use cases where access to the system random
number generator is almost certain to be needed (e.g. web frameworks) may
choose to incorporate this step implicitly into the commands that start the
application.

For cases where the error is encountered for an application which cannot be
modified directly, then the following command can be used to wait for the
system random number generator to initialize before starting that application::

    python3 -c "import secrets; secrets.wait_for_system_rng()"

For example, this snippet could be added to a shell script or a systemd
``ExecStartPre`` hook (and may prove useful in reliably waiting for the
system random number generator to be ready, even if the subsequent command
is not itself an application running under Python 3.6).

Given the changes proposed to ``os.urandom()`` above, the suggested
implementation of this function would be::

    import os
    import time

    def wait_for_system_rng():
        """Block waiting for system random number generator to be ready"""
        # If the system RNG is already seeded, don't wait at all
        try:
            os.urandom(1)
            return
        except BlockingIOError:
            pass
        # Avoid the below busy loop if possible
        try:
            block_on_system_rng = open("/dev/random", "rb")
        except FileNotFoundError:
            pass
        else:
            with block_on_system_rng:
                block_on_system_rng.read(1)
        # Busy loop until the system RNG is ready
        while True:
            try:
                os.urandom(1)
                break
            except BlockingIOError:
                # Only check once per millisecond
                time.sleep(0.001)

On systems where it is possible to wait for the system RNG to be ready, this
function will do so without a busy loop if either ``os.urandom()``
itself implicitly blocks, or the ``/dev/random`` device is available. If the
system random number generator is ready, this call is guaranteed to never
block, even if the system's ``/dev/random`` device uses a design that permits
it to block intermittently during normal system operation.


Related changes
---------------

Currently, SipHash initialization and ``random`` module initialization
both gather random bytes using the same code that underlies
``os.urandom``. This PEP proposes to modify these so that in situations where
``os.urandom`` would raise a ``BlockingIOError``, they automatically
fall back on potentially more predictable sources of randomness.

In the SipHash case, this will also print a warning message to ``stderr``
indicating that that particular Python process should not be used to process
untrusted data: "Python reverted to potentially predictable hash
initialization. Avoid handling untrusted data in this process.". This
warning would NOT be displayed when hash randomization is explicitly disabled
or set to a known value via ``PYTHONHASHSEED``.

To transparently accommodate a potential future where Linux adopts the same
"potentially blocking during system initialization" ``/dev/urandom`` behaviour
used by other \*nix systems, this fallback source of randomness will *not* be
the ``/dev/urandom`` device.
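
The fallback and warning rules above could look roughly like the following
sketch. The PEP does not say which fallback source would be used, so the
clock-and-process-ID material here is purely an assumption for illustration,
as are the ``hash_seed_bytes`` name and its injectable parameters (added so
the not-ready case can be simulated).

```python
import hashlib
import os
import sys
import time

HASH_WARNING = (
    "Python reverted to potentially predictable hash initialization. "
    "Avoid handling untrusted data in this process."
)


def _warn_on_stderr(message):
    print(message, file=sys.stderr)


def hash_seed_bytes(num_bytes, system_rng=os.urandom, environ=os.environ,
                    warn=_warn_on_stderr):
    """Return seed material for hash randomization (illustrative only)."""
    configured = environ.get("PYTHONHASHSEED") or "random"
    if configured != "random":
        # Randomization disabled or pinned by the user: the system RNG is
        # never consulted, so no warning is emitted.
        return hashlib.shake_256(configured.encode()).digest(num_bytes)
    try:
        return system_rng(num_bytes)
    except BlockingIOError:
        # System RNG not ready: warn, then use a weaker source
        warn(HASH_WARNING)
    # ASSUMPTION: clock plus process ID as the "potentially more
    # predictable" source; deliberately not /dev/urandom.
    material = "{}:{}".format(time.time(), os.getpid()).encode()
    return hashlib.shake_256(material).digest(num_bytes)
```

Passing a ``system_rng`` callable that raises ``BlockingIOError`` exercises
the warning path without needing a freshly booted machine.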


Limitations on scope
--------------------

No changes are proposed for Windows or Mac OS X systems, as neither of those
platforms provides any mechanism to run Python code before the operating
system random number generator has been initialized. Mac OS X goes so far as
to kernel panic and abort the boot process if it can't properly initialize the
random number generator (although Apple's restrictions on the supported
hardware platforms make that exceedingly unlikely in practice).

Similarly, no changes are proposed for other \*nix systems that do not offer
the ``getrandom()`` syscall. On these systems, ``os.urandom()`` will continue
to block waiting for the system random number generator to be initialized.

While other \*nix systems that offer a non-blocking API (other than
``getrandom()``) for requesting random numbers suitable for use in security
sensitive applications could potentially receive a similar update to the one
proposed for ``getrandom()`` in this PEP, such changes are out of scope for
this particular proposal.

Python's behaviour on older versions of affected platforms that do not offer
the new ``getrandom()`` syscall will also remain unchanged.


Rationale
=========

Raising ``BlockingIOError`` in ``os.urandom()`` on Linux
--------------------------------------------------------

For several years now, the security community's guidance has been to use
``os.urandom()`` (or the ``random.SystemRandom()`` wrapper) when implementing
security sensitive operations in Python.

To help improve API discoverability and make it clearer that secrecy and
simulation are not the same problem (even though they both involve
random numbers), PEP 506 collected several of the one line recipes based
on the lower level ``os.urandom()`` API into a new ``secrets`` module.

However, this guidance has also come with a longstanding caveat: developers
writing security sensitive software at least for Linux, and potentially for
some other \*BSD systems, may need to wait until the operating system's
random number generator is ready before relying on it for security sensitive
operations. This generally only occurs if ``os.urandom()`` is read very
early in the system initialization process, or on systems with few sources of
available entropy (e.g. some kinds of virtualized or embedded systems), but
unfortunately the exact conditions that trigger this are difficult to predict,
and when it occurs then there is no direct way for userspace to tell it has
happened without querying operating system specific interfaces.

On \*BSD systems (if the particular \*BSD variant allows the problem to occur
at all) and potentially also Solaris and Illumos, encountering this situation
means ``os.urandom()`` will either block waiting for the system random number
generator to be ready (the associated symptom would be for the affected script
to pause unexpectedly on the first call to ``os.urandom()``) or else will
behave the same way as it does on Linux.

On Linux, in Python versions up to and including Python 3.4, and in
Python 3.5 maintenance versions following Python 3.5.2, there's no clear
indicator to developers that their software may not be working as expected
when run early in the Linux boot process, or on hardware without good
sources of entropy to seed the operating system's random number generator: due
to the behaviour of the underlying ``/dev/urandom`` device, ``os.urandom()``
on Linux returns a result either way, and it takes extensive statistical
analysis to show that a security vulnerability exists.

By contrast, if ``BlockingIOError`` is raised in those situations, then
developers using Python 3.6+ can easily choose their desired behaviour:

1. Wait for the system RNG at or before application startup (security sensitive)
2. Switch to using the random module (non-security sensitive)


Adding ``secrets.wait_for_system_rng()``
----------------------------------------

Earlier versions of this PEP proposed a number of recipes for wrapping
``os.urandom()`` to make it suitable for use in security sensitive use cases.

Discussion of the proposal on the security-sig mailing list prompted the
realization [9]_ that the core assumption driving the API design in this PEP
was that choosing between letting the exception cause the application to fail,
blocking waiting for the system RNG to be ready and switching to using the
``random`` module instead of ``os.urandom`` is an application and use-case
specific decision that should take into account application and use-case
specific details.

There is no way for the interpreter runtime or support libraries to determine
whether a particular use case is security sensitive or not, and while it's
straightforward for an application developer to decide how to handle an
exception thrown by a particular API, they can't readily work around an API
blocking when they expected it to be non-blocking.

Accordingly, the PEP was updated to add ``secrets.wait_for_system_rng()`` as
an API for applications, scripts and frameworks to use to indicate that they
wanted to ensure the system RNG was available before continuing, while library
developers could continue to call ``os.urandom()`` without worrying that it
might unexpectedly start blocking waiting for the system RNG to be available.


Issuing a warning for potentially predictable internal hash initialization
--------------------------------------------------------------------------

The challenge for internal hash initialization is that it might be very
important to initialize SipHash with a reliably unpredictable random seed
(for processes that are exposed to potentially hostile input) or it might be
totally unimportant (for processes that never have to deal with untrusted data).

The Python runtime has no way to know which case a given invocation involves,
which means that if we allow SipHash initialization to block or error out,
then our intended security enhancement may break code that is already safe
and working fine, which is unacceptable -- especially since we are reasonably
confident that most Python invocations that might run during Linux system
initialization fall into this category (exposure to untrusted input tends to
involve network access, which typically isn't brought up until after the system
random number generator is initialized).

However, at the same time, since Python has no way to know whether any given
invocation needs to handle untrusted data, when the default SipHash
initialization fails this *might* indicate a genuine security problem, which
should not be allowed to pass silently.

Accordingly, if internal hash initialization needs to fall back to a potentially
predictable seed due to the system random number generator not being ready, it
will also emit a warning message on ``stderr`` to say that the system random
number generator is not available and that processing potentially hostile
untrusted data should be avoided.


Allowing potentially predictable ``random`` module initialization
-----------------------------------------------------------------

Other than for ``random.SystemRandom`` (which is a relatively thin
wrapper around ``os.urandom``), the ``random`` module has long documented
that the numbers it generates are not suitable for use in security sensitive
operations. Instead, the use of the system random number generator to seed the
default Mersenne Twister instance is primarily aimed at ensuring Python isn't
biased towards any particular starting states for simulation use cases.
However, this seeding approach has also turned out to be beneficial as a harm
mitigation measure for code that is using the ``random`` module inappropriately.

Since a single call to ``os.urandom()`` is cheap once the system random
number generator has been initialized, it makes sense to retain that as the
default behaviour, but there's no need to issue a warning when falling back to
a potentially more predictable alternative when necessary (in such cases,
a warning will typically already have been issued as part of interpreter
startup, as the only way for the call when importing the random module to
fail without the implicit call during interpreter startup also failing is for
the latter to have been skipped by entirely disabling the hash randomization
mechanism).


Backwards Compatibility Impact Assessment
=========================================

Similar to PEP 476, this is a proposal to turn a previously silent security
failure into a noisy exception that requires the application developer to
make an explicit decision regarding the behaviour they desire.

As no changes are proposed for operating systems that don't provide the
``getrandom()`` syscall, ``os.urandom()`` retains its existing behaviour as
a nominally blocking API that is non-blocking in practice due to the difficulty
of scheduling Python code to run before the operating system random number
generator is ready. We believe it may be possible to encounter problems akin to
those described in this PEP on at least some \*BSD variants, but nobody has
explicitly demonstrated that. On Mac OS X and Windows, it appears to be
straight up impossible to even try to run a Python interpreter that early in
the boot process.

On Linux and other platforms with similar ``/dev/urandom`` behaviour,
``os.urandom()`` retains its status as a guaranteed non-blocking API.
However, the means of achieving that status changes in the specific case of
the operating system random number generator not being ready for use in security
sensitive operations: historically it would return potentially predictable
random data, with this PEP it would change to raise ``BlockingIOError``.

Developers of affected applications would then be required to make one of the
following changes to gain forward compatibility with Python 3.6, based on the
kind of application they're developing.


Unaffected Applications
-----------------------

The following kinds of applications would be entirely unaffected by the change,
regardless of whether or not they perform security sensitive operations:

- applications that don't support Linux
- applications that are only run on desktops or conventional servers
- applications that are only run after the system RNG is ready

Applications in this category simply won't encounter the new exception, so it
will be reasonable for developers to wait and see if they receive
Python 3.6 compatibility bugs related to the new runtime behaviour, rather than
attempting to pre-emptively determine whether or not they're affected.


Affected security sensitive applications
----------------------------------------

Security sensitive applications would need to either change their system
configuration so the application is only started after the operating system
random number generator is ready for security sensitive operations, or else
change the application startup code to invoke ``secrets.wait_for_system_rng()``.

As an example for components started via a systemd unit file, the following
snippet would delay activation until the system RNG was ready::

    ExecStartPre=python3 -c "import secrets; secrets.wait_for_system_rng()"


Affected non-security sensitive applications
--------------------------------------------

Non-security sensitive applications should be updated to use the ``random``
module rather than ``os.urandom``::

    import random

    def pseudorandom_bytes(num_bytes):
        return random.getrandbits(num_bytes*8).to_bytes(num_bytes, "little")

Depending on the details of the application, the random module may offer
other APIs that can be used directly, rather than needing to emulate the
raw byte sequence produced by the ``os.urandom()`` API.
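
As a hedged illustration of using the ``random`` module directly, the
following sketch shows two non-security-sensitive tasks that have no need
for raw bytes at all. The function names are hypothetical examples rather
than anything the PEP defines.

```python
import random
import string


def demo_label(length=8):
    """Build a throwaway label; NOT suitable for passwords or tokens."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(random.choice(alphabet) for _ in range(length))


def jittered_delay(base_seconds):
    """Spread retries out by scaling the delay by a factor in [0.5, 1.5]."""
    return base_seconds * random.uniform(0.5, 1.5)
```

Neither function touches ``os.urandom()`` after the ``random`` module has
been imported, so neither could encounter the new exception.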


Additional Background
=====================

Why propose this now?
---------------------

The main reason is because the Python 3.5.0 release switched to using the new
Linux ``getrandom()`` syscall when available in order to avoid consuming a
file descriptor [1]_, and this had the side effect of making the following
operations block waiting for the system random number generator to be ready:

* ``os.urandom`` (and APIs that depend on it)
* importing the ``random`` module
* initializing the randomized hash algorithm used by some builtin types

While the first of those behaviours is arguably desirable (and consistent with
the existing behaviour of ``os.urandom`` on other operating systems), the
latter two behaviours are unnecessary and undesirable, and the last one is now
known to cause a system level deadlock when attempting to run Python scripts
during the Linux init process with Python 3.5.0 or 3.5.1 [2]_, while the second
one can cause problems when using virtual machines without robust entropy
sources configured [3]_.

Since decoupling these behaviours in CPython will involve a number of
implementation changes more appropriate for a feature release than a maintenance
release, the relatively simple resolution applied in Python 3.5.2 was to revert
all three of them to a behaviour similar to that of previous Python versions:
if the new Linux syscall indicates it will block, then Python 3.5.2 will
implicitly fall back on reading ``/dev/urandom`` directly [4]_.

However, this bug report *also* resulted in a range of proposals to add *new*
APIs like ``os.getrandom()`` [5]_, ``os.urandom_block()`` [6]_,
``os.pseudorandom()`` and ``os.cryptorandom()`` [7]_, or adding new optional
parameters to ``os.urandom()`` itself [8]_, and then attempting to educate
users on when they should call those APIs instead of just using a plain
``os.urandom()`` call.

These proposals arguably represent overreactions, as the question of reliably
obtaining random numbers suitable for security sensitive work on Linux is a
relatively obscure problem of interest mainly to operating system developers
and embedded systems programmers, that may not justify expanding the
Python standard library's cross-platform APIs with new Linux-specific concerns.
This is especially so with the ``secrets`` module already being added as the
"use this and don't worry about the low level details" option for developers
writing security sensitive software that for some reason can't rely on even
higher level domain specific APIs (like web frameworks) and also don't need to
worry about Python versions prior to Python 3.6.

That said, it's also the case that low cost ARM devices are becoming
increasingly prevalent, with a lot of them running Linux, and a lot of folks
writing Python applications that run on those devices. That creates an
opportunity to take an obscure security problem that currently requires a lot
of knowledge about Linux boot processes and provably unpredictable random
number generation to diagnose and resolve, and instead turn it into a
relatively mundane and easy-to-find-in-an-internet-search runtime exception.


The cross-platform behaviour of ``os.urandom()``
------------------------------------------------

On operating systems other than Linux and NetBSD, ``os.urandom()`` may already
block waiting for the operating system's random number generator to be ready.
This will happen at most once in the lifetime of the process, and the call is
subsequently guaranteed to be non-blocking.

Linux and NetBSD are outliers in that, even when the operating system's random
number generator doesn't consider itself ready for use in security sensitive
operations, reading from the ``/dev/urandom`` device will return random values
based on the entropy it has available.

This behaviour is potentially problematic, so Linux 3.17 added a new
``getrandom()`` syscall that (amongst other benefits) allows callers to
either block waiting for the random number generator to be ready, or
else request an error return if the random number generator is not ready.
Notably, the new API does *not* support the old behaviour of returning
data that is not suitable for security sensitive use cases.
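
The "request an error return" mode can be used as a simple readiness probe.
This is a minimal sketch, assuming a Python build that exposes
``os.getrandom()`` and ``os.GRND_NONBLOCK``; the helper name
``system_rng_is_ready`` is made up for this example.

```python
import os


def system_rng_is_ready():
    """Return True or False, or None if getrandom() cannot be used."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        return None  # this Python build has no getrandom() wrapper
    try:
        # Ask for one byte without blocking
        getrandom(1, os.GRND_NONBLOCK)
    except BlockingIOError:
        return False  # the kernel reported that the call would block
    except OSError:
        return None  # e.g. ENOSYS on a kernel without the syscall
    return True
```

Once the probe has returned ``True``, it keeps returning ``True`` for the
rest of that boot, since the system RNG stays ready once initialized.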

Versions of Python up to and including Python 3.4 access the
Linux ``/dev/urandom`` device directly.

Python 3.5.0 and 3.5.1 (when built on a system that offered the new syscall)
called ``getrandom()`` in blocking mode in order to avoid the use of a file
descriptor to access ``/dev/urandom``. While there were no specific problems
reported due to ``os.urandom()`` blocking in user code, there *were* problems
due to CPython implicitly invoking the blocking behaviour during interpreter
startup and when importing the ``random`` module.

Rather than trying to decouple SipHash initialization from the
``os.urandom()`` implementation, Python 3.5.2 switched to calling
``getrandom()`` in non-blocking mode, and falling back to reading from
``/dev/urandom`` if the syscall indicates it will block.

As a result of the above, ``os.urandom()`` in all Python versions up to and
including Python 3.5 propagates the behaviour of the underlying
``/dev/urandom`` device to Python code.


Problems with the behaviour of ``/dev/urandom`` on Linux
--------------------------------------------------------

The Python ``os`` module has largely co-evolved with Linux APIs, so having
``os`` module functions closely follow the behaviour of their Linux operating
system level counterparts when running on Linux is typically considered to be
a desirable feature.

However, ``/dev/urandom`` represents a case where the current behaviour is
acknowledged to be problematic, but fixing it unilaterally at the kernel level
has been shown to prevent some Linux distributions from booting (at least in
part due to components like Python currently using it for
non-security-sensitive purposes early in the system initialization process).

As an analogy, consider the following two functions::

    def generate_example_password():
        """Generates passwords solely for use in code examples"""
        return generate_unpredictable_password()

    def generate_actual_password():
        """Generates actual passwords for use in real applications"""
        return generate_unpredictable_password()

If you think of an operating system's random number generator as a method for
generating unpredictable, secret passwords, then you can think of Linux's
``/dev/urandom`` as being implemented like::

    # Oversimplified artist's conception of the kernel code
    # implementing /dev/urandom
    def generate_unpredictable_password():
        if system_rng_is_ready:
            return use_system_rng_to_generate_password()
        else:
            # we can't make an unpredictable password; silently return a
            # potentially predictable one instead:
            return "p4ssw0rd"

In this scenario, the author of ``generate_example_password`` is fine - even if
``"p4ssw0rd"`` shows up a bit more often than they expect, it's only used in
examples anyway. However, the author of ``generate_actual_password`` has a
problem - how do they prove that their calls to
``generate_unpredictable_password`` never follow the path that returns a
predictable answer?

In real life it's slightly more complicated than this, because there
might be some level of system entropy available -- so the fallback might
be more like ``return random.choice(["p4ssword", "passw0rd",
"p4ssw0rd"])`` or something even more variable and hence only statistically
predictable with better odds than the author of ``generate_actual_password``
was expecting. This doesn't really make things more provably secure, though;
mostly it just means that if you try to catch the problem in the obvious way --
``if returned_password == "p4ssw0rd": raise UhOh`` -- then it doesn't work,
because ``returned_password`` might instead be ``p4ssword`` or even
``pa55word``, or just an arbitrary 64 bit sequence selected from fewer than
2**64 possibilities. So this rough sketch does give the right general idea of
the consequences of the "more predictable than expected" fallback behaviour,
even though it's thoroughly unfair to the Linux kernel team's efforts to
mitigate the practical consequences of this problem without resorting to
breaking backwards compatibility.

This design is generally agreed to be a bad idea. As far as we can
tell, there are no use cases whatsoever in which this is the behavior
you actually want. It has led to the use of insecure ``ssh`` keys on
real systems, and many \*nix-like systems (including at least Mac OS
X, OpenBSD, and FreeBSD) have modified their ``/dev/urandom``
implementations so that they never return predictable outputs, either
by making reads block in this case, or by simply refusing to run any
userspace programs until the system RNG has been
initialized. Unfortunately, Linux has so far been unable to follow
suit, because it's been empirically determined that enabling the
blocking behavior causes some currently extant distributions to
fail to boot.

Instead, the new ``getrandom()`` syscall was introduced, making
it *possible* for userspace applications to access the system random number
generator safely, without introducing hard to debug deadlock problems into
the system initialization processes of existing Linux distros.


Consequences of ``getrandom()`` availability for Python
-------------------------------------------------------

Prior to the introduction of the ``getrandom()`` syscall, it simply wasn't
feasible to access the Linux system random number generator in a provably
safe way, so we were forced to settle for reading from ``/dev/urandom`` as the
best available option. However, with ``getrandom()`` insisting on raising an
error or blocking rather than returning predictable data, as well as having
other advantages, it is now the recommended method for accessing the kernel
RNG on Linux, with reading ``/dev/urandom`` directly relegated to "legacy"
status. This moves Linux into the same category as other operating systems
like Windows, which doesn't provide a ``/dev/urandom`` device at all: the
best available option for implementing ``os.urandom()`` is no longer simply
reading bytes from the ``/dev/urandom`` device.

This means that what used to be somebody else's problem (the Linux kernel
development team's) is now Python's problem -- given a way to detect that the
system RNG is not initialized, we have to choose how to handle this
situation whenever we try to use the system RNG.

It could simply block, as was somewhat inadvertently implemented in 3.5.0,
and as is proposed in Victor Stinner's competing PEP::

    # artist's impression of the CPython 3.5.0-3.5.1 behavior
    def generate_unpredictable_bytes_or_block(num_bytes):
        while not system_rng_is_ready:
            wait
        return unpredictable_bytes(num_bytes)

Or it could raise an error, as this PEP proposes (in *some* cases)::

    # artist's impression of the behavior proposed in this PEP
    def generate_unpredictable_bytes_or_raise(num_bytes):
        if system_rng_is_ready:
            return unpredictable_bytes(num_bytes)
        else:
            raise BlockingIOError

Or it could explicitly emulate the ``/dev/urandom`` fallback behavior,
as was implemented in 3.5.2rc1 and is expected to remain for the rest
of the 3.5.x cycle::

    # artist's impression of the CPython 3.5.2rc1+ behavior
    def generate_unpredictable_bytes_or_maybe_not(num_bytes):
        if system_rng_is_ready:
            return unpredictable_bytes(num_bytes)
        else:
            return (b"p4ssw0rd" * (num_bytes // 8 + 1))[:num_bytes]

(And the same caveats apply to this sketch as applied to the
``generate_unpredictable_password`` sketch of ``/dev/urandom`` above.)
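
For readers who want something executable, the "block" and "raise" options
can be approximated as follows. This is only a sketch: it assumes a Python
build that exposes ``os.getrandom()`` and ``os.GRND_NONBLOCK``, and the helper
name ``read_system_rng`` is hypothetical rather than part of any proposal.

```python
import os


def read_system_rng(num_bytes, block=True):
    """Return num_bytes from the kernel RNG.

    block=True waits until the kernel RNG is initialized.
    block=False raises BlockingIOError instead of waiting.
    """
    if not hasattr(os, "getrandom"):
        # Assumption: without getrandom() the only option is os.urandom(),
        # whatever its behaviour happens to be on this platform.
        return os.urandom(num_bytes)

    flags = 0 if block else os.GRND_NONBLOCK
    chunks = []
    remaining = num_bytes
    while remaining > 0:
        # getrandom() may return fewer bytes than requested, so loop.
        chunk = os.getrandom(remaining, flags)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

On a system whose RNG is already initialized, both modes return the same kind
of data; they differ only in what happens before initialization completes.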

There are five places where CPython and the standard library attempt to use the
operating system's random number generator, and thus five places where this
decision has to be made:

* initializing the SipHash used to protect ``str.__hash__`` and
  friends against DoS attacks (called unconditionally at startup)
* initializing the ``random`` module (called when ``random`` is
  imported)
* servicing user calls to the ``os.urandom`` public API
* the higher level ``random.SystemRandom`` public API
* the new ``secrets`` module public API added by PEP 506

Currently, these five places all use the same underlying code, and
thus make this decision in the same way.
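
The three public entry points in that list can be exercised side by side, as
in the sketch below. The function name ``sample_public_apis`` is hypothetical,
and the example assumes Python 3.6 or later so that the ``secrets`` module is
available.

```python
import os
import random
import secrets


def sample_public_apis():
    """Draw one value through each public system RNG interface."""
    raw = os.urandom(16)                  # raw bytes from the OS
    die = random.SystemRandom().randrange(1, 7)  # integer in 1..6
    token = secrets.token_hex(16)         # 16 random bytes as 32 hex digits
    return raw, die, token
```

None of these calls touch the seeded Mersenne Twister behind the module level
``random`` functions; they all draw from the operating system's generator.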

This whole problem was first noticed because 3.5.0 switched that
underlying code to the ``generate_unpredictable_bytes_or_block`` behavior.
It turns out that there are some rare cases where Linux boot scripts
attempt to run a Python program as part of system initialization. In those
cases, the Python startup sequence blocked while trying to initialize SipHash,
which triggered a deadlock because the system stopped doing anything --
including gathering new entropy -- until the Python script was forcibly
terminated by an external timer. This is particularly unfortunate since the
scripts in question never processed untrusted input, so there was no need for
SipHash to be initialized with provably unpredictable random data in the first
place. This motivated the change in 3.5.2rc1 to emulate the old
``/dev/urandom`` behavior in all cases (by calling ``getrandom()`` in
non-blocking mode, and then falling back to reading ``/dev/urandom``
if the syscall indicates that the ``/dev/urandom`` pool is not yet
fully initialized).
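
The fallback described in the parenthetical can be sketched in Python as
follows. This is an illustration, not the C code in CPython: it assumes a
Linux system with a readable ``/dev/urandom``, uses ``os.getrandom()`` where
the build provides it, and the function name is hypothetical.

```python
import errno
import os


def urandom_with_legacy_fallback(num_bytes):
    """Try non-blocking getrandom(), then fall back to /dev/urandom."""
    if hasattr(os, "getrandom"):
        try:
            data = os.getrandom(num_bytes, os.GRND_NONBLOCK)
            if len(data) == num_bytes:
                return data
            # Short read: fall through and use the device instead.
        except OSError as exc:
            # EAGAIN: pool not yet initialized; ENOSYS: kernel too old.
            if exc.errno not in (errno.EAGAIN, errno.ENOSYS):
                raise

    chunks = []
    remaining = num_bytes
    with open("/dev/urandom", "rb", buffering=0) as dev:
        while remaining > 0:
            chunk = dev.read(remaining)
            if not chunk:
                raise OSError("unexpected end of /dev/urandom")
            chunks.append(chunk)
            remaining -= len(chunk)
    return b"".join(chunks)
```

The caller cannot tell which path produced the bytes, which is exactly why
this behaviour never blocks but also cannot promise unpredictable output.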

We don't know whether such problems may also exist in the Fedora/RHEL/CentOS
ecosystem, as the build systems for those distributions use chroots on servers
running an older operating system kernel that doesn't offer the ``getrandom()``
syscall, which means CPython's current build configuration compiles out the
runtime check for that syscall [10]_.
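
The difference between build-time and run-time availability can be observed
from Python itself. The sketch below assumes an interpreter that exposes
``os.getrandom()`` when the syscall was detected at build time; the function
name and the status strings are hypothetical labels chosen for illustration.

```python
import errno
import os


def getrandom_status():
    """Report whether getrandom() is usable in this process."""
    if not hasattr(os, "getrandom"):
        # The interpreter was built without support for the syscall.
        return "not compiled in"
    try:
        # Non-blocking, single byte: a cheap probe that never waits.
        os.getrandom(1, os.GRND_NONBLOCK)
    except BlockingIOError:
        return "available, RNG not yet initialized"
    except OSError as exc:
        if exc.errno == errno.ENOSYS:
            # Built with support, but the running kernel lacks the syscall.
            return "compiled in, kernel lacks syscall"
        raise
    return "available"
```

A build made on an older kernel and deployed on a newer one would report
"not compiled in" even though the running kernel supports the syscall.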

A similar problem was found due to the ``random`` module calling
``os.urandom`` as a side-effect of import in order to seed the default
global ``random.Random()`` instance.

We have not received any specific complaints regarding direct calls to
``os.urandom()`` or ``random.SystemRandom()`` blocking with 3.5.0 or 3.5.1 -
only problem reports due to the implicit blocking on interpreter startup and
as a side-effect of importing the random module.

Accordingly, this PEP proposes providing consistent shared behaviour for the
latter three cases (ensuring that their behaviour is unequivocally suitable for
all security sensitive operations), while updating the first two cases to
account for that behavioural change.

This approach should mean that the vast majority of Python users never need to
even be aware that this change was made, while those few whom it affects will
receive an exception at runtime that they can look up online and find suitable
guidance on addressing.
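
As one example of such guidance, an application that would rather wait for a
bounded time than fail immediately could wrap the call itself. The sketch
below assumes the behaviour proposed in this PEP, where ``os.urandom()`` may
raise ``BlockingIOError``. The wrapper name, the ``source`` parameter, and the
default timings are hypothetical.

```python
import os
import time


def urandom_with_retry(num_bytes, source=os.urandom, timeout=5.0, interval=0.1):
    """Call source(num_bytes), retrying on BlockingIOError until timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return source(num_bytes)
        except BlockingIOError:
            if time.monotonic() >= deadline:
                # Give up and let the caller see the original error.
                raise
            time.sleep(interval)
```

Because the wait is bounded, this avoids the indefinite hang seen in the boot
script case while still never returning predictable data.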


References
==========

.. [1] os.urandom() should use Linux 3.17 getrandom() syscall
   (http://bugs.python.org/issue22181)

.. [2] Python 3.5 running on Linux kernel 3.17+ can block at startup or on
   importing the random module on getrandom()
   (http://bugs.python.org/issue26839)

.. [3] "import random" blocks on entropy collection on Linux with low entropy
   (http://bugs.python.org/issue25420)

.. [4] os.urandom() doesn't block on Linux anymore
   (https://hg.python.org/cpython/rev/9de508dc4837)

.. [5] Proposal to add os.getrandom()
   (http://bugs.python.org/issue26839#msg267803)

.. [6] Add os.urandom_block()
   (http://bugs.python.org/issue27250)

.. [7] Add random.cryptorandom() and random.pseudorandom, deprecate os.urandom()
   (http://bugs.python.org/issue27279)

.. [8] Always use getrandom() in os.random() on Linux and add
   block=False parameter to os.urandom()
   (http://bugs.python.org/issue27266)

.. [9] Application level vs library level design decisions
   (https://mail.python.org/pipermail/security-sig/2016-June/000057.html)

.. [10] Does the HAVE_GETRANDOM_SYSCALL config setting make sense?
   (https://mail.python.org/pipermail/security-sig/2016-June/000060.html)


For additional background details beyond those captured in this PEP and Victor's
competing PEP, also see Victor's prior collection of relevant information and
links at https://haypo-notes.readthedocs.io/summary_python_random_issue.html


Copyright
=========

This document has been placed into the public domain.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at gmail.com  Sun Jun 26 05:30:15 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 26 Jun 2016 11:30:15 +0200
Subject: [Security-sig] Does the buildtime HAVE_GETRANDOM_SYSCALL check
 actually make sense?
In-Reply-To: <CADiSq7fwhvofS6ss=adVxr8E=NgwOB1yh0RRARbkmZstbtoR9Q@mail.gmail.com>
References: <CADiSq7fwhvofS6ss=adVxr8E=NgwOB1yh0RRARbkmZstbtoR9Q@mail.gmail.com>
Message-ID: <CAMpsgwa2V8oQRhMwk+8F6HnWd-+YbDDA-WOe-NDT5Mcr1wVbbg@mail.gmail.com>

The configure check ensures that the constants required to build random.c are
available. We can only run this check at compile time. I don't want to
maintain hardcoded constants.

The proper fix is to add getrandom() to the libc:
https://sourceware.org/bugzilla/show_bug.cgi?id=17252

But you may have the same issue if you build the lib with "old" header
files.

Victor

From victor.stinner at gmail.com  Mon Jun 27 17:15:30 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 27 Jun 2016 23:15:30 +0200
Subject: [Security-sig] How to document changes related to security in
 Python changelog?
In-Reply-To: <CAMpsgwYNsb4ynvGbeGN-j5EMtMCqZsZMv91ENGuY+oDTqyjXmw@mail.gmail.com>
References: <CAMpsgwZtXPGPwVq5S=5p=H8dV06uxeAzudfvTCNQkxK3xBp3gQ@mail.gmail.com>
 <57695492.6060100@stoneleaf.us> <20160621184016.4ad487a3.barry@wooz.org>
 <CAMpsgwYNsb4ynvGbeGN-j5EMtMCqZsZMv91ENGuY+oDTqyjXmw@mail.gmail.com>
Message-ID: <CAMpsgwZ1en825j5mKMrj4tpMma1oq-a2OJDSkf+1LVukERcb0A@mail.gmail.com>

Ok, I wrote a first patch to mark changes related to security in the
Python 3.5.2 changelog:
https://bugs.python.org/issue27404

Victor

From victor.stinner at gmail.com  Tue Jun 28 12:55:40 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 28 Jun 2016 18:55:40 +0200
Subject: [Security-sig] How to document changes related to security in
 Python changelog?
In-Reply-To: <CAMpsgwZ1en825j5mKMrj4tpMma1oq-a2OJDSkf+1LVukERcb0A@mail.gmail.com>
References: <CAMpsgwZtXPGPwVq5S=5p=H8dV06uxeAzudfvTCNQkxK3xBp3gQ@mail.gmail.com>
 <57695492.6060100@stoneleaf.us> <20160621184016.4ad487a3.barry@wooz.org>
 <CAMpsgwYNsb4ynvGbeGN-j5EMtMCqZsZMv91ENGuY+oDTqyjXmw@mail.gmail.com>
 <CAMpsgwZ1en825j5mKMrj4tpMma1oq-a2OJDSkf+1LVukERcb0A@mail.gmail.com>
Message-ID: <CAMpsgwYcpxL_JpSSdkewZ95u=E0d-xcXV815K_jhPE7Aa62nyA@mail.gmail.com>

I also wrote a first draft of a document listing (3) recent Python
security vulnerabilities:
http://haypo-notes.readthedocs.io/python_security.html

It includes a list of fixed and vulnerable versions of Python.

What do you think of such a table?

It will not be easy to keep such a table up to date :-/

Victor

2016-06-27 23:15 GMT+02:00 Victor Stinner <victor.stinner at gmail.com>:
> Ok, I wrote a first patch to mark changes related to security in
> Python 3.5.2 changelog:
> https://bugs.python.org/issue27404
>
> Victor