[Python-ideas] PEP 504: Using the system RNG by default

Wed Sep 16 17:47:30 CEST 2015

[Guido]
> There's still way too much chatter, and a lot that seems just rhetoric. This
> is not the republican primaries.

Which is a shame, since the chatter here is of much higher quality
than in the actual primaries ;-)

> Yes lots of companies got hacked. What's the evidence that a language's
> default RNG was involved?

Nobody cares whether there's evidence of actual harm.  Just that there
_might_ be, and even if none identifiable now, then maybe in the
future.

There is evidence of actual harm from RNGs doing poor _seeding_ by
default, but Python already fixed that (I know, you already know that
;-) ).

And this paper, from a few years ago, studying RNG vulnerabilities in
PHP apps, is really good:

     https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf

An interesting thing is that several of the apps already had a history
of trying to fix RNG-related security holes (largely due to PHP's poor
default seeding), but remained easily cracked.

The primary recommendation there wasn't to make PHP's various PRNGs
"crypto by magic", but for core PHP to supply "a standard" crypto RNG
for people to use instead.  As above, some of the app developers
already knew darned well they had a history of RNG-related holes, but
simply had no standard way to address it, and didn't have the _major_
expertise needed to roll their own.

> IIUC the best practice for password encryption (to
> make cracking using a large word list harder) is something called bcrypt;
> maybe next year something else will become popular, but the default RNG
> seems an unlikely candidate. I know that in the past the randomness of
> certain protocols was compromised because the seeding used a timestamp that
> an attacker could influence or guess. But random.py seeds MT from
> os.urandom(2500). So what's the class of vulnerabilities where the default
> RNG is implicated?

1. Users doing their own poor seeding.

2. A hypothetical MT state-deducer.  To be of general use against
   Python, it would seemingly need to be considerably more
   sophisticated than the already mondo sophisticated one in the
   paper above.

3. "Prove there can't be any in the future.  Ha!  You can't." ;-)
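
To make #1 concrete, here's a minimal sketch of the kind of
self-inflicted seeding I mean.  The helper names and the "token" scheme
are made up for illustration, not taken from any real app:

```python
import random
import time

def token_from_seeded_rng(seed):
    """Hypothetical helper: derive a 'secret' token from an explicitly seeded MT."""
    rng = random.Random(seed)
    return "%032x" % rng.getrandbits(128)

# The app seeds with the current time in whole seconds (the poor choice).
issued_at = int(time.time())
token = token_from_seeded_rng(issued_at)

def recover_seed(token, approx_time, window=60):
    """Try every candidate seed within +/- window seconds of approx_time."""
    for guess in range(approx_time - window, approx_time + window + 1):
        if token_from_seeded_rng(guess) == token:
            return guess
    return None

# The attacker only knows roughly when the token was issued.
recovered = recover_seed(token, issued_at + 17)
```

No MT state-deducing is needed there:  the search space is about a
hundred seeds.  Leaving the seed alone (so the default urandom-based
seeding kicks in) avoids this particular hole.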

> Tim's proposal is simple: create a new module, e.g. saferandom, with the
> same API as random (less seed/state). That's it. Then it's a simple import
> change away to do the right thing, and we have years to seed StackOverflow
> with better information before that code even hits the road. (But a backport
> to Python 2.7 could be on PyPI tomorrow!)

Which would obviously be fine by me:  make the distinction obvious at
import time, make "the safe way" dead easy and convenient to use, give
it a new name engineered to nudge newbies away from the "unsafe" (by
contrast) `random`, and a new name easily discoverable by web search.
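
For concreteness, a rough sketch of what the guts of such a module
might look like.  The module name and the exact set of exported names
are assumptions on my part; nothing here has been decided:

```python
# Hypothetical contents of a "saferandom" module (illustration only).
import random as _random

# One shared instance backed by the OS entropy source (os.urandom).
_inst = _random.SystemRandom()

# Same names as the module-level functions in `random`, but with
# seed(), getstate() and setstate() deliberately left out.
random = _inst.random
uniform = _inst.uniform
randrange = _inst.randrange
randint = _inst.randint
choice = _inst.choice
shuffle = _inst.shuffle
sample = _inst.sample
getrandbits = _inst.getrandbits
```

With something like that in place, the "simple import change" would be
replacing `import random` with `import saferandom as random`, assuming
the code never calls seed() or the state functions.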

There's something else here:  some of these messages gave pointers to
web pages where "security wonks" conceded that specific uses of
SystemRandom were fine, but they couldn't recommend it anyway because
it's too hard to explain what is or isn't "safe".  "Therefore" users
should only use urandom() directly.  Which is insane, if for no other
reason than that users would then invent their own algorithms to
convert urandom() results into floats and ints, etc.  Then they'll
screw up _that_ part.
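
One well-known way to screw it up, as a sketch (the function name is
made up; the bias is simple arithmetic, not a claim about any specific
app):

```python
import os
import random
from collections import Counter

def naive_randbelow(n):
    """Hypothetical roll-your-own: one urandom byte reduced mod n."""
    # bytearray() makes indexing return an int on both Python 2 and 3.
    return bytearray(os.urandom(1))[0] % n

# Count how many of the 256 possible byte values land on each result
# when n == 100.  A uniform method would need equal counts.
counts = Counter(b % 100 for b in range(256))

# What SystemRandom already provides, with no modulo bias.
safe_pick = random.SystemRandom().randrange(100)
```

Here results 0 through 55 are each hit by 3 byte values, and 56 through
99 by only 2, so the low results come up 50% more often.  The source of
the bytes is fine; the conversion on top of it is what's broken.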

But if "saferandom" were its own module, then over time it could
implement its own "security wonk certified" higher level (than raw
bytes) methods.  I suspect it would never need to change anything from
what the SystemRandom class does, but I'm not a security wonk, so I
know nothing.  Regardless, _whatever_ changes certified wonks deemed
necessary in the future could be confined to the new module, where
incompatibilities would only annoy apps using that module.  Ditto
whatever doc changes were needed.  Also gone would be the inherent
confusion from needing to draw distinctions between "safe" and
"unsafe" in a single module's docs (which any by-magic scheme would
only make worse).

However, supplying a powerful and dead-simple-to-use new module would
indeed do nothing to help old code entirely by magic.  That's a
non-goal to me, but appears to be the _only_ deal-breaker goal for the
advocates.

Which is why none of us is the BDFL ;-)