[Python-ideas] Python's Source of Randomness and the random.py module Redux

Mon Sep 14 17:32:48 CEST 2015

On Mon, Sep 14, 2015 at 10:01 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 14 September 2015 at 14:29, Cory Benfield <cory at lukasa.co.uk> wrote:
>> Is your argument that there are lots of ways to get security wrong,
>> and for that reason we shouldn't try to fix any of them?
>
> This debate seems to repeatedly degenerate into this type of accusation.
>
> Why is backward compatibility not being taken into account here? To be
> clear, the proposed change *breaks backward compatibility* and while
> that's allowed in 3.6, just because it is allowed, doesn't mean we
> have free rein to break compatibility - any change needs a good
> justification. The arguments presented here are valid up to a point,
> but every time anyone tries to suggest a weak area in the argument,
> the "we should fix security issues" trump card gets pulled out.
>
> For example, as this is a compatibility break, it'll only be allowed
> into 3.6+ (I've not seen anyone suggest that this is sufficiently
> serious to warrant breaking compatibility on older versions). Almost
> all of those SO questions, and google hits, are probably going to be
> referenced by people who are using 2.7, or maybe some version of 3.x
> earlier than 3.6 (at what stage do we allow for the possibility of 3.x
> users who are *not* on the latest release?) So is a solution which
> won't impact most of the people making the mistake, worth it?

So people who are arguing that the defaults shouldn't be fixed on
Python 2.7 are likely the same people who also argued that PEP 466 was
a terrible, awful, end-of-the-world type change. Yes, it broke things
(like eventlet) but the net benefit for users who can get onto Python
2.7.9 (and later) is immense.

Now I'm not arguing that we should do the same to the random module,
but a backport (that is part of the stdlib) would probably be a good
idea, on the same principle of letting users opt into security early.

> I fully expect the response to this to be "just because it'll take
> time, doesn't mean we should do nothing". Or "even if it just fixes it
> for one or two people, it's still worth it". But *that's* the argument
> I don't find compelling - not that a fix won't help some situations,
> but that because it's security, (a) all the usual trade-off
> calculations are irrelevant, and (b) other proposed solutions (such as
> education, adding specialised modules like a "shared secret" library,
> etc) are off the table.

They're not irrelevant. I personally think they're of a lower impact
to the discussion, but the reality is that the people who are
educating others are few and far between. If there are public domain
works, free tutorials, etc. that all advocate using a module in the
standard library, and no one can update them, they still exist and are
still recommendations. People prefer free over correct when possible,
because there's nothing free to correct them (until they get hacked or
worse). Do we have a team in the Python community that goes out to
educate people for free on security-related best practices? I haven't
seen one. The best we have is a few people on crufty mailing lists
like this one trying to make an impact, because education is a much
larger and harder problem to solve than making something secure by
default.

Perhaps instead of bickering like fools on a mailing list, we could
all be spending our time better by educating others. That said, I can't
make that decision for you, just as you can't make it for me.

> Honestly, this type of debate doesn't do the security community much
> good - there's too little willingness to compromise, and as a result
> the more neutral participants (which, frankly, is pretty much anyone
> who doesn't have a security agenda to promote) end up pushed into a
> "reject everything" stance simply as a reaction to the black and white
> argument style.

Except you seem to have missed many of the compromises being discussed
and conceded by the security-minded folks. Personally, names that
describe the outputs of the algorithms make much more sense to me than
"Seedless" and "Seeded" but no one has really bothered to shave that
yak further out of a desire to compromise and make things better as a
whole. Much of the lack of gradation has come from the opponents of
this change, who seem to think of security as a step function where a
subjective measurement of "good enough for me" counts as secure.
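
For anyone unsure what the "Seeded"/"Seedless" split refers to, here is
a rough sketch. It assumes the names map onto a deterministic, seedable
generator versus an OS-backed one, and it uses the existing
random.Random and random.SystemRandom classes as stand-ins; these are
not the names or the API proposed in this thread.

```python
import random

# "Seeded" (assumed meaning): a deterministic PRNG (Mersenne Twister).
# Two generators built from the same seed produce the same sequence,
# which is what simulations want and what secrets must never rely on.
a = random.Random(42)
b = random.Random(42)
seeded_match = [a.random() for _ in range(5)] == [b.random() for _ in range(5)]

# "Seedless" (assumed meaning): output comes from the operating system's
# entropy source. SystemRandom accepts a seed argument but ignores it,
# so two instances "seeded" alike still produce unrelated output.
c = random.SystemRandom(42)
d = random.SystemRandom(42)
seedless_draws = (c.getrandbits(128), d.getrandbits(128))

print(seeded_match)
print(seedless_draws[0] != seedless_draws[1])
```

The point of describing the outputs rather than the seeding is that the
first generator is reproducible by design, while the second is meant to
be unpredictable.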