[Python-ideas] Pre-PEP Adding A Secrets Module To The Standard Library

Mon Sep 21 19:51:13 CEST 2015

[Tim]
>> ...
>> No attempt to be minimal here.  More-than-less "obvious" is more important:
>>
>> Bound methods of a SystemRandom instance
>>     .randrange()
>>     .randint()
>>     .randbits()
>>         renamed from .getrandbits()
>>     .randbelow(exclusive_upper_bound)
>>         renamed from private ._randbelow()
>>     .choice()

[Steven D'Aprano <steve at pearwood.info>]
> While we're bike-shedding,

I refuse to bikeshed on this.  I posted a concrete proposal just to
enrage others into it ;-)  So I'll just sketch my thinking:

> I don't know that I like the name randbits, since that always
> makes me expect a sequence of 0, 1 bits. But that's a minor
> point.

I had in mind multiple audiences, including those who know a lot about
Python, and those who know little.  The _lack_ of randbits() would
surprise the former.

> When would somebody use randbelow(n) rather than randrange(n)?

For the same reason they'd use randbits(n) instead of
randrange(1 << n) ;-)  That is, familiarity and obviousness.
randrange() has a complicated signature, with 1 to 3 arguments, and
endlessly surprises newbies who _expect_, e.g., randrange(3) to return
3 at times.  That's why randint() was created.  "randbelow(n)" has a
dirt-simple signature, and its name makes it hard to mistakenly
believe `n` is a possible return value.  It's exactly what's needed
most often to avoid _statistical_ bias (as opposed to security
weaknesses) in higher-level functions - that's why _randbelow() is a
fundamental primitive in Random.
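
A rough sketch of the two points above, for illustration only.  The
first part shows the kind of statistical bias meant here, using the
common "take a random byte modulo n" shortcut (that shortcut is not
something proposed in this thread).  The second part shows the
endpoint conventions.  The module-level randbelow() wrapper is
hypothetical: it borrows the proposed name, but its body is just one
way it might be spelled on top of the public SystemRandom API.

```python
import random
from collections import Counter

# Part 1: bias from reducing a uniform byte modulo n.
# 256 is not a multiple of 6, so some residues get an extra hit.
counts = Counter(b % 6 for b in range(256))
print(sorted(counts.items()))
# [(0, 43), (1, 43), (2, 43), (3, 43), (4, 42), (5, 42)]

# Part 2: endpoint conventions.
_sysrand = random.SystemRandom()  # backed by os.urandom()


def randbelow(exclusive_upper_bound):
    """Hypothetical wrapper: uniform int in [0, exclusive_upper_bound)."""
    if exclusive_upper_bound <= 0:
        raise ValueError("upper bound must be positive")
    return _sysrand.randrange(exclusive_upper_bound)


below = {randbelow(3) for _ in range(200)}              # only 0, 1, 2
ranged = {_sysrand.randrange(3) for _ in range(200)}    # only 0, 1, 2
inclusive = {_sysrand.randint(1, 3) for _ in range(200)}  # only 1, 2, 3
print(3 in below, 3 in ranged, 3 in inclusive)
```

randrange(3) and randbelow(3) never return 3; randint(1, 3) can.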

So, yes, it's redundant, but I don't care.  randrange(n) itself is
just a needlessly expensive way to call _randbelow(n) today.

> Apart from the possible redundancy between rand[below|range], all the
> above seem reasonable to me.

If people want minimal, just expose os.urandom() under a friendlier
name, and call it done ;-)

> Are there use-cases for a strong random float between 0 and 1? If
> so, is it sufficient to say secrets.randbelow(sys.maxsize)/sys.maxsize,
> or should we offer secrets.random() and/or secrets.uniform(a, b)?

I don't know of any "security use" for random floats.  But if you want
to add a recipe to the docs, point them to SystemRandom.random
instead.  That gets it right.  `sys.maxsize` doesn't really have
anything to do with floats, and the snippet you gave would produce
poor-quality floats on a 32-bit box (wouldn't get anywhere near
randomizing all 53 bits of float precision).  On a 64-bit box, it
could, e.g., return 1.0 (which random() should never return).
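
To make both failure modes concrete, here is a small check that does
not depend on the machine it runs on.  The constants MAXSIZE_32 and
MAXSIZE_64 are stand-ins for the usual values of sys.maxsize on
32-bit and 64-bit builds, introduced here only for illustration.

```python
import random

MAXSIZE_64 = 2**63 - 1  # typical sys.maxsize on a 64-bit build
MAXSIZE_32 = 2**31 - 1  # typical sys.maxsize on a 32-bit build

# Largest value randbelow(maxsize) can return is maxsize - 1.
# On a 64-bit build the quotient is too close to 1 for a double to
# tell apart, so it rounds to exactly 1.0.
worst_case_64 = (MAXSIZE_64 - 1) / MAXSIZE_64
print(worst_case_64 == 1.0)  # True

# On a 32-bit build there are fewer than 2**31 possible numerators,
# far short of the 53 bits a float can hold.
print(MAXSIZE_32.bit_length())  # 31

# SystemRandom.random() returns a float in [0.0, 1.0).
x = random.SystemRandom().random()
print(0.0 <= x < 1.0)  # True
```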

>>  Token functions
>>     .token_bytes(nbytes)
>>         another name for os.urandom()
>>     .token_hex(nbytes)
>>         same, but return string of ASCII hex digits
>>     .token_url(nbytes)
>>         same, but return URL-safe base64-encoded ASCII
>
> I suggest adding a default length, say nbytes=32, with a note that the
> default length is expected to increase in the future. Otherwise, how
> will the naive user know what counts as a good, hard-to-attack length?

Fine by me!
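
For concreteness, one way the three token functions might look with
the suggested default.  This is a sketch under assumptions, not the
proposal's implementation: the names come from the list above, the
DEFAULT_NBYTES constant is hypothetical, and stripping the base64
"=" padding in token_url() is a choice made here, not something
settled in the thread.

```python
import base64
import binascii
import os

DEFAULT_NBYTES = 32  # hypothetical name; default suggested in the thread


def token_bytes(nbytes=DEFAULT_NBYTES):
    """Return nbytes random bytes straight from os.urandom()."""
    return os.urandom(nbytes)


def token_hex(nbytes=DEFAULT_NBYTES):
    """Same bytes, as a str of ASCII hex digits (2 per byte)."""
    return binascii.hexlify(token_bytes(nbytes)).decode("ascii")


def token_url(nbytes=DEFAULT_NBYTES):
    """Same bytes, as URL-safe base64 text.

    Assumption: trailing '=' padding is dropped.
    """
    encoded = base64.urlsafe_b64encode(token_bytes(nbytes))
    return encoded.rstrip(b"=").decode("ascii")


print(len(token_bytes()), len(token_hex()), len(token_url()))
# 32 64 43
```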

> All of the above look good to me.
>
>
>>     .token_alpha(alphabet, nchars)
>>         string of `nchars` characters drawn uniformly
>>         from `alphabet`
>
> What is the intention for this function? To use as passwords? Other than
> that, it's not obvious to me what that would be used for.

I just noted that several of the examples in the PHP paper appeared to
want to use their own alphabet.  But, since that paper was about
exposing security holes in PHP apps, perhaps that wasn't such a good
idea to begin with ;-)  Fine by me if it's dropped.
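
In case it is kept, a minimal sketch of what token_alpha() could look
like.  The body is an assumption (the proposal gives only the name
and a one-line description); it draws each character with
SystemRandom.choice(), and the empty-alphabet check is added here.

```python
import random
import string

_sysrand = random.SystemRandom()  # backed by os.urandom()


def token_alpha(alphabet, nchars):
    """Return a str of nchars characters drawn uniformly from alphabet."""
    if not alphabet:
        raise ValueError("alphabet must not be empty")
    return "".join(_sysrand.choice(alphabet) for _ in range(nchars))


# Example: a 12-character token over lowercase letters and digits.
alphabet = string.ascii_lowercase + string.digits
sample = token_alpha(alphabet, 12)
print(len(sample))  # 12
```

Note that repeated characters in `alphabet` would skew the draw
toward them, since choice() picks positions, not distinct symbols.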