[Python-Dev] os.urandom API

Nick Mathewson nickm at alum.mit.edu
Mon Aug 30 18:50:27 CEST 2004


On Sun Aug 29 22:37:57 2004, Raymond Hettinger wrote:
> I would like to change the API for the new os.urandom(n) function to
> return a long integer instead of a string.  The former better serves
> more use cases and fits better with existing modules.

With all respect, I disagree.  As a potential user, and as lead
developer of a particularly crypto-heavy Python app
(http://mixminion.net/), I'd have to say an interface that returned a
long integer would not serve my purposes very well at all.

For most crypto apps, you never use the output of your strong entropy
source directly.  Instead, you use your strong entropy source to
generate seeds for (cryptographically strong) PRNGs, to generate keys
for your block ciphers, and so on.  Nearly all of these modules expect
their keys as a sequence of bits, which in Python corresponds more
closely to a character-string than to an arbitrary long.
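
As a rough illustration of that point, the bytes from os.urandom can
be handed to a keyed primitive unchanged.  This is a sketch only: it
uses the standard hmac and hashlib modules as a stand-in for any keyed
primitive, and the helper name new_mac_key is made up for the example.

```python
import hashlib
import hmac
import os


def new_mac_key(nbytes=32):
    """Return nbytes of OS entropy, usable directly as a MAC key.

    Hypothetical helper for illustration; no conversion step is needed
    because the primitive below expects a byte string.
    """
    return os.urandom(nbytes)


key = new_mac_key()
# The key goes in exactly as the OS produced it.
tag = hmac.new(key, b"attack at dawn", hashlib.sha256).digest()
```

Had the entropy arrived as a long, it would have to be turned back
into a fixed-length byte string before this call.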

> In favor of a long integer:
> 
> 1) The call random.seed(os.urandom(100)) is a likely use case.  If the
> intermediate value is a string, then random.seed() will hash it and
> only use 32 bits.  If the intermediate value is a long integer, all
> bits are used.  In the given example, the latter is clearly what the
> user expects (otherwise, they would only request 4 bytes).

Plausible, but as others indicate, it isn't hard to write:

    random.seed(long(hexlify(os.urandom(100)), 16))

And if you think it's really important, you could change
random.seed(None) to get the default seed in this way, instead of
looking at time.time().

But this isn't the primary use case of cryptographically strong
entropy: The Mersenne Twister algorithm isn't cryptographically
secure.  If the developer wants a cryptographically strong PRNG, she
shouldn't be using random.random().  If she doesn't want a
cryptographically strong PRNG, it's overkill for her to use
os.urandom(), and overkill for her to want more than 32 bits of
entropy anyway.
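
For reference, a self-contained version of the one-liner above might
look like the following.  It is only a sketch: it uses int() where the
thread uses long(), on the assumption of an interpreter where int()
accepts arbitrarily large values, and the variable names are
illustrative.

```python
import os
import random
from binascii import hexlify

raw = os.urandom(100)            # 100 bytes straight from the OS
seed = int(hexlify(raw), 16)     # string -> integer, big-endian

rng = random.Random()
rng.seed(seed)                   # seed the Mersenne Twister from the integer
```

This seeds the generator well, but it does not make the Mersenne
Twister any more suitable for cryptographic use.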

> 2) Another likely use case is accessing all the tools in the random
> module with a subclass that overrides random() and getrandbits().
> Both can be done easier and faster if os.urandom() returns long
> integers.  If the starting point is a string, the code gets ugly and
> slow.

As above, this isn't the way people use strong entropy in
well-designed crypto applications.  You use your strong entropy to
seed a strong PRNG, and you plug your strong PRNG into a subclass
overriding random() and getrandbits().

I agree that somebody might decide to just use os.urandom directly as
a shortcut, but such a person isn't likely to care about whether her
code is slow -- a good PRNG should outperform calls to os.urandom by
an order of magnitude or two.
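
The arrangement described above could be sketched roughly as follows.
Everything here is an assumption made for illustration: the class name
StrongRandom is hypothetical, and the generator (SHA-256 over a secret
key plus a counter) is a toy stand-in for a properly vetted PRNG, not
a recommendation.  The only call to os.urandom is the one that keys
the generator.

```python
import hashlib
import os
import random
import struct
from binascii import hexlify


class StrongRandom(random.Random):
    """random.Random subclass driven by a hash-based generator.

    Illustrative only: SHA-256 in counter mode, keyed once from
    os.urandom.  Not a vetted construction.
    """

    def seed(self, a=None, version=2):
        # Ignore caller-supplied seeds; always key from OS entropy.
        # random.Random.__init__ calls this once at construction.
        self._key = os.urandom(32)
        self._counter = 0

    def _bytes(self, n):
        # Produce n bytes by hashing key || counter, block by block.
        out = b""
        while len(out) < n:
            block = self._key + struct.pack(">Q", self._counter)
            out += hashlib.sha256(block).digest()
            self._counter += 1
        return out[:n]

    def getrandbits(self, k):
        if k <= 0:
            raise ValueError("number of bits must be greater than zero")
        nbytes = (k + 7) // 8
        value = int(hexlify(self._bytes(nbytes)), 16)
        return value >> (nbytes * 8 - k)   # keep the top k bits

    def random(self):
        # 53 random bits scaled into [0.0, 1.0).
        return self.getrandbits(53) * 2.0 ** -53

    def getstate(self):
        raise NotImplementedError("state is secret and not exported")

    def setstate(self, state):
        raise NotImplementedError("state is secret and not importable")


rng = StrongRandom()
roll = rng.randrange(1, 7)       # inherited tools now draw from the PRNG
```

Only seed() touches the OS; every later call is pure computation,
which is where the speed advantage over repeated os.urandom calls
would come from.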

 [....]
> 
> In favor of a string of bytes:
> 
> 1) This form is handy for cryptoweenies to xor with other byte strings
> (perhaps for a one-time pad).


2) By returning the result of the OS's random function directly, we
   make it easier for cryptoweenies to assure themselves that their
   entropy is good.  If it gets massaged into a long, then skeptical
   cryptoweenies will have that much more code to audit.

3) Most crypto libraries don't currently support keying from longs,
   and (as noted in other posts) the long->string conversion isn't as
   easy or clean as the string->long conversion.
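
To illustrate the asymmetry in point 3, here is a minimal sketch of
both directions.  The helper names bytes_to_long and long_to_bytes are
hypothetical, and int() is used on the assumption that it accepts
arbitrarily large values.  The string->long direction is one
expression; the reverse needs the intended length supplied from
outside, because leading zero bytes do not survive the trip through
an integer.

```python
from binascii import hexlify, unhexlify


def bytes_to_long(s):
    """Interpret a byte string as a big-endian unsigned integer."""
    return int(hexlify(s), 16)


def long_to_bytes(n, length):
    """Encode n as exactly `length` big-endian bytes.

    The caller must know the length: the integer alone cannot say how
    many leading zero bytes the original string had.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = "%x" % n
    if len(digits) > 2 * length:
        raise OverflowError("n does not fit in %d bytes" % length)
    # Pad to an even, fixed number of hex digits before decoding.
    return unhexlify(digits.zfill(2 * length))


key = b"\x00\x00\x01\xff"
n = bytes_to_long(key)           # the two leading zero bytes vanish here
restored = long_to_bytes(n, 4)   # recoverable only because we pass 4
```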

yrs,
-- 
Nick Mathewson
(PGP key changed on 15Aug2004; see http://wangafu.net/key.txt)

