[issue9025] Non-uniformity in randrange for large arguments.

Wed Jun 23 21:37:55 CEST 2010

Mark Dickinson <dickinsm at gmail.com> added the comment:

> * possibly providing a C version of rnd2()

If recoding in C is acceptable, I think there may be better ( = simpler and faster) ways than doing a direct translation of rnd2.

For example, for small k, the following algorithm for randrange(k) suffices:

 - take a single 32-bit deviate (generated using genrand_int32)
 - multiply by k (a 32-by-32 to 64-bit widening multiply) and return
   the high 32-bits of the result, provided that the bottom half
   of the product is <= 2**32 - k (almost always true, for small k).
 - consume extra random words as necessary in the case that the bottom
   half of the product is > 2**32 - k.

I can provide code (with that 3rd step fully expanded) if you're interested in this approach.

This is likely to be significantly faster than a direct translation of rnd32, since in the common case it requires only: one 32-bit deviate from MT, one integer multiplication, one subtraction, and one comparison.  By comparison, rnd2 uses (at least) two 32-bit deviates and massages them into a float, before doing arithmetic with that float.

Though it's possible (even probable) that any speed gain would be insignificant in comparison to the rest of the Python machinery involved in a single randrange call.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9025>
_______________________________________