Random and whrandom

Alex Martelli aleaxit at yahoo.com
Wed Jan 24 08:05:41 EST 2001


<cpsoct at my-deja.com> wrote in message news:94m9bf$nge$1 at nnrp1.deja.com...
> but to get back to my original question. I should import random and not
> whrandom?  But then i notice that seed is in whrandom? Hmmm. so i need
> to import both? And why are there 3 seed values? In C, there was just
> one. and, what then is rand? is that depreciated?

What is 'rand', I don't know: the Python docs don't mention it
at all.  If it ever existed, it's surely deprecated, and, maybe,
also depreciated (depends on how much time has passed, and what
the applicable inflation rate may be).

Whether in C, in Python, in Fortran, or in *whatever* language, ANY
implementation of the Wichmann-Hill algorithm will always use 3 (small:
15 bits each would suffice) integers as the generator's state.  That
has zilch to do with whatever language is used for the implementation:
it's a characteristic of the *algorithm*!
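
To make the three-integer state concrete, here is a minimal sketch of the
Wichmann-Hill recurrence (algorithm AS 183).  The class name WichmannHill
and its method name are mine, chosen purely for illustration; this is not
meant to reproduce whrandom's actual source.

```python
class WichmannHill:
    """Illustrative Wichmann-Hill (AS 183) generator; names are hypothetical."""

    def __init__(self, x=1, y=1, z=1):
        # The whole state is three small integers, one per component
        # generator; each should be nonzero and below its modulus.
        self.x, self.y, self.z = x, y, z

    def random(self):
        # Advance three independent multiplicative congruential generators.
        self.x = (171 * self.x) % 30269
        self.y = (172 * self.y) % 30307
        self.z = (170 * self.z) % 30323
        # Combine the three fractions into one float in [0.0, 1.0).
        return (self.x / 30269.0 + self.y / 30307.0 + self.z / 30323.0) % 1.0


gen = WichmannHill(1, 2, 3)
sample = [gen.random() for _ in range(5)]
```

Whatever the language, the three assignments to x, y and z are the
algorithm; nothing about them is specific to Python.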

The 'random' module exposes "the methods which are available for all
random number generators", as well as several specific distributions
such as Beta, Exponential, Gaussian, etc, which "are expected to become
part of the Random Number Generator interface in a future release";
said interface, quoth the docs, "will be enhanced in future releases
of Python".

whrandom is one specific implementation of the RNG interface (it
happens to be the only one that currently comes standard with
Python, and thus the one that module random uses internally, but,
again, that's a version-specific implementation detail).

Unfortunately, the RNG interface does not (yet?) support 'seeding'
(or the also-crucial concept of dumping and restoring the whole
state of a generator, e.g. for checkpoint/restart; this may be
more general than 'seeding', as a 'seed' is typically a single
integer value while a generator's state may be arbitrary; and
'seeding' does not necessarily support the RNG _giving out_ the
whole of its internal state, so that it can be restored exactly).  Or,
rather, this support is not spelled out *in the docs*.

Actually, module random *DOES* expose a function 'seed', which
takes any hashable Python value (an integer, for example, will do
just fine) and uses it to reseed the generator.  So, you don't
really need to worry about whrandom or other implementation
details (unless you need checkpoint/restart, like I do).
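
A small sketch of a reproducible run using nothing but module random.
It assumes a current Python, where random.seed accepts an integer
directly; the seed value 12345 is arbitrary.

```python
import random

random.seed(12345)                       # fix the starting point
first_run = [random.random() for _ in range(3)]

random.seed(12345)                       # same seed again...
second_run = [random.random() for _ in range(3)]

# ...so the second run replays the first one exactly.
runs_match = (first_run == second_run)
```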

You can use random's module-level functions directly, or make
as many independent generators as you need by instantiating
class random.generator; the latter is most conveniently done
by calling factory function random.new_generator, which takes
an (optional) seed argument (any hashable will do) and
returns the generator instance.  You may also call method
seed (again, taking an optional hashable argument; if missing,
re-seeds from current time) on such a generator instance.
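
A sketch of independent generators.  Note the assumption: it is written
against later Python releases, where the class is spelled random.Random
rather than random.generator / random.new_generator, so treat the names
below as the modern equivalents and not as the interface described above.

```python
import random

# Two generators with the same seed, each holding its own private state.
gen_a = random.Random(42)
gen_b = random.Random(42)
# A third, unrelated generator seeded from a string.
gen_c = random.Random("some other seed")

a_values = [gen_a.random() for _ in range(3)]

# Drawing from gen_c does not disturb gen_b...
gen_c.random()
# ...so gen_b still reproduces gen_a's sequence.
b_values = [gen_b.random() for _ in range(3)]

# Calling seed() on an instance rewinds it to the start of its sequence.
gen_a.seed(42)
replay = [gen_a.random() for _ in range(3)]
```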

The only truly important feature missing from this reasonably
abstract interface, as I said, is exactly a documented
checkpoint/restart facility.


> i want to get random numbers, but seeded ones (so i can duplicate my
> run). Frankly i don't understand why python has two random modules in
> its standard library and how these namespaces might overlap.

Python namespaces don't "overlap" -- just avoid "from module import name"
and there can be no name conflicts between separate modules!
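
A tiny illustration of the point, using math and cmath as stand-ins (my
choice, since both happen to define a function with the same name); the
same reasoning applies to any pair of modules.

```python
import math
import cmath

# Both modules define sqrt; the qualified names keep them apart.
real_root = math.sqrt(4)         # the real-valued one
complex_root = cmath.sqrt(-1)    # the complex-valued one

# By contrast, "from math import sqrt" followed by "from cmath import sqrt"
# would leave only the second sqrt reachable under the bare name.
```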

It seems reasonable to me that Python exposes both an _abstract_
random number generator (which will meet the needs of 99.44% of
Python users) AND specific (concrete) RNG implementations for
users with highly peculiar needs (the remaining 0.56% of us).

For example, Python users who are calling random.shuffle on a
sequence of substantial length have a need for an underlying
random generator with a *HUGE* period, as a necessary although
not sufficient condition towards ensuring that all permutations
of the shuffled sequence get generated with equal likelihood.

Wichmann-Hill has a period of less than 7,000 billion (7e12),
insufficient to account even for the permutations of a sequence
whose length is just 16 (16! > 2e13).  A Mersenne Twister, with
its period of 2**19937 - 1 (about 4.3e6001), could thus be very attractive for
users of random-*shuffling* (while maybe not offering much added
value to other users of random-*generation*).  Just for example...
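
The arithmetic for the Wichmann-Hill case can be checked with a few
lines.  This sketch assumes the three standard moduli of the algorithm
and that each component generator attains its full period of m - 1; the
helper names lcm and longest_coverable are mine.

```python
import math

# Moduli of the three Wichmann-Hill component generators.
MODULI = (30269, 30307, 30323)


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // math.gcd(a, b)


# Each component cycles with period m - 1; the combined generator
# repeats after the least common multiple of those three periods.
period = 1
for m in MODULI:
    period = lcm(period, m - 1)


def longest_coverable(period):
    """Largest n such that n! <= period (hypothetical helper)."""
    n = 1
    while math.factorial(n + 1) <= period:
        n += 1
    return n


max_len = longest_coverable(period)
```

A generator can visit at most `period` distinct states, so it cannot
produce more than that many distinct shuffles; once n! exceeds the
period, some permutations of an n-item sequence are unreachable.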

The situation might be clearer if Python had MANY random
number generation modules in its standard library -- one
as the abstract/high-level thingy, and the other N-1 as
specific implementations for specific needs.  I understand
that the Python core developers don't want to burden the
standard library that way, though; after all, people who DO
have special needs regarding random-number generators are
likely to be able to understand and remedy the situation
anyway (although perhaps the specific issue of *shuffling*
is not so fortunate; but I guess that most Pythonistas who
_are_ random-shuffling sequences probably don't have very
high requirements for 'quality of randomness', so this might
be a purely theoretical issue for [by far] most such users).

I personally consider the checkpoint/restart issue a more
serious one, since it IS (IMHO) frequent for a program
that is doing simulations using RNGs to be a very long
running one, in need of reasonably easy checkpointing; ANY
component used in such a program should make it easy to
persist (and later, if need be, depersist) its state (the
'Memento' design pattern, in Gang-of-Four terms, being
quite applicable here -- as long as the Memento object is
itself easily persistable:-).
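
For what it's worth, here is what such a checkpoint/restart could look
like.  The sketch assumes a later Python release in which generator
objects offer getstate and setstate methods (not something the docs
discussed above describe), and it uses pickle to persist the Memento.

```python
import pickle
import random

gen = random.Random(2001)
for _ in range(1000):            # stand-in for a long-running simulation
    gen.random()

# Checkpoint: the state object is the Memento; pickle makes it persistable
# (the bytes could just as well be written to a file).
checkpoint = pickle.dumps(gen.getstate())

# What the original run goes on to produce after the checkpoint.
expected = [gen.random() for _ in range(5)]

# Restart: a fresh generator restored from the saved bytes.
restored = random.Random()
restored.setstate(pickle.loads(checkpoint))
resumed = [restored.random() for _ in range(5)]
```

The restored generator continues exactly where the original was when the
checkpoint was taken, which is more than reseeding alone can promise.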


> It would be great if the random whrandom poop could be cleaned up in
> future versions of python. One clear module.

ONE module, I think, would not be enough -- the distinction
between abstract and concrete IS important enough, after all,
to be worth a little bit of effort.  Clarity will be enhanced
when the Python docs DO document all of the random module's
functionality (and even more when said functionality is
better presented, as methods rather than just as module
level functions -- a "todo" regarding this IS in the sources).

Ideally, too, whrandom should be renamed to _whrandom, the
leading underscore indicating it's meant for internal use
and not of direct user interest; before this 'deprecation'
is done, though, I dearly hope checkpoint/restart needs
will be considered in the abstract RNG interface.


Alex





