Random string of digits?

Sun Dec 25 22:00:08 EST 2011

On Sun, 25 Dec 2011 12:41:29 -0500, Roy Smith wrote:

> On Mon, 26 Dec 2011 03:11:56 +1100, Chris Angelico wrote:
>> > I prefer not to rely on the source. That tells me what happens, not
>> > what's guaranteed to happen.
> 
> Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:
>> In this case, the source explicitly tells you that the API includes
>> support for arbitrary large ranges if you include a getrandbits()
>> method:
>> 
>>      Optionally, implement a getrandbits() method so that randrange()
>>      can cover arbitrarily large ranges.
>> 
>> I call that a pretty strong guarantee.
> 
> I think you mis-understood Chris's point.

And I'm afraid that you have missed my point. The above comment from the 
source is from a docstring: it *is* public, official documentation. See 
help(random.Random).
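
For anyone who wants to check this without opening random.py, here is a small sketch. It assumes a CPython build that keeps docstrings (that is, not run with -OO); the variable names are mine, purely for illustration.

```python
import inspect
import random

# inspect.getdoc() returns the cleaned-up class docstring, the same text
# that help(random.Random) shows at the top of its output. It returns None
# if docstrings were stripped, hence the fallback to an empty string.
doc = inspect.getdoc(random.Random) or ""

# True when the public docstring talks about getrandbits().
mentions_getrandbits = "getrandbits" in doc
print(mentions_getrandbits)
```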

> The documentation is the
> specification of how something behaves.  If the documentation doesn't
> say it, you can't rely on it.

A nice platitude, but not true. Documentation is often incomplete or even 
inaccurate. We rely on many things that aren't documented anywhere. For 
example, we can rely on the fact that 

x = 3917
print(x+1)

will print 3918, even though that specific fact isn't documented 
anywhere. Nevertheless, we can absolutely bank on it -- if it happened to 
do something else, we would report it as a bug and not expect to be told 
"implementation detail, will not fix". We make a number of undocumented 
assumptions:

* we assume that when the documentation talks about "adding" two 
  numbers, it means the standard mathematical definition of addition 
  and not some other meaning;

* we assume that the result of such addition must be *correct*, 
  without that assumption being guaranteed anywhere;

* we assume that addition of two ints will return an int, as opposed
  to some other numerically equal value such as a float or Fraction;

* we assume that not only will it be an int, but it will be *exactly*
  an int, and not a subclass of int;

and no doubt there are others.
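
The assumptions listed above can at least be written down as executable checks. This is only a sketch of what one interpreter happens to do today, not a statement of what the language reference promises.

```python
from fractions import Fraction

x = 3917
result = x + 1

# The result is mathematically correct.
assert result == 3918

# It is an int, not merely something numerically equal to one.
assert result == 3918.0 and type(result) is not float
assert result == Fraction(3918, 1) and type(result) is not Fraction

# It is exactly an int, not a subclass. bool is a subclass of int, yet
# adding two bools still produces a plain int.
assert isinstance(True, int)
assert type(True + True) is int
assert type(result) is int
```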

And by the way, in case you think I'm being ridiculously pedantic, 
consider the first assumption listed above: the standard mathematical 
definition of addition. That assumption is violated for floats, where 
each result is rounded to the nearest representable binary value:

py> 0.7 + 0.1 == 0.8
False
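
A slightly longer sketch of the same point, comparing binary floats against exact rational arithmetic from the standard library. The variable names are just for illustration.

```python
import math
from fractions import Fraction

# Binary floats: neither 0.7 nor 0.1 is exactly representable, and the
# rounded sum does not land on the float nearest to 0.8.
total = 0.7 + 0.1
print(repr(total))
print(total == 0.8)              # False

# The two values are still very close, which is what isclose() tests.
print(math.isclose(total, 0.8))  # True

# Exact rationals follow the schoolbook definition of addition.
exact = Fraction(7, 10) + Fraction(1, 10)
print(exact == Fraction(8, 10))  # True
```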

It is rather amusing how much trust we put in documentation, when 
documentation can be changed just as easily as code. Just because the 
Fine Manual says that x+1 performs addition today doesn't mean that it 
will say the same thing tomorrow. So if you trust the language designers 
not to arbitrarily change the documentation, why not trust them not to 
arbitrarily change the code?

Hint: as far as I can tell, nowhere does Python promise that printing 
3918 will show those exact digits, but it's a pretty safe bet that Python 
won't suddenly change to displaying integers in balanced-ternary notation 
instead of decimal.

> The user should never have to read the
> source to know how to use a function, or what they can depend on.

Well, that's one school of thought. Another school of thought is that 
documentation and comments lie, and the only thing that you can trust is 
the code itself.

At the end of the day, the only thing that matters is the mindset of the 
developer(s) of the software. You must try to get into their head and 
determine whether they are the sort of people whose API promises can be 
believed.

> Now,
> I'm not saying that reading the source isn't useful for a deeper
> understanding, but it should be understood that any insights you glean
> from doing that are strictly implementation details.

I couldn't care less about the specific details of *how* getrandbits is 
used to put together an arbitrarily large random number. That "how" is 
what people mean when they talk about "mere implementation details".

But the fact that getrandbits is used by randrange (as opposed to how it 
is used) is not a mere implementation detail, but a public part of the 
interface.
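
To make the distinction concrete, here is a minimal sketch, assuming a reasonably recent CPython 3. CountingRandom is a hypothetical subclass of my own, not anything in the standard library. It observes *that* randrange calls getrandbits without caring *how* the bits are combined.

```python
import random


class CountingRandom(random.Random):
    """Hypothetical subclass that counts how often getrandbits() is called."""

    calls = 0  # class-level default; the first increment shadows it per instance

    def getrandbits(self, k):
        self.calls += 1
        # Delegate to the default generator for the actual bits.
        return super().getrandbits(k)


rng = CountingRandom(12345)
upper = 10 ** 50  # far beyond the 53 bits a single random() float can supply

value = rng.randrange(upper)

print(0 <= value < upper)  # True
print(rng.calls >= 1)      # True: randrange went through our getrandbits()
```

How many times getrandbits is called, and with what argument, is the part I would treat as an implementation detail.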

> If you're saying that there are guarantees made by the implementation of
> getrandbits() which are not documented, then one of two things are true:

The implementation of getrandbits is not documented at all. You would 
have to read the C source of the _random module to find out how the 
default pseudo-random number generator produces random bits. But note the 
leading underscore: _random is a private implementation detail.

However, the existence and use of random.Random.getrandbits are public, 
documented, and guaranteed.

-- 
Steven