tough-to-explain Python

Fri Jul 10 16:27:12 EDT 2009

Steven D'Aprano wrote:
> On Thu, 09 Jul 2009 23:07:34 -0400, Terry Reedy wrote:

>>
>> The is *not* a bug is Bentley program.

This is *not* a bug in Bentley's program.

> Wow. That's an impressive set of typos :)

3. Way beneath my usual standards ;-)

>> It is a bug in bad, buggy, insane
>> integer arithmetic implementations. low + high should either return the
>> right answer, as it will in Python, or raise an overflow error.
> 
> Is binary search *supposed* to raise an overflow error if given more than 
> 2**30 elements? 

No. it is supposed to work, and Bentley's program will work if lo and hi 
are actually integers, as his program presupposes, and not residue 
classes mislabeled 'int'.

> Is there something special about OverflowError that is "better" than 
> ArrayIndexOutOfBoundsException?

Wrong comparison. The actual alternative to OverflowError is a negative 
number as the sum of two positive numbers. If one claims that a and b 
are positive ints, then returning a negative is a bug. The index 
exception was a side effect, in Java, of using the negative as an index. 
In C, the side effect was a segmentation fault. Do you seriously 
question that OverflowError is better than seg faulting? A different 
program could go on to silently return a wrong answer. Perhaps it would 
say to go west instead of east, or to reduce a medical dosage instead of 
raising it.

Note that negative ints are legal indexes in Python, so that if any 
Python version ever had ints that wrapped instead of overflowing or 
returning a long, then the program would not necessarily stop.

> but there's no reason to imagine that the book will assume -- or even 
> state as a requirement for binary search -- that addition will never 
> overflow.

Bentley assumed that if (lo+hi)/2 were successfully calculated, then it 
would be a positive number between lo and hi, and therefore a legal index.

The dirty secret of computing is that some languages do not have integer 
types. They do not even have restricted-range integer types, which would 
noisily fail when they cannot perform an operation. They only have 
residue class types, which are something else, and which can give 
bizarre results if one mindlessly uses them as if they really were 
restricted-range integers. When one wants integer operations, every 
operation with two residue-class variables has a potential *silent* bug. 
Being relieved of the burden of constantly keeping this in mind is one 
reason I use Python.

Float types are usually restricted range types. A huge_float + 
huge_float may raise overflow, but never returns a negative float.

Bentley assumed that indexes hi and lo are integers. If one uses 
restricted range integers that are too large, then lo+hi could fail 
gracefully with an appropriate effor message. I would not consider that 
a bug, bug a limitation due to using limited-range numbers. If one uses 
residue classes instead of integers, and makes no adjustment, I consider 
it wrong to blame Bentley.

Terry Jan Reedy