int/long unification hides bugs

Rocco Moretti roccomoretti at hotpop.com
Tue Oct 26 14:50:01 EDT 2004


kartik wrote:
>>The question is how small is small? Less than 2**7? Less than 2**15? 
>>Less than 2**31? Less than 2**63? And what's the significance of powers 
>>of two? And what happens if you move from a 32 bit machine to a 64 bit 
>>one? (or a 1024 bit one in a hundred years time?)
> 
> Less than 2**31 most of the time, and hardly ever greater than 2**63
> - no matter if my machine is 32-bit, 64-bit or 1024-bit. The required
> range depends on the data you want to store in the variable, not on
> the hardware.

Yes. My point exactly. Very rarely will the platform limit reflect the 
algorithmic limit. If you want to limit the range of your numbers, you 
need to have knowledge of your particular use case - something that 
can't be done with a predefined language limit.

>> > PEP 237 says, "It will give new Python programmers [...] one less
>> > thing to learn [...]". I feel this is not as important as the
>> > quality of code a programmer writes once he does learn the language.
>>
>>The thing is, the int/long cutoff is arbitrary, determined solely by 
>>implementation detail. 
> 
> Agreed, but it need not be that way. Ints can be defined to be 32-bit
> (or 64-bit) on all architectures.

But again, even though consistent, the limit is still arbitrary. Which 
one will it be? How do we decide? If we're platform independent, why 
bother with hardware-based sizes anyway? Why not use a base-10 limit 
like 10**10? As mentioned above, the choice of limit depends on the 
particular algorithm, which can't be known by the language designers a 
priori.
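
For what it's worth, once int and long are unified the word size stops 
showing through entirely. A quick illustration (from a Python 2.3 
session here; the exact repr may vary with the version):

 >>> 2**100
1267650600228229401496703205376L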

>>A much better idea is the judicious use of assertions.
>>
>>assert x < 15000
>>
>>Not only does it protect you from runaway numbers, it also documents 
>>what the expected range is, resulting in a much better "quality of 
>>code".
> 
> Such an assertion must be placed before every assignment to the
> variable - and that's tedious. Moreover, it can give you a false sense
> of security when you think you have it wherever needed but you've
> forgotten it somewhere.

I was thinking of judicious use for local variables inside of a loop. 
But if you want something general, just subclass long (this is with 
Python 2.3.4):

 >>> class limint(long):
	def __add__(self, other):
		value = long.__add__(self, other)
		if value > 100:
			raise OverflowError
		return limint(value)
 >>> a = limint(10)
 >>> a
10L
 >>> b = a+90
 >>> b
100L
 >>> c = b+1

Traceback (most recent call last):
   File "<pyshell#24>", line 1, in -toplevel-
     c = b+1
   File "<pyshell#18>", line 5, in __add__
     raise OverflowError
OverflowError
 >>>

A bit crude, but it will get you started. If it's too slow, there is 
nothing stopping you from making a C extension module with the 
appropriate types.
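
Staying in Python, the same idea generalizes. Here's a rough sketch 
along those lines (untested; the class name, bound, and error message 
are invented for illustration):

class boundedint(long):
    # The bound is a class attribute, so different limits are just
    # different subclasses; 100 matches the limint example above.
    bound = 100

    def _check(self, value):
        if value > self.bound:
            raise OverflowError("result exceeds %d" % self.bound)
        return self.__class__(value)

    def __add__(self, other):
        return self._check(long.__add__(self, other))
    __radd__ = __add__

    def __sub__(self, other):
        return self._check(long.__sub__(self, other))

    def __mul__(self, other):
        return self._check(long.__mul__(self, other))
    __rmul__ = __mul__

Factoring the comparison out into _check means each additional 
operator is a one-liner.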

I think that one of the problems we're having in this conversation is 
that we are talking past each other. Nobody is denying that finding 
bugs is a good thing. It's just that, for the bugs which overflow 
checking catches, there are much better ways of discovering them. (I'm 
surprised no one has mentioned unit testing yet.)
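
A test pins the algorithmic range down explicitly instead of hoping a 
platform overflow trips over it. A minimal sketch with the unittest 
module (the function under test is made up for the example):

import unittest

def scale(x):
    # Hypothetical function under test - invented for illustration.
    return x * 3

class RangeTest(unittest.TestCase):
    def test_result_stays_in_expected_range(self):
        # 15000 is the algorithmic limit, as in the assert example
        # quoted above - not a hardware word size.
        for x in range(5000):
            self.failUnless(0 <= scale(x) < 15000)

if __name__ == '__main__':
    unittest.main()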

Any decision always involves a cost/benefit analysis. For int/long 
unification, the benefits have been pointed out by others, while your 
proposed costs are minor and can be ameliorated by other practices - 
practices which most here would argue are the better way of going 
about it in the first place.


