[Python-Dev] Bad interaction of __index__ and sequence repeat

Nick Coghlan ncoghlan at gmail.com
Fri Jul 28 17:29:19 CEST 2006


David Hopwood wrote:
> Armin Rigo wrote:
>> Hi,
>>
>> There is an oversight in the design of __index__() that only just
>> surfaced :-(  It is responsible for the following behavior, on a 32-bit
>> machine with >= 2GB of RAM:
>>
>>     >>> s = 'x' * (2**100)       # works!
>>     >>> len(s)
>>     2147483647
>>
>> This is because PySequence_Repeat(v, w) works by applying w.__index__ in
>> order to call v->sq_repeat.  However, __index__ is defined to clip the
>> result to fit in a Py_ssize_t.
> 
> Clipping the result sounds like it would *never* be a good idea. What was
> the rationale for that? It should throw an exception.

A simple demonstration of the clipping behaviour that works on machines with 
limited memory:

 >>> (2**100).__index__()
2147483647
 >>> (-2**100).__index__()
-2147483648

PEP 357 doesn't even mention the issue, and the comment on long_index in the 
code doesn't give a rationale - it just notes that the function clips the result.

Neither the PyNumber_AsIndex nor the __index__ documentation mentions the 
possibility of clipping, and there's no test case to verify this 
behaviour.

I'm inclined to call it a bug, too, but I've cc'ed Travis to see if he can 
shed some light on the question - the implementation of long_index explicitly 
suppresses the overflow error generated by _long_as_ssize_t, so the current 
behaviour appears to be deliberate.
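
To make the two options concrete, here is a rough pure-Python model of the difference between clipping and raising. It is only a sketch: the real logic lives in C, the names index_clipping, index_strict, SSIZE_MAX and SSIZE_MIN are made up for illustration, and the limits assume a 32-bit Py_ssize_t as in the examples above.

```python
# Illustrative model only; not the actual C implementation of long_index.
# Assumes a 32-bit Py_ssize_t, matching the interpreter output shown earlier.
SSIZE_MAX = 2**31 - 1
SSIZE_MIN = -2**31


def index_clipping(n):
    """Model of the current behaviour: silently clamp to the Py_ssize_t range."""
    return max(SSIZE_MIN, min(SSIZE_MAX, n))


def index_strict(n):
    """Model of the alternative: raise instead of clamping out-of-range values."""
    if not SSIZE_MIN <= n <= SSIZE_MAX:
        raise OverflowError("value does not fit in Py_ssize_t")
    return n


print(index_clipping(2**100))    # clamped to SSIZE_MAX
print(index_clipping(-2**100))   # clamped to SSIZE_MIN
try:
    index_strict(2**100)
except OverflowError as exc:
    print("OverflowError:", exc)
```

With the strict variant, 'x' * (2**100) would fail up front rather than quietly building a string of the clipped length.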

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

