Is there no compression support for large sized strings in Python?

Christopher Subich csubich.spam.block at spam.subich.block.com
Fri Dec 2 11:37:10 EST 2005


Fredrik Lundh wrote:
> Harald Karner wrote:

>>>python -c "print len('m' * ((2048*1024*1024)-1))"
>>
>>2147483647
> 
> 
> the string type uses the ob_size field to hold the string length, and
> ob_size is an integer:
> 
> $ more Include/object.h
>     ...
>     int ob_size; /* Number of items in variable part */
>     ...
> 
> anyone out there with an ILP64 system?

I have access to an Itanium system with a metric ton of memory.  I 
-think- that the installed Python is still only a 32-bit build, though 
(any easy way of checking?).  It's an old version of Python, but I'm not 
the sysadmin and "I want to play around with Python" isn't a good enough 
reason for an upgrade. :)
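
On the "easy way of checking" question, one possible approach (a sketch, not something from this thread) is to ask the struct module for the size of a C pointer. The helper name pointer_bits is made up for illustration, and sys.maxsize only exists on later interpreters (older ones exposed sys.maxint instead). A 64-bit pointer width would not by itself mean that string lengths are 64-bit, since the header quoted above declares ob_size as a plain int.

```python
import struct
import sys


def pointer_bits():
    """Return the width of a C pointer (void *) in bits for this interpreter."""
    # "P" is the struct format code for a native void * pointer.
    return struct.calcsize("P") * 8


# Hypothetical usage: report how the running interpreter was built.
print("pointer width:", pointer_bits(), "bits")
# sys.maxsize is an assumption about the interpreter version; it is absent
# on old releases such as the 2.2.3 build shown in this post.
print("sys.maxsize:", sys.maxsize)
```
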


Python 2.2.3 (#1, Nov 12 2004, 13:02:04)
[GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-42)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> str = 'm'*2047*1024*1024 + 'n'*2047*1024*1024
 >>> len(str)
-2097152

Yes, that's a negative length.  And I don't really care about rebinding 
str for this demo. :)
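
The negative length reported above is consistent with the true length being stored in a 32-bit signed field and wrapping around. The arithmetic can be reproduced with a small sketch; wrap_int32 is a hypothetical helper written for this illustration, and it assumes two's-complement wraparound, which is what the observed value suggests rather than something the C code guarantees.

```python
def wrap_int32(n):
    """Reduce n modulo 2**32 and reinterpret the result as a signed 32-bit int."""
    n &= 0xFFFFFFFF          # keep only the low 32 bits
    if n >= 0x80000000:      # top bit set: the value reads as negative
        n -= 0x100000000
    return n


# Two runs of 2047 MiB each, as in the session above.
true_length = 2 * 2047 * 1024 * 1024   # 4292870144 characters
print(wrap_int32(true_length))          # -2097152, the value len(str) reported
```

The gap between the true length and 2**32 is exactly 2 MiB (2097152), which is why the wrapped value comes out as -2097152.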

 >>> str[0]
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
IndexError: string index out of range
 >>> str[1]
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
IndexError: string index out of range
 >>> str[-1]
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
SystemError: error return without exception set
 >>> len(str[:])
-2097152
 >>> l = list(str)
 >>> len(l)
0
 >>> l
[]

The string is actually created -- top reports 4.0GB of memory usage.



