[New-bugs-announce] [issue7551] SystemError/MemoryError/OverflowErrors on encode() a unicode string
Andreas Jung
report at bugs.python.org
Sun Dec 20 17:39:52 CET 2009
New submission from Andreas Jung <ajung at users.sourceforge.net>:
We encountered some pretty bizarre behavior in Python 2.4.6 while encoding a 600MB unicode string
'data':
Python 2.4.6 (8GB RAM, 64 bit)
(Pdb) type(data)
<type 'unicode'>
(Pdb) len(data)
601794657
(Pdb) data2=data.encode('utf-8')
*** SystemError: Negative size passed to PyString_FromStringAndSize
Assuming that this has something to do with a 512MB limit:
(Pdb) data2=data[:512*1024*1024].encode('utf-8')
*** SystemError: Negative size passed to PyString_FromStringAndSize
Same bug... now with 256MB - 1 characters:
(Pdb) data2=data[:(256*1024*1024)-1].encode('utf-8')
OverflowError
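One possible reading of the first two results, offered as an assumption rather than a confirmed diagnosis: if the encoder sizes its output buffer as 4 bytes per character and holds that size in a signed 32-bit C int, the size wraps negative for both of the larger inputs. The helper `to_int32` below is a hypothetical name used only to illustrate the arithmetic; it is not part of Python.

```python
def to_int32(n):
    """Wrap n the way a signed 32-bit C int would (two's complement)."""
    n &= 0xFFFFFFFF
    if n >= 0x80000000:
        return n - 0x100000000
    return n

# Assumption: worst-case UTF-8 buffer size is 4 bytes per character.
BYTES_PER_CHAR = 4

# The three lengths tried in the Pdb session above.
for chars in (601794657, 512 * 1024 * 1024, 256 * 1024 * 1024 - 1):
    wrapped = to_int32(chars * BYTES_PER_CHAR)
    print("%d chars -> buffer size %d" % (chars, wrapped))
```

Under that assumption the first two sizes come out negative (-1887788668 and -2147483648), which would fit the "Negative size" message, while the third stays positive (1073741820), so this sketch does not account for the OverflowError.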
Cross-check on a different Linux box (4GB RAM, 4GB swap, 64 bit):
ajung at blackmoon:~> python2.4
Python 2.4.5 (#1, Jun 9 2008, 10:35:12)
[GCC 4.2.1 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> data = u'x'*601794657
>>> data2= data.encode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
MemoryError
Where is this different behavior coming from?
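Independent of the cause, one way to avoid a single huge allocation is to encode the string in slices. This is only a minimal sketch: `encode_in_chunks` and `chunk_chars` are hypothetical names, it has not been tried against the 2.4 builds above, and on a narrow unicode build a slice boundary could split a surrogate pair, which it does not handle.

```python
def encode_in_chunks(text, encoding="utf-8", chunk_chars=1 << 20):
    """Yield the encoded form of text one slice at a time.

    Each slice is encoded separately, so no single call has to
    allocate a buffer for the whole string.
    """
    for start in range(0, len(text), chunk_chars):
        yield text[start:start + chunk_chars].encode(encoding)
```

The pieces can be written straight to a file as they are produced (for example `for piece in encode_in_chunks(data): out.write(piece)`, with `out` being a file opened in binary mode), which avoids holding the full encoded result in memory at all.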
----------
messages: 96695
nosy: ajung
severity: normal
status: open
title: SystemError/MemoryError/OverflowErrors on encode() a unicode string
versions: Python 2.4
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue7551>
_______________________________________