[Python-Dev] Disabling string interning for null and single-char causes segfaults

Amaury Forgeot d'Arc amauryfa at gmail.com
Mon Mar 4 20:20:00 CET 2013


2013/3/4 Guido van Rossum <guido at python.org>

> > >>> x = u'\xe9'.encode('ascii', 'ignore')
> > >>> x == '', x is ''
> > (True, False)
>
> Code that relies on this is incorrect (the language doesn't guarantee
> interning) but nevertheless given the intention of the implementation,
> that behavior of encode() is also a bug.
>

The example above is obviously from Python 2.7; there is a similar example
with Python 3.2:
>>> x = b'\xe9\xe9'.decode('ascii', 'ignore')
>>> x == '', x is ''
(True, False)

...but this bug has been fixed in 3.3: PyUnicode_Resize() now always yields
the unicode_empty singleton for an empty result.
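
For code that must behave the same on 3.2 and 3.3, a minimal sketch of the
portable check might look like the following. The names `empty`,
`value_equal` and `same_object` are only illustrative, and the identity
result is deliberately left unasserted because it depends on the
interpreter version:

```python
# Decode bytes that contain no ASCII characters, dropping the invalid ones.
# The result is an empty str on every Python 3 version.
x = b'\xe9\xe9'.decode('ascii', 'ignore')

# Bind the empty string to a name so the identity check below compares two
# objects rather than a literal (illustrative name, not from the thread).
empty = ''

# Equality is guaranteed by the language: an empty str equals ''.
value_equal = (x == empty)

# Identity is an implementation detail: this may be True or False
# depending on whether the interpreter reuses its empty-string singleton.
same_object = (x is empty)

print(value_equal, same_object)
```

Only `value_equal` should be relied upon; `same_object` merely shows what a
given interpreter happens to do.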

-- 
Amaury Forgeot d'Arc