Blog "about python 3"

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Jan 4 21:41:20 EST 2014


wxjmfauth at gmail.com wrote:

> The very interesting aspect in the way you are holding
> unicodes (strings). By comparing Python 2 with Python 3.3,
> you are comparing utf-8 with the internal "representation"
> of Python 3.3 (the flexible string representation).

This is incorrect. Python 2 has never used UTF-8 internally for Unicode
strings. In narrow builds, it uses UTF-16, but makes no allowance for
surrogate pairs in strings. In wide builds, it uses UTF-32.

Other implementations, such as Jython or IronPython, may do something else.
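
To make the narrow/wide distinction concrete, here is a minimal sketch that
counts how many code units UTF-16 and UTF-32 need for a character inside and
outside the Basic Multilingual Plane. The helper names (utf16_code_units,
utf32_code_units) are my own, chosen for illustration, and the snippet only
inspects encoded bytes; it does not look at the interpreter's internal
buffers. It should run on Python 3.3+, and I would expect it to run on
Python 2.6+ as well because of the u"" prefixes.

```python
import sys


def utf16_code_units(text):
    """Number of 16-bit code units UTF-16 needs for text (hypothetical helper)."""
    return len(text.encode("utf-16-le")) // 2


def utf32_code_units(text):
    """Number of 32-bit code units UTF-32 needs for text (hypothetical helper)."""
    return len(text.encode("utf-32-le")) // 4


bmp_char = u"\u00e9"         # U+00E9, inside the Basic Multilingual Plane
astral_char = u"\U0001F600"  # U+1F600, outside the BMP

# UTF-16 needs a surrogate pair (two code units) for the astral character.
print(utf16_code_units(bmp_char), utf16_code_units(astral_char))  # 1 2

# UTF-32 always uses exactly one code unit per code point.
print(utf32_code_units(bmp_char), utf32_code_units(astral_char))  # 1 1

# On Python 2, sys.maxunicode is 0xFFFF on a narrow build and 0x10FFFF on a
# wide build. On Python 3.3 and later it is always 0x10FFFF.
print(hex(sys.maxunicode))
```

On a narrow Python 2 build, len() of the astral character reports 2, the
UTF-16 code unit count, which is what "no allowance for surrogate pairs"
means in practice. On a wide build, and on Python 3.3+, it reports 1.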


-- 
Steven



