Blog "about python 3"

Sun Jan 5 05:39:52 EST 2014

On Sunday, 5 January 2014 03:54:29 UTC+1, Chris Angelico wrote:
> On Sun, Jan 5, 2014 at 1:41 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
> > wxjmfauth at gmail.com wrote:
> >
> >> The very interesting aspect in the way you are holding
> >> unicodes (strings). By comparing Python 2 with Python 3.3,
> >> you are comparing utf-8 with the internal "representation"
> >> of Python 3.3 (the flexible string representation).
> >
> > This is incorrect. Python 2 has never used UTF-8 internally for Unicode
> > strings. In narrow builds, it uses UTF-16, but makes no allowance for
> > surrogate pairs in strings. In wide builds, it uses UTF-32.
>
> That's for Python's unicode type. What Robin said was that they were
> using either a byte string ("str") with UTF-8 data, or a Unicode
> string ("unicode") with character data. So jmf was right, except that
> it's not specifically to do with Py2 vs Py3.3.

Yes, the key point is the preparation of the "unicode
text" for the PDF producer.

It is at this level that the different flavours of Python
may be relevant.

I see four possibilities; I do not know which one
the PDF producer API is expecting.

- Py2 with utf-8 byte string (possibly utf-16, utf-32)
- Py2 with its internal unicode
- Py3.2 with its internal unicode
- Py3.3 with its internal unicode
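
As a rough illustration of the byte-string options in the first item, here
is a small sketch. The sample text and the helper name encoded_forms are
made up for this example, and nothing is assumed about what the PDF
producer API really accepts. It only shows that the same text gives
different byte sequences depending on the encoding chosen by the caller.

```python
# Sketch only: SAMPLE and encoded_forms are illustrative, not part of any PDF API.

SAMPLE = "caf\u00e9 \u20ac"  # 6 code points: c, a, f, e-acute, space, euro sign


def encoded_forms(text):
    """Return the byte strings a caller could hand over instead of a unicode string."""
    return {name: text.encode(name) for name in ("utf-8", "utf-16-le", "utf-32-le")}


forms = encoded_forms(SAMPLE)
for name, data in forms.items():
    print(name, len(data), "bytes")

# Whatever the encoding, decoding gives back the same code points.
assert all(data.decode(name) == SAMPLE for name, data in forms.items())
```

For the three "internal unicode" items, the caller passes the string object
itself, and how the interpreter stores it is not visible at this level.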

jmf