[Python-Dev] PEP 393 Summer of Code Project

"Martin v. Löwis" martin at v.loewis.de
Wed Aug 24 19:50:13 CEST 2011


>  > PEP 393 abolishes narrow builds as we now know them and changes
>  > semantics. I was answering a complaint about that change. If you do
>  > not like the PEP, fine.
> 
> No, I do like the PEP.  However, it is only a step, a rather
> conservative one in some ways, toward conformance to the Unicode
> character model.

I'd like to point out that the improved compatibility is only a side
effect, not the primary objective of the PEP. The primary objective
is the reduction in memory usage. (Any changes in runtime are also
side effects, and it's not really clear yet whether you get speedups
or slowdowns on average, or no effect.)
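
To make the memory point concrete, here is a rough sketch of how one
might observe the per-character storage cost on a CPython that
implements PEP 393 (3.3 or later). It assumes sys.getsizeof() reflects
the internal string buffer, which is a CPython detail and not a
language guarantee; the helper name bytes_per_char is made up for
illustration.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate the storage cost per character for a string made of `ch`.

    Subtracting the sizes of two strings of different lengths cancels
    the fixed object header, leaving only the character buffer.
    Assumption: sys.getsizeof() accounts for the buffer (true on CPython).
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n

samples = [
    ("ASCII", "a"),
    ("Latin-1", "\xe9"),
    ("BMP", "\u20ac"),
    ("non-BMP", "\U0001F600"),
]

for label, ch in samples:
    print(label, bytes_per_char(ch))
```

Under PEP 393 the width is picked per string from its largest code
point, so a pure-ASCII string should cost less per character than one
containing a single non-BMP character. Other implementations may
report something entirely different.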

>  > Given that 3.0 unicode (string) objects are defined as Unicode character 
>  > strings, I do not see the opposition.
> 
> I think they're not, I think they're defined as Unicode code unit
> arrays, and that the documentation is in error.

That's just a description of the implementation, and not part of the
language, though. My understanding is that the "abstract Python language
definition" considers this aspect implementation-defined: PyPy,
Jython, IronPython etc. would be free to do things differently
(and I understand that there are plans for PEP 393-style Unicode
objects in PyPy).

> Martin has long claimed that the fact that I/O is done in terms of
> UTF-16 means that the internal representation is UTF-16, so I could be
> wrong.  But when issues of slicing, len() values and so on have come
> up in the past, Guido has always said "no, there will be no change in
> semantics of builtins here".

Not in those words, though. As I recall, it was closer to (again
paraphrasing) "len() will stay O(1) forever, regardless of
any perceived incorrectness of this choice". What he rejects is any
attempt to change the builtins in a way that raises their complexity
for the sake of correctness. I think PEP 393 balances this well: it
keeps the O(1) operations O(1) while improving the cross-platform
"correctness" of these functions.
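
As a small illustration of the cross-platform point, the sketch below
compares what len() and indexing report on a PEP 393 build with the
number of UTF-16 code units the same text needs. The code-unit count is
what a narrow build would have reported from len(). The variable names
are only for illustration.

```python
# One non-BMP character between two ASCII letters.
s = "a\U0001F600b"

# With PEP 393, len() counts code points on every platform.
code_points = len(s)

# A narrow (UTF-16) build stored the non-BMP character as a surrogate
# pair, so its len() matched the number of UTF-16 code units.
utf16_units = len(s.encode("utf-16-le")) // 2

# Indexing returns the whole character, not half of a surrogate pair.
middle = s[1]

print(code_points, utf16_units, middle == "\U0001F600")
```

Both len() and indexing remain constant-time here, because each string
uses a fixed width per code point.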

Regards,
Martin

