What encoding does u'...' syntax use?
"Martin v. Löwis"
martin at v.loewis.de
Sat Feb 21 15:10:30 EST 2009
>> I'm pretty much sure it is UCS-2 or UCS-4. (Yes, I know there is only a
>> slight difference to UTF-16/UTF-32).
>
> I wouldn't call the difference that slight, especially between UTF-16
> and UCS-2, since the former can encode all Unicode code points, while
> the latter can only encode those in the BMP.
Indeed. As Python *can* represent all characters even in 2-byte mode,
by using surrogate pairs (since PEP 261), it seems clear that Python's
Unicode representation is *not* strictly UCS-2 anymore.
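
To make the difference concrete, here is a minimal sketch of how a code
point outside the BMP is split into two 16-bit units, which UTF-16 allows
and UCS-2 does not. The helper name to_surrogate_pair is made up for
illustration; it is not part of Python or of PEP 261.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into a UTF-16 surrogate pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000          # 20 bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


# U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP.
high, low = to_surrogate_pair(0x1D11E)
print(hex(high), hex(low))                 # 0xd834 0xdd1e

# The UTF-16 codec produces the same two 16-bit units.
encoded = u"\U0001D11E".encode("utf-16-be")
print(encoded == b"\xd8\x34\xdd\x1e")      # True
```

In a 2-byte build, u"\U0001D11E" is stored as exactly these two units, so
len() reports 2 for it; a strict UCS-2 implementation could not represent
the character at all.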
Regards,
Martin