New Features in Python 1.6

Philip 'Yes, that's my address' Newton nospam.newton at gmx.li
Tue Apr 4 15:37:35 EDT 2000


On Tue, 4 Apr 2000 14:30:06 -0400, "Terry Reedy" <tjreedy at udel.edu>
wrote:

>> Python strings can now be stored as Unicode strings.  To make it easier
>> to type Unicode strings, the single-quote character defaults to creating
>> a Unicode string, while the double-quote character defaults to ASCII
>> strings.
>
>' = 1 byte/char, " = 2 bytes/ char is more straightforward.

Yes, but this depends on what you mean by "Unicode". You have to
represent those Unicode characters in bytes somehow, and how many bytes
you need depends on the encoding. Also, full Unicode is wider than 16
bits (the Basic Multilingual Plane covers only the first 65,536 code
points). UTF-16 is two bytes per character (but four for characters
outside the BMP, which are encoded as surrogate pairs), UTF-8 is a
*variable* number of bytes (1 to 4 as currently defined; the original
definition allowed up to 6), etc. So it's not always 2 bytes per
Unicode character.
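
To make the point concrete, here is a small sketch that counts the bytes
one character takes in a few encodings. It assumes a modern Python 3
interpreter (not the 1.6 release under discussion), and the helper name
encoded_sizes and the sample characters are my own, picked only for
illustration. The "-be" codec variants are used so that no byte order
mark is added to the count.

```python
# Sample characters: chosen for illustration only (an assumption, not from the thread).
samples = {
    "A": "ASCII letter",
    "\u00e9": "e with acute accent",
    "\u20ac": "euro sign, still inside the BMP",
    "\U0001d11e": "musical G clef, outside the BMP",
}


def encoded_sizes(ch):
    """Return the byte length of ch in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


for ch, description in samples.items():
    # Print the code point, a description, and the per-encoding byte counts.
    print(f"U+{ord(ch):04X} {description}: {encoded_sizes(ch)}")
```

Only the characters inside the BMP come out at two bytes in UTF-16; the
G clef needs four there, and the UTF-8 sizes vary from one character to
the next.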

Cheers,
Philip
-- 
Philip Newton <nospam.newton at gmx.li>


