Trouble saving unicode text to file

"Martin v. Löwis" martin at v.loewis.de
Wed May 11 01:47:17 EDT 2005


Thomas Bellman wrote:
> Fixed-width characters *do* have advantages, even in the external
> representation.  With fixed-width characters you don't have to
> parse the entire file or stream in order to read the Nth character;
> instead you can skip or seek to an octet position that can be
> calculated directly from N.
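
To make the seeking argument concrete, here is a minimal sketch, assuming Python 3 and a stream encoded as UTF-32 little-endian without a byte order mark. The helper name nth_char_utf32 and the sample text are made up for illustration.

```python
import io

# Sample text with non-ASCII characters (hypothetical example data).
text = "naïve Löwis ☃ text"

# UTF-32-LE: every code point occupies exactly 4 octets, no BOM is written.
data = text.encode("utf-32-le")


def nth_char_utf32(stream, n):
    """Return the n-th code point (0-based) by seeking directly to octet 4*n."""
    stream.seek(4 * n)
    return stream.read(4).decode("utf-32-le")


# Random access: no need to parse the preceding characters.
seventh = nth_char_utf32(io.BytesIO(data), 7)

# With UTF-8 the same arithmetic fails, because "ï" takes two octets:
# octet 7 holds "L", not the character at index 7.
utf8_octet_7 = text.encode("utf-8")[7:8]
```

Here seventh equals text[7], while utf8_octet_7 is a different character, since the octet offset in UTF-8 depends on everything that comes before it.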

OTOH, encodings that are free of null bytes and ASCII compatible
also have advantages.
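
A small sketch of both properties, assuming Python 3 and its standard codecs; the sample strings are arbitrary.

```python
# Arbitrary sample text that contains no U+0000 character.
sample = "Löwis"

utf8 = sample.encode("utf-8")
utf16 = sample.encode("utf-16-le")
utf32 = sample.encode("utf-32-le")

# UTF-8 only produces a null byte for U+0000 itself, so this text has none.
# UTF-16 and UTF-32 pad most characters with zero octets.
null_free = {
    "utf-8": b"\x00" not in utf8,
    "utf-16-le": b"\x00" not in utf16,
    "utf-32-le": b"\x00" not in utf32,
}

# ASCII compatibility: pure ASCII text is byte-for-byte identical in UTF-8,
# so tools that only understand ASCII can still read it.
ascii_text = "plain ASCII"
same_as_ascii = {
    "utf-8": ascii_text.encode("utf-8") == ascii_text.encode("ascii"),
    "utf-16-le": ascii_text.encode("utf-16-le") == ascii_text.encode("ascii"),
    "utf-32-le": ascii_text.encode("utf-32-le") == ascii_text.encode("ascii"),
}
```

Only the "utf-8" entry is True in both dictionaries, which matters for C-style string handling (null-terminated buffers) and for protocols that assume ASCII.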

> And not the least, UTF-32 is *beautiful* compared to UTF-16.

But ugly compared to UTF-8. Not only does it have the null byte
and the ASCII incompatibility problem, but it also has the
endianness problem. So for exchanging Unicode between systems,
I can see no reason to use anything but UTF-8 (unless, of course,
one end, or the protocol, already dictates a different encoding).
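
The endianness problem can be seen in a short sketch, again assuming Python 3; the character chosen is arbitrary.

```python
ch = "ö"  # U+00F6

# UTF-16 and UTF-32 each have two possible byte orders for the same character.
utf32_le = ch.encode("utf-32-le")
utf32_be = ch.encode("utf-32-be")
utf16_le = ch.encode("utf-16-le")
utf16_be = ch.encode("utf-16-be")

# UTF-8 is defined as a sequence of octets, so there is only one form.
utf8 = ch.encode("utf-8")

# If the receiver guesses the wrong byte order, decoding may still succeed
# but yield a different character instead of raising an error.
misread = utf16_le.decode("utf-16-be")
```

The little-endian and big-endian forms differ, and misread is not "ö", so sender and receiver must agree on the byte order (or rely on a byte order mark). UTF-8 needs no such agreement.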

Regards,
Martin


