Unicode/ascii encoding nightmare

John Machin sjmachin at lexicon.net
Mon Nov 6 15:56:04 EST 2006


Robert Kern wrote:

> However, I don't know of an encoding that takes u"fødselsdag" to
> 'f\xc3\x83\xc2\xb8dselsdag'.

There isn't one.

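C3 and C2 hint at UTF-8: both are lead bytes of two-byte UTF-8 sequences.
The fact that C3 and C2 are both present, plus the fact that one
non-ASCII character has morphoploded into 4 bytes, indicates a double
whammy: the text was encoded as UTF-8, the resulting bytes were misread
as a single-byte encoding such as Latin-1, and the result was encoded as
UTF-8 again.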
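
For what it's worth, here is a minimal sketch of how such a double whammy
can arise and be undone. It is written for Python 3, and it assumes the
intermediate misreading was Latin-1; the original poster's actual codec is
not known, and the variable names are only illustrative.

```python
# Assumption: the bytes were misread as Latin-1 between the two UTF-8 encodes.
text = "f\u00f8dselsdag"              # u"fødselsdag"

once = text.encode("utf-8")           # the ø becomes the two bytes C3 B8
misread = once.decode("latin-1")      # each byte is wrongly taken as one character
twice = misread.encode("utf-8")       # each of those two characters becomes 2 bytes

print(repr(twice))

# Undo the damage by reversing the last two steps, then decode properly.
repaired = twice.decode("utf-8").encode("latin-1").decode("utf-8")
print(repaired == text)
```

The repair only works if the guess about the intermediate codec is right,
so check it against a known string before applying it to real data.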

Cheers,
John



