Unicode/ASCII encoding nightmare

Mon Nov 6 18:07:57 EST 2006

Andrea Griffini wrote:
> John Machin wrote:
>
> > The fact that C3 and C2 are both present, plus the fact that one
> > non-ASCII byte has morphoploded into 4 bytes indicate a double whammy.
>
> Indeed...
>
>  >>> x = u"fødselsdag"
>  >>> x.encode('utf-8').decode('iso-8859-1').encode('utf-8')
> 'f\xc3\x83\xc2\xb8dselsdag'
>
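
For anyone replaying this on Python 3 (the session quoted above is Python 2), here is a minimal sketch of the same double encoding and one way to reverse it. The helper names double_encode and undo_double_encode are made up for illustration. The reversal assumes the damage was exactly "UTF-8 bytes misread as ISO-8859-1, then encoded as UTF-8 again"; other mix-ups (cp1252, for instance) would need different handling.

```python
# Python 3 sketch; helper names are hypothetical, not from the thread.
original = "f\u00f8dselsdag"  # "fødselsdag"


def double_encode(text):
    """Encode as UTF-8, misread the bytes as ISO-8859-1, encode as UTF-8 again."""
    return text.encode("utf-8").decode("iso-8859-1").encode("utf-8")


def undo_double_encode(data):
    """Reverse double_encode; only valid for that exact UTF-8/ISO-8859-1 mix-up."""
    return data.decode("utf-8").encode("iso-8859-1").decode("utf-8")


mangled = double_encode(original)
print(mangled)                      # the C3 83 C2 B8 pattern discussed above
print(undo_double_encode(mangled))  # back to the original text
```

The single character U+00F8 is one byte (F8) in ISO-8859-1 and two bytes (C3 B8) in UTF-8. Misreading those two bytes as ISO-8859-1 yields two characters, each of which takes two bytes when encoded as UTF-8 again, hence the four bytes C3 83 C2 B8.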

Indeed yourself. Have you ever considered reading posts in
chronological order, or reading all posts in a thread? It might help
you avoid writing posts with zero information content.

Cheers,
John