John Machin wrote: > The fact that C3 and C2 are both present, plus the fact that one > non-ASCII byte has morphoploded into 4 bytes indicate a double whammy. Indeed... >>> x = u"fødselsdag" >>> x.encode('utf-8').decode('iso-8859-1').encode('utf-8') 'f\xc3\x83\xc2\xb8dselsdag' Andrea