encoding problems (é and è)

John Machin sjmachin at lexicon.net
Thu Mar 23 17:33:19 EST 2006


On 24/03/2006 8:36 AM, Peter Otten wrote:
> John Machin wrote:
> 
>>You can replace ALL of this upshifting and accent removal in one blow by
>>using the string translate() method with a suitable table.
> 
> Only if you convert to unicode first or if your data maintains 1 byte == 1
> character, in particular it is not UTF-8. 
> 

I'm sorry, I forgot that there were people who are unaware that 
variable-length gizmos like UTF-8 and various legacy CJK encodings are 
for storage & transmission, and are better changed to a 
one-character-per-storage-unit representation before *ANY* data 
processing is attempted.
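
A minimal sketch of that decode-first approach, assuming Python 3 (where str is already Unicode) and assuming the incoming bytes are UTF-8. The sample data, the name ACCENT_TABLE, and the helper upshift_plain are made up for illustration, and the table covers only a handful of French letters, not every accented character.

```python
# Hypothetical sample data: UTF-8 bytes as they might arrive from a file.
raw = "é and è".encode("utf-8")

# Decode first, so that each accented letter is exactly one character.
# As bytes, é and è take two bytes each, so a byte-wise table would fail.
text = raw.decode("utf-8")

# Illustrative table only: maps a few accented lower-case letters to
# unaccented upper-case letters. Extend it to suit your own data.
ACCENT_TABLE = str.maketrans("éèêëàâçùûîïô", "EEEEAACUUIIO")

def upshift_plain(s):
    """Strip the accents listed in ACCENT_TABLE, then upshift the rest."""
    return s.translate(ACCENT_TABLE).upper()

result = upshift_plain(text)
print(len(raw), len(text), result)
```

The translate() call handles the accented letters in one pass, and upper() then takes care of the plain ASCII ones. If the data is not really UTF-8, the decode step is where you would substitute the actual encoding.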

:-)
Unicode? I'm just a benighted Anglo from the a**-end of the globe; who 
am I to be preaching Unicode to a European?
(-:


