[OT] does the charset lie?
Skip Montanaro
skip at pobox.com
Sun May 2 14:52:59 EDT 2004
>> data = unicode(data, "iso-8859-1").encode("utf-8")
>> data = map_entities_to_utf_8(data)
>> data = unicode(data, "utf-8")
David> Or, even simpler, skip the intermediate step:
David> data = unicode(data, "iso-8859-1")
David> data = map_entities_to_unicode(data)
David> map_entities_to_unicode() could use htmlentitydefs.name2codepoint
David> from the stdlib.
Thanks, I always forget there's an htmlentitydefs module.
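
For reference, a minimal sketch of what a map_entities_to_unicode() helper along the lines David describes might look like. Only the function name and the use of htmlentitydefs.name2codepoint come from the thread. The regular expression, the handling of numeric references like &#233;, the unichr_ alias, and the fallback import for Python 3 (where the table lives in html.entities) are all assumptions added for illustration.

```python
import re

try:
    # Python 2 names, as used in the thread.
    from htmlentitydefs import name2codepoint
    unichr_ = unichr
except ImportError:
    # Assumption: on Python 3 the same table is in html.entities.
    from html.entities import name2codepoint
    unichr_ = chr

# Matches &name;, &#123; and &#x7b; style references (hypothetical pattern).
_entity_re = re.compile(r"&(#[xX]?[0-9a-fA-F]+|\w+);")


def map_entities_to_unicode(text):
    """Replace character references in an already-decoded unicode string."""
    def replace(match):
        ref = match.group(1)
        try:
            if ref[:2] in ("#x", "#X"):
                return unichr_(int(ref[2:], 16))   # hexadecimal reference
            if ref.startswith("#"):
                return unichr_(int(ref[1:]))       # decimal reference
            return unichr_(name2codepoint[ref])    # named entity
        except (KeyError, ValueError, OverflowError):
            # Unknown or malformed reference: leave the text untouched.
            return match.group(0)
    return _entity_re.sub(replace, text)
```

The input should be decoded first, as in David's version, e.g. map_entities_to_unicode(unicode(data, "iso-8859-1")) on Python 2. Unknown references such as &bogus; are passed through unchanged rather than raising an error.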
Skip