[Baypiggies] Handling unwanted Unicode \u2019 characters in XML
Terry Carroll
carroll at tjc.com
Wed Jul 2 02:30:38 CEST 2008
Sorry, meant to send this to the list....
On Tue, 1 Jul 2008, Stephen McInerney wrote:
> Check that URL again: string.translate() IS deprecated, but
> string.maketrans() is not. unicode.translate() is not deprecated.
But can you actually set up the translate table?
>>> import string
>>> trantab = string.maketrans(u"\u2019", u"'")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 0: ordinal not in range(128)
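For what it's worth, string.maketrans() only builds 256-byte tables for byte strings, which is why it chokes on a unicode argument. A possible way around it, sketched here on the assumption that the text is already a unicode object, is to skip maketrans() entirely and hand unicode.translate() a dict keyed by integer code point. The names `replacements` and `asciify_quotes` are made up for illustration.

```python
# Sketch only: translate() on a unicode string accepts a dict that maps
# integer code points to replacement text, so no maketrans() table is needed.
# `replacements` and `asciify_quotes` are hypothetical names.
replacements = {
    0x2019: u"'",  # RIGHT SINGLE QUOTATION MARK -> ASCII apostrophe
}

def asciify_quotes(text):
    """Return text with U+2019 replaced by an ASCII apostrophe."""
    return text.translate(replacements)

cleaned = asciify_quotes(u"It\u2019s Terry\u2019s list")
```

Characters that are not keys in the dict pass through unchanged, so the mapping only needs entries for the characters to be replaced.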
I also note that the docs for the translate() string method suggest:
    Note, a more flexible approach is to create a custom character mapping
    codec using the codecs module (see encodings.cp1251 for an example).
But reading the codecs docs raised more questions for me than it
answered; it certainly isn't as straightforward as the ASCII translation
was.
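A lighter-weight option than a full character-mapping codec might be an encode error handler registered with codecs.register_error(). To be clear, this is a different technique from the custom codec the docs note describes, and it is only a sketch: the handler name "asciiquotes", the `FALLBACKS` table, and the choice of "?" for unknown characters are all assumptions made for illustration.

```python
import codecs

# Hypothetical fallback table: code point -> ASCII replacement text.
FALLBACKS = {0x2019: u"'"}

def ascii_quotes_handler(exc):
    """Encode error handler: substitute known characters, else use '?'."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # The slice covers the run of characters the codec could not encode.
    bad = exc.object[exc.start:exc.end]
    fixed = u"".join(FALLBACKS.get(ord(ch), u"?") for ch in bad)
    # Return the replacement text and the position to resume encoding at.
    return (fixed, exc.end)

# "asciiquotes" is an arbitrary name chosen for this sketch.
codecs.register_error("asciiquotes", ascii_quotes_handler)

encoded = u"It\u2019s here".encode("ascii", "asciiquotes")
```

The handler is only consulted for characters the ASCII codec rejects, so plain ASCII text is encoded as usual.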