Trouble fixing a broken ASCII string - "replace" mode in codec not working.
John Nagle
nagle at animats.com
Tue Feb 6 15:24:17 EST 2007
I'm trying to clean up a bad ASCII string, one read from a
web page that is supposedly in the ASCII character set but has some
characters above 127. And I get this:
  File "D:\projects\sitetruth\InfoSitePage.py", line 285, in httpfetch
    sitetext = sitetext.encode('ascii','replace') # force to clean ASCII
UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 29151: ordinal not in range(128)
Why is that exception being raised when the codec was told 'replace'?
(And no, just converting it to Unicode with "sitetext = unicode(sitetext)"
won't work either; that correctly raises a Unicode conversion exception.)
[Python 2.4, Win32]
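A likely explanation, offered as a sketch rather than a confirmed diagnosis: in Python 2, calling encode() on a byte string (str) first decodes it implicitly to Unicode with the default 'ascii' codec in strict mode, and the 'replace' argument applies only to the later encode step. That would explain why a UnicodeDecodeError comes out of an encode() call. Passing 'replace' to an explicit decode() avoids the implicit strict step. The example below is written for Python 3, where bytes and text are separate types; the helper name force_ascii and the sample bytes are hypothetical. On Python 2.4 the same two calls would be applied to a plain str (no b prefix).

```python
# Sketch only: force_ascii and the sample data are illustrative assumptions,
# not taken from InfoSitePage.py.
def force_ascii(raw):
    """Return raw with every non-ASCII byte replaced by '?'."""
    # Step 1: bytes -> text. With 'replace', each undecodable byte
    # becomes U+FFFD instead of raising UnicodeDecodeError.
    text = raw.decode('ascii', 'replace')
    # Step 2: text -> bytes. With 'replace', each U+FFFD becomes '?'.
    return text.encode('ascii', 'replace')


# 0x92 is the right single quotation mark in Windows-1252.
sample = b"it\x92s a test"
print(force_ascii(sample))  # b'it?s a test'
```

If the page is really Windows-1252, decoding with 'cp1252' in step 1 would preserve the quotation mark as a Unicode character rather than discarding it; which codec applies to the fetched page is an assumption to check.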
John Nagle