Trouble fixing a broken ASCII string - "replace" mode in codec not working.

John Nagle nagle at animats.com
Tue Feb 6 15:24:17 EST 2007


    I'm trying to clean up a bad ASCII string, one read from a
web page that is supposedly in the ASCII character set but has some
characters above 127.  And I get this:

  File "D:\projects\sitetruth\InfoSitePage.py", line 285, in httpfetch
     sitetext = sitetext.encode('ascii','replace')  # force to clean ASCII

UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 29151: ordinal not in range(128)

    Why is that exception being raised when the codec was told 'replace'?

(And no, just converting it to Unicode with "sitetext = unicode(sitetext)"
won't work either; that correctly raises a Unicode conversion exception.)
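
One possible explanation, sketched under the assumption that sitetext is a
plain byte string (str) rather than a unicode object: in Python 2, calling
encode on a str first decodes it to unicode implicitly with the default
'ascii' codec in strict mode, and the 'replace' argument only applies to the
encode step that follows. That would account for a UnicodeDecodeError coming
out of an encode call. Passing 'replace' to an explicit decode avoids the
implicit strict step. The helper name force_ascii and the sample bytes are
hypothetical, and the b'' literals need Python 2.6 or later (on 2.4, drop the
b prefix).

```python
def force_ascii(raw):
    """Return raw (a byte string) with every non-ASCII byte replaced by '?'.

    force_ascii is a hypothetical helper name, not part of any library.
    """
    # Decode explicitly so that 'replace' governs the bytes -> unicode step;
    # each undecodable byte becomes U+FFFD (the Unicode replacement character).
    text = raw.decode('ascii', 'replace')
    # Encode back to ASCII; U+FFFD is not ASCII, so 'replace' turns it into '?'.
    return text.encode('ascii', 'replace')


# Made-up sample containing byte 0x92, the value reported in the traceback.
sample = b"It\x92s a test"
cleaned = force_ascii(sample)
```

With this sample, cleaned ends up as the bytes "It?s a test". If the page is
actually in some other encoding (an assumption that would need checking),
decoding with that codec instead of 'ascii' would keep the original
characters rather than replacing them.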

[Python 2.4, Win32]

				John Nagle


