[Tutor] trying to convert pycurl/html to ascii
bruce
badouglas at gmail.com
Mon Mar 30 03:49:23 CEST 2015
Hi.
I'm running a quick, basic pycurl test against a site and trying to convert
the returned page to pure ASCII.
The page declares its encoding with this line:
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">
The test uses pycurl with a StringIO buffer to fetch the page into a str.
pycurl stuff
.
.
.
foo=gg.getBuffer()
At this point, foo holds the page in a str buffer.
The test then fails with an error like the following:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 20:
invalid start byte
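For reference, the same kind of error can be reproduced in isolation. This is only a sketch with made-up sample bytes (not the real page), assuming the buffer holds ISO-8859-1 data as the meta tag says. In that encoding 0xa0 is a non-breaking space, and it is not a valid first byte of a UTF-8 sequence:

```python
# Hypothetical sample bytes, not the real page content.
# In ISO-8859-1, 0xa0 is a non-breaking space.
raw = b"price:\xa0100"

try:
    # Decoding ISO-8859-1 bytes as UTF-8 fails on the 0xa0 byte.
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding with the charset the page declares succeeds.
text = raw.decode("iso-8859-1")
```

Here error.start is the offset of the offending byte, which corresponds to the "position" reported in the message.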
The test runs on Python 2.6 on Red Hat.
I've tried various decode calls suggested by different sites, articles, and
Stack Overflow answers, but can't seem to resolve the issue.
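One approach that might work is sketched below, assuming the page really is ISO-8859-1 as declared. The function name to_ascii and the sample bytes are hypothetical, and dropping characters that have no ASCII equivalent is an assumption about what "pure ASCII" should mean here:

```python
import unicodedata


def to_ascii(raw, charset="iso-8859-1"):
    """Decode raw page bytes with the declared charset, then reduce to ASCII.

    to_ascii is a hypothetical helper, not part of pycurl.
    """
    # Decode with the charset from the page's meta tag, not UTF-8.
    text = raw.decode(charset)
    # NFKD splits accented letters into a base letter plus a combining mark
    # and maps the non-breaking space (0xa0) to a plain space.
    normalized = unicodedata.normalize("NFKD", text)
    # Drop anything that still has no ASCII representation.
    return normalized.encode("ascii", "ignore")


# Made-up sample: 0xe9 is an accented "e", 0xa0 is a non-breaking space.
sample = to_ascii(b"caf\xe9\xa0au lait")
```

In the test above this would be called as to_ascii(foo). Passing "replace" instead of "ignore" to encode would put a "?" in place of each dropped character.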
Any thoughts/pointers would be useful!
Thanks