HTMLParser can't read Japanese

Stefan Behnel stefan_ml at behnel.de
Tue Apr 13 07:56:17 EDT 2010


Dodo, 13.04.2010 13:40:
> Here's a small script to generate again the error
> running windows 7 with python 3.1
>
> FILE : parseShift.py
>
> import urllib.request as url
> from html.parser import HTMLParser
>
> class myParser(HTMLParser):
>   def handle_starttag(self, tag, attrs):
>     print("Start of %s tag : %s" % (tag, attrs))

Your problem is the last line. Your terminal's encoding cannot represent the
text, so print() raises an exception there (typically a UnicodeEncodeError).

Either change your terminal encoding to a suitable encoding, or write the 
text to an encoded file instead (see the 'encoding' option of the open() 
function for that).
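
A minimal sketch of the second option, assuming the page text has already
been decoded to a str. The names FileParser and parse_to_file are made up
for illustration and are not part of your script or of html.parser:

```python
from html.parser import HTMLParser


class FileParser(HTMLParser):
    """Hypothetical variant of myParser that writes to a file, not the terminal."""

    def __init__(self, out):
        super().__init__()
        self.out = out  # any text-mode file object

    def handle_starttag(self, tag, attrs):
        # Same message as in the original script, sent to the encoded file.
        print("Start of %s tag : %s" % (tag, attrs), file=self.out)


def parse_to_file(html_text, path):
    # The 'encoding' argument makes open() encode everything written as
    # UTF-8, regardless of what the terminal is able to display.
    with open(path, "w", encoding="utf-8") as out:
        parser = FileParser(out)
        parser.feed(html_text)
        parser.close()
```

Calling parse_to_file(page_text, "tags.txt") then leaves the output in
tags.txt, which you can open in any editor that understands UTF-8.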

Stefan



