UTF8 & HTMLParser

Jan Danielsson jan.danielsson at gmail.com
Thu Nov 30 23:32:26 EST 2006


Hello all,

   I'm writing a Python script which fetches an HTML page (using wget),
and then parses the retrieved page using a custom htmllib HTMLParser.

   The page I fetch is encoded in UTF-8, and my text handler currently
looks like this:

   def handle_data(self, text):
      if self.inOption:
         self.currentName = text

   However, I would like to convert the "text" (which is UTF-8) to
Latin-1. How do I do that? I've been trying to figure it out for some
time now, and I'm just getting frustrated. :-(
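
   For reference, a minimal sketch of the kind of conversion being asked
about. It assumes "text" arrives as a raw UTF-8 byte string, as described
above. The helper name utf8_to_latin1 and the choice of errors="replace"
are illustrative assumptions, not part of the script in question.

```python
def utf8_to_latin1(raw, errors="replace"):
    """Re-encode UTF-8 bytes as Latin-1 bytes (hypothetical helper)."""
    # Step 1: decode the UTF-8 bytes into a Unicode string.
    text = raw.decode("utf-8")
    # Step 2: encode that string as Latin-1. Characters that have no
    # Latin-1 equivalent become "?" when errors="replace" is used.
    return text.encode("latin-1", errors)
```

   Inside the handler this would presumably be used as
self.currentName = utf8_to_latin1(text). Passing errors="strict" instead
would raise UnicodeEncodeError on any character that Latin-1 cannot
represent, which may be preferable if silent data loss is a concern.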

-- 
Kind Regards,
Jan Danielsson
Te audire non possum. Musa sapientum fixa est in aure.


