[Baypiggies] HTML code sets
Max Slimmer
max at theslimmers.net
Fri Oct 26 21:35:02 CEST 2007
Interesting. I tried this (I was aware that ISO-8859-1 is latin-1), and it doesn't work on my system, but treating it as
a UTF-8 string, as suggested by Chris Clark, did work. I wonder whether your machine has a different encoding set than mine. Cutting and pasting the lines you indicated returns:
>>> print s.encode("latin-1")
enforcing the nation’s laws
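
For reference, here is a minimal sketch of the UTF-8 approach. It is written for Python 3, which is an assumption on my part (the sessions in this thread use Python 2 syntax), and the names raw, text, mojibake and repaired are purely illustrative. The bytes E2 80 99 are the UTF-8 encoding of U+2019 (the right single quotation mark), so decoding the raw bytes as UTF-8 should give the apostrophe regardless of what the page's meta tag claims:

```python
# Python 3 sketch (assumption: the thread itself uses Python 2).
# Bytes as they might be read from the page; the name is illustrative.
raw = b"enforcing the nation\xe2\x80\x99s laws"

# Decode the bytes as UTF-8: E2 80 99 becomes the single character U+2019.
text = raw.decode("utf-8")
print(ascii(text))  # ascii() avoids depending on the terminal's encoding

# If the bytes were decoded as latin-1 instead, each byte becomes its own
# character, so the apostrophe turns into three characters.
mojibake = raw.decode("latin-1")

# Encoding back to latin-1 recovers the original bytes, which can then be
# decoded as UTF-8. This may explain why encode("latin-1") appears to work
# on a terminal that interprets its output as UTF-8.
repaired = mojibake.encode("latin-1").decode("utf-8")
print(repaired == text)
```

Whether the final string displays correctly still depends on the encoding of the terminal it is printed to.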
> -----Original Message-----
> From: Paul McNett [mailto:p at ulmcnett.com]
> Sent: Friday, October 26, 2007 11:46 AM
> To: Max Slimmer
> Cc: 'Python'
> Subject: Re: [Baypiggies] HTML code sets
>
> Hi Max,
>
> > I am reading some raw HTML that contains things like:
> >
> > "enforcing the nation\xe2\x80\x99s laws"
> >
> > and I need to know what incantation to apply to translate the
> > xe2,x80,x99 into some kind of apostrophe char. I can initialize this
> > string as str or unicode.
> >
> > The headers are:
> > '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n<html
> > xmlns="http://www.w3.org/1999/xhtml">\n<head>\n<meta
> > http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />\n
>
>
> ISO-8859-1 is also known as latin-1.
>
> >>> s = u"enforcing the nation\xe2\x80\x99s laws"
> >>> print s.encode("latin-1")
> enforcing the nation’s laws
>
> --
> pkm ~ http://paulmcnett.com
>