html codes

Peter Otten __peter__ at web.de
Tue Dec 9 03:19:23 EST 2008


Daniel Fetchinson wrote:

> I came across a javascript library that returns all sorts of html
> codes in the cookies it sets and I need my web framework (written in
> python :)) to decode them. I'm aware of htmlentitydefs but
> htmlentitydefs.entitydefs.keys( ) are of the form '&#xxx' but this
> javascript library uses stuff like '%3A' for the ':' for example. The
> conversion is here:
> 
> http://www.ascii.cl/htmlcodes.htm
> 
> Is there a python package/module/whatever that does the conversion for
> me or do I have to write a little wrapper myself (and introduce bugs
> while doing so :))?

>>> import urllib
>>> urllib.quote("Löblich ähnlich üblich")
'L%C3%B6blich%20%C3%A4hnlich%20%C3%BCblich'
>>> urllib.unquote(_)
'L\xc3\xb6blich \xc3\xa4hnlich \xc3\xbcblich'
>>> print _
Löblich ähnlich üblich
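
The '%3A' form from the question is URL percent-encoding rather than an HTML entity, which is why urllib is the right place to look. Below is a minimal sketch of decoding such a cookie value, assuming Python 3 (where these functions live in urllib.parse instead of urllib) and a made-up cookie string:

```python
from urllib.parse import unquote

# Hypothetical cookie value; the real one would come from the JavaScript library.
raw_cookie_value = "theme%3Adark%20mode"

# unquote() replaces each %XX escape with the character it stands for.
decoded = unquote(raw_cookie_value)
print(decoded)
```

The name raw_cookie_value and its contents are illustrative only; how the value is read from the request depends on the web framework.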

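urllib.quote() and urllib.unquote() work on byte strings, so the '%C3%B6' escapes in the interactive session reflect a UTF-8 terminal. If you care about the encoding, you have to encode/decode explicitly: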

>>> urllib.quote(u"Löblich ähnlich üblich".encode("latin1"))
'L%F6blich%20%E4hnlich%20%FCblich'
>>> urllib.unquote(_).decode("latin1")
u'L\xf6blich \xe4hnlich \xfcblich'
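
For comparison, here is a rough Python 3 sketch of the same round trip. It assumes urllib.parse.quote() and unquote(), which take an encoding argument, so the separate encode()/decode() calls are not needed there:

```python
from urllib.parse import quote, unquote

text = "Löblich ähnlich üblich"

# By default the text is encoded as UTF-8 before being percent-escaped.
utf8_quoted = quote(text)

# Passing encoding="latin-1" yields one escape per accented character.
latin1_quoted = quote(text, encoding="latin-1")

# Decoding must use the same encoding that was used for quoting.
utf8_roundtrip = unquote(utf8_quoted)
latin1_roundtrip = unquote(latin1_quoted, encoding="latin-1")

print(utf8_quoted)
print(latin1_quoted)
```

If the encoding passed to unquote() does not match the one used for quoting, the accented characters will not come back intact, so it helps to know which encoding the JavaScript side uses.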

Peter


