Unicode to HTML entities

Tommy Nordgren tommy.nordgren at comhem.se
Wed May 30 07:53:30 EDT 2007


On 29 maj 2007, at 17.52, Clodoaldo wrote:

> I was looking for a function to transform a unicode string into
> htmlentities. Not only the usual html escaping thing but all
> characters.
>
> As I didn't find one, I wrote my own:
>
> # -*- coding: utf-8 -*-
> from htmlentitydefs import codepoint2name
>
> def unicode2htmlentities(u):
>
>    htmlentities = list()
>
>    for c in u:
>       if ord(c) < 128:
>          htmlentities.append(c)
>       else:
>          htmlentities.append('&%s;' % codepoint2name[ord(c)])
>
>    return ''.join(htmlentities)
>
> print unicode2htmlentities(u'São Paulo')
>
> Is there a function like that in one of python builtin modules? If not
> is there a better way to do it?
>
> Regards, Clodoaldo Pinto Neto
>
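For reference, here is a minimal sketch of the quoted function, assuming Python 3, where htmlentitydefs is available as html.entities. The numeric fallback for characters that have no named entity is my own addition (the quoted version would raise KeyError on them), and like the original it leaves &, < and > untouched.

```python
from html.entities import codepoint2name  # Python 3 name for htmlentitydefs


def unicode2htmlentities(text):
    """Replace every non-ASCII character in text with an HTML entity."""
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII is passed through unchanged.
            parts.append(ch)
        elif cp in codepoint2name:
            # Named entity, e.g. U+00E3 -> &atilde;
            parts.append('&%s;' % codepoint2name[cp])
        else:
            # Assumption: fall back to a numeric character reference
            # when no named entity exists for this code point.
            parts.append('&#%d;' % cp)
    return ''.join(parts)


# 'S\u00e3o Paulo' is 'São Paulo' written with an explicit escape.
print(unicode2htmlentities('S\u00e3o Paulo'))  # prints S&atilde;o Paulo
```

If markup characters also need escaping, html.escape() can be applied to the text first.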
In many cases, the need to use HTML/XHTML entities can be avoided by
generating UTF-8 encoded pages.
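As a rough sketch of that approach, assuming Python 3: the page declares its charset, only the markup characters are escaped, and the result is encoded as UTF-8. The render_page helper and its minimal template are hypothetical, made up for illustration. The last lines show the built-in 'xmlcharrefreplace' error handler, which produces numeric character references when the output really must be ASCII.

```python
import html


def render_page(title, body):
    """Hypothetical helper: build a tiny HTML page and return UTF-8 bytes."""
    page = (
        '<!DOCTYPE html>\n'
        '<html><head><meta charset="utf-8">'
        '<title>%s</title></head>'
        '<body><p>%s</p></body></html>'
    ) % (html.escape(title), html.escape(body))
    # Non-ASCII characters are sent as-is; only &, <, > and quotes
    # were escaped above.
    return page.encode('utf-8')


page_bytes = render_page('Cities', 'S\u00e3o Paulo')

# If ASCII-only output is required, the codec machinery can emit
# numeric character references without any hand-written loop.
ascii_bytes = 'S\u00e3o Paulo'.encode('ascii', 'xmlcharrefreplace')
print(ascii_bytes)  # prints b'S&#227;o Paulo'
```

The served page should also carry a matching Content-Type header (charset=utf-8), so that the browser does not have to guess the encoding.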
------------------------------------------------------
"Home is not where you are born, but where your heart finds peace" -
Tommy Nordgren, "The dying old crone"
tommy.nordgren at comhem.se




