Unicode to HTML entities

Clodoaldo clodoaldo.pinto at gmail.com
Tue May 29 11:52:01 EDT 2007


I was looking for a function to convert a unicode string into
HTML entities. Not only the usual HTML escaping, but every
non-ASCII character.

As I didn't find one, I wrote my own:

# -*- coding: utf-8 -*-
from htmlentitydefs import codepoint2name

def unicode2htmlentities(u):

   htmlentities = list()

   for c in u:
      if ord(c) < 128:
         htmlentities.append(c)
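      elif ord(c) in codepoint2name:
         htmlentities.append('&%s;' % codepoint2name[ord(c)])
      else:
         # no named entity for this character: use a numeric reference
         # instead of raising KeyError
         htmlentities.append('&#%d;' % ord(c))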

   return ''.join(htmlentities)

print unicode2htmlentities(u'São Paulo')
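
For anyone on Python 3, here is a rough sketch of the same idea. It assumes the html.entities module (the Python 3 home of htmlentitydefs). The name to_html_entities is only illustrative, and the numeric fallback for characters that have no named entity is my own assumption about what is wanted:

```python
from html.entities import codepoint2name

def to_html_entities(text):
    """Replace every non-ASCII character in text with an HTML entity."""
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # ASCII passes through unchanged (so &, < and > are NOT escaped here)
            parts.append(ch)
        elif cp in codepoint2name:
            # named entity, e.g. 0xE3 -> &atilde;
            parts.append('&%s;' % codepoint2name[cp])
        else:
            # assumption: fall back to a decimal numeric character reference
            parts.append('&#%d;' % cp)
    return ''.join(parts)

print(to_html_entities('São Paulo'))  # prints S&atilde;o Paulo
```

Like the function above, this leaves &, < and > alone, so the usual HTML escaping would still have to be done separately.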

Is there a function like that in one of Python's built-in modules? If not,
is there a better way to do it?

Regards, Clodoaldo Pinto Neto



