Converting foreign characters to HTML character entities

Paul Boddie paul at boddie.net
Tue May 22 06:18:23 EDT 2001


Lutz.Schroeer at tu-clausthal.de (Lutz Schroeer) wrote in message news:<Xns90A8B4310B8E7Latzikatz at 139.174.2.56>...
> My program reads strings out of a database and converts them to an HTML-
> page. The string contains German Umlauts which I would like to convert to 
> HTML character entities.
> 
> Is there any simple method to do this without explicitly writing a function 
> by myself? Are there any standard modules which I didn't find or any third 
> party objects?

In the Python Library Reference [1] there seem to be a few modules
which might help, such as 'htmlentitydefs' and possibly 'cgi'. Webware
[2] might include some functions in its 'WebUtils' package. Then,
there's always the PyXML package [3], a subset of which is included in
Python 2.1 as far as I can tell.
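
As a rough sketch of the 'htmlentitydefs' route: the module has a
codepoint2name table (in later Python versions; in Python 3 the module
is named 'html.entities') which maps character codes to entity names.
The function name to_named_entities is made up for illustration, and
the sketch assumes the database strings have already been decoded to
Unicode text.

```python
try:
    # Module name in Python 2 (codepoint2name needs a later 2.x release).
    from htmlentitydefs import codepoint2name
except ImportError:
    # Module name in Python 3.
    from html.entities import codepoint2name


def to_named_entities(text):
    """Replace non-ASCII characters with named entities where one exists.

    Hypothetical helper: characters without a known entity name, and all
    ASCII characters (including '<' and '&'), are passed through unchanged.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code > 127 and code in codepoint2name:
            out.append("&%s;" % codepoint2name[code])
        else:
            out.append(ch)
    return "".join(out)
```

Note that this leaves markup characters alone, so any escaping of '<',
'>' and '&' would have to be done separately.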

Of course, the function wouldn't be hard to write. I think it is
permitted to use numeric entities based on the ISO 8859-1 character
values in HTML, although you would need to check with the applicable
specifications [4]. Thus, your function would produce entities of the
form &#ddd; where each d is a decimal digit.
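
A minimal sketch of such a function, assuming the input is Unicode text
(for characters 128 to 255 the Unicode code points coincide with the
ISO 8859-1 values). The name to_numeric_entities is hypothetical, not
something from the standard library.

```python
def to_numeric_entities(text):
    """Replace every non-ASCII character with a decimal entity (&#ddd;).

    Hypothetical helper for illustration; ASCII characters, including
    markup characters such as '<' and '&', are left untouched.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            # e.g. u-umlaut has code 252 and becomes "&#252;"
            out.append("&#%d;" % code)
        else:
            out.append(ch)
    return "".join(out)
```

In later Python versions, text.encode("ascii", "xmlcharrefreplace")
should give the same result as a byte string, so the hand-written loop
may not even be needed.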

Regards,

Paul

[1] http://www.python.org/doc/current/lib/lib.html
[2] http://webware.sourceforge.net
[3] http://pyxml.sourceforge.net
[4] http://www.w3.org


