xmlentities

Duncan Booth duncan.booth at invalid.invalid
Mon Feb 7 10:53:43 EST 2005


Ola Natvig wrote:

> Does anyone know a good library for transferring non-standard characters 
> to entity characters in HTML? I want characters like < and > to be 
> transformed to &lt; and &gt;, and the Norwegian ø to &oslash;.
> 

You could use cgi.escape to handle &, <, and >, and then use the 
'xmlcharrefreplace' error handler on unicode.encode to handle the other 
characters. That doesn't do quite what you ask, since your ø will become 
the numeric reference &#248; rather than &oslash;:

>>> import cgi
>>> s = u'<ø>'
>>> cgi.escape(s).encode('ascii', 'xmlcharrefreplace')
'&lt;&#248;&gt;'
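
For readers on Python 3, the same two-step idea might look like the sketch 
below. It assumes html.escape as the stand-in for cgi.escape (which newer 
Python versions no longer provide), and the helper name to_ascii_html is 
made up for illustration:

```python
import html

def to_ascii_html(text):
    # Hypothetical helper, not part of any library.
    # Step 1: escape the markup characters &, < and >. quote=False leaves
    # quotation marks alone, matching cgi.escape's default behaviour.
    escaped = html.escape(text, quote=False)
    # Step 2: encode to ASCII and let the built-in error handler turn every
    # remaining non-ASCII character into a numeric character reference.
    return escaped.encode('ascii', 'xmlcharrefreplace')

print(to_ascii_html('<ø>'))  # b'&lt;&#248;&gt;'
```

Note that on Python 3 the result is a bytes object; call .decode('ascii') 
on it if a str is needed.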


If you really want named entities, then have a look at 
Lib/test/test_codeccallbacks.py in the Python source, which has a test 
called test_xmlcharnamereplace that registers another codec error handler, 
'test.xmlcharnamereplace'. I think you could probably extract that handler, 
register it yourself, and pass its name to encode in place of 
'xmlcharrefreplace'.
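
A minimal sketch of such a handler, written for Python 3, is shown below. 
It is not copied from the test suite: the handler name 
'example.xmlcharnamereplace' and the function names are assumptions chosen 
for illustration, and html.entities.codepoint2name is assumed as the source 
of entity names. Characters without a named entity fall back to a numeric 
reference.

```python
import codecs
import html
from html.entities import codepoint2name

def xml_char_name_replace(exc):
    # Hypothetical error handler: only encoding errors are supported.
    if not isinstance(exc, UnicodeEncodeError):
        raise TypeError("don't know how to handle %r" % exc)
    parts = []
    for ch in exc.object[exc.start:exc.end]:
        name = codepoint2name.get(ord(ch))
        if name is not None:
            parts.append('&%s;' % name)     # named entity, e.g. &oslash;
        else:
            parts.append('&#%d;' % ord(ch))  # numeric fallback
    # Return the replacement text and the position to resume encoding at.
    return ''.join(parts), exc.end

# The name is arbitrary; it only has to match the one passed to encode().
codecs.register_error('example.xmlcharnamereplace', xml_char_name_replace)

def to_named_entities(text):
    # Escape &, < and > first, then replace everything non-ASCII.
    escaped = html.escape(text, quote=False)
    return escaped.encode('ascii', 'example.xmlcharnamereplace')

print(to_named_entities('<ø>'))  # b'&lt;&oslash;&gt;'
```

Registering the handler once at import time is enough; after that the name 
can be used with any encode call in the same process.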


