Html entities
Fredrik Lundh
fredrik at pythonware.com
Wed Mar 21 10:29:36 EST 2001
Syver Enstad wrote:
> Is there an easy way to convert ISO Latin-1 characters that are above 127
> ascii to their HTML, XML entity form?
something like this might work:
# htmlentitydefs-example-3.py
# from (the eff-bot guide to) the standard python library

import htmlentitydefs
import re, string

# this pattern matches substrings of reserved and non-ASCII characters
pattern = re.compile(r"[&<>\"\x80-\xff]+")

# create character map; the default is a numeric character
# reference such as &#128; (note the "#")
entity_map = {}

for i in range(256):
    entity_map[chr(i)] = "&#%d;" % i

# use the named entity wherever one is defined
for entity, char in htmlentitydefs.entitydefs.items():
    if entity_map.has_key(char):
        entity_map[char] = "&%s;" % entity

def escape_entity(m, get=entity_map.get):
    # translate each character in the matched run
    return string.join(map(get, m.group()), "")

def escape(string):
    return pattern.sub(escape_entity, string)

print escape("<spam&eggs>")
print escape("å i åa ä e ö")
## prints:
## &lt;spam&amp;eggs&gt;
## &aring; i &aring;a &auml; e &ouml;
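
The script above targets Python 2. On Python 3 the htmlentitydefs module is called html.entities, and print is a function. A rough equivalent, as a sketch only and not part of the original example, might look like the following. It assumes the input is a str and uses the codepoint2name table from html.entities; the helper name escape_char is made up for illustration.

```python
import re
from html.entities import codepoint2name

# matches runs of reserved and non-ASCII Latin-1 characters,
# same character class as the Python 2 pattern above
pattern = re.compile(r'[&<>"\x80-\xff]+')

def escape_char(ch):
    # hypothetical helper: prefer the named entity, and fall back
    # to a numeric character reference when no name is defined
    code = ord(ch)
    name = codepoint2name.get(code)
    if name is not None:
        return "&%s;" % name
    return "&#%d;" % code

def escape(text):
    # replace each matched run character by character
    return pattern.sub(
        lambda m: "".join(map(escape_char, m.group())), text)

print(escape("<spam&eggs>"))
print(escape("å i åa ä e ö"))

## prints:
## &lt;spam&amp;eggs&gt;
## &aring; i &aring;a &auml; e &ouml;
```

If numeric references are good enough, text.encode("ascii", "xmlcharrefreplace") does the non-ASCII part without a lookup table, but it leaves &, < and > untouched, so those still need separate escaping.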
Cheers /F
<!-- (the eff-bot guide to) the standard python library:
http://www.pythonware.com/people/fredrik/librarybook.htm
-->