Decode html, or is it unicode, how?

Ben C spamspam at spam.eggs
Mon Apr 17 14:09:40 EDT 2006


On 2006-04-17, <brandon.mcginty at gmail.com> wrote:
> Hi All,
> I've done a bunch of searching in google and in python's help, but,
> I haven't found any function to decode a string like:
> Refresh! (ihenvyr)
> into plain English.
> [...]

I needed to do that the other day, and did it like this:

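import re

# Python 2: replace decimal references such as &#82; with the matching character.
def decode(line):
	pat = re.compile(r'&#(\d+);')
	def sub(mo):
		# group(1) holds the decimal code point
		return unichr(int(mo.group(1)))
	return pat.sub(sub, unicode(line))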

There may well be a better way, though.
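
For anyone reading this on Python 3, here is a rough sketch of the same idea. It assumes Python 3.4 or later, where the standard library has html.unescape(), which handles numeric and named references in one call. The helper name decode_numeric_refs and the sample strings are made up for illustration. The regex version is only worth keeping if you want to touch numeric references and nothing else.

```python
import html
import re

# Matches decimal (&#82;) and hexadecimal (&#x52;) character references.
_NUMERIC_REF = re.compile(r'&#(?:[xX]([0-9a-fA-F]+)|(\d+));')

def decode_numeric_refs(text):
    """Replace numeric character references in text with their characters.

    Hypothetical helper: named references such as &amp; are left alone.
    """
    def _sub(match):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code_point)
        except (ValueError, OverflowError):
            # Not a valid code point: keep the original text unchanged.
            return match.group(0)
    return _NUMERIC_REF.sub(_sub, text)

if __name__ == "__main__":
    sample = "&#82;&#101;&#102; &amp; more"
    print(decode_numeric_refs(sample))  # numeric references only
    print(html.unescape(sample))        # numeric and named references
```

In most cases html.unescape() alone is probably enough. The hand-written pattern just shows what the original regex would look like once hexadecimal references and out-of-range values are accounted for.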


