html entity to unicode

Peter Maas peter.maas at somewhere.com
Fri Feb 10 14:02:11 EST 2006


zunbeltz at gmail.com wrote:
> Hi,
> 
> I'm parsing html. I have a page with a lot of html entities for Hebrew
> characters. When I print, what I get are blanks, dots and commas. How
> can I decode these entities to unicode characters?

From the Python documentation:

13.4 htmlentitydefs -- Definitions of HTML general entities

[...]

name2codepoint
A dictionary that maps HTML entity names to the Unicode codepoints. New in version 2.3.

codepoint2name
A dictionary that maps Unicode codepoints to HTML entity names. New in version 2.3.
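
A rough sketch of how name2codepoint might be used for this, not a definitive
implementation. It assumes Python 3, where htmlentitydefs was renamed to
html.entities. The helper name decode_entities and the regular expression are
my own, for illustration only. Note that name2codepoint only covers named
entities such as &amp;. Hebrew letters usually show up as numeric references
like &#1488;, so the sketch handles those separately.

```python
import re
from html.entities import name2codepoint  # "htmlentitydefs" in Python 2

# Matches &name;, &#1234; and &#x5D0; (hypothetical pattern for this sketch).
_ENTITY_RE = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|\w+);")


def decode_entities(text):
    """Replace HTML entities in text with the characters they stand for."""

    def replace(match):
        body = match.group(1)
        try:
            if body.startswith(("#x", "#X")):
                # Hexadecimal numeric reference, e.g. &#x5D0;
                return chr(int(body[2:], 16))
            if body.startswith("#"):
                # Decimal numeric reference, e.g. &#1488;
                return chr(int(body[1:]))
        except (ValueError, OverflowError):
            # Number is outside the Unicode range: leave the text as it was.
            return match.group(0)
        codepoint = name2codepoint.get(body)
        if codepoint is None:
            # Unknown entity name: leave the text as it was.
            return match.group(0)
        return chr(codepoint)

    return _ENTITY_RE.sub(replace, text)


if __name__ == "__main__":
    print(decode_entities("&#1513;&#1500;&#1493;&#1501; &amp; &copy;"))
```

If you still get blanks or dots after decoding, the terminal encoding or font
may be the culprit rather than the entities. In Python 3.4 and later the
standard library's html.unescape() does this kind of decoding for you.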

Peter Maas Aachen


