Bug in htmlentitydefs.py with Python 3.0?

"Martin v. Löwis" martin at v.loewis.de
Thu Dec 27 09:07:38 EST 2007


>     entity_map = htmlentitydefs.entitydefs.copy()
>     for name, entity in entity_map.items():
>         if len(entity) != 1:
>             entity_map[name] = unichr(int(entity[2:-1]))
> 
> (entitydefs is pretty unusable as it is, but it was added to Python
> before Python got Unicode strings, and changing it would break ancient
> code...)

I would not write it this way, but as

entity_map = {}
for name, codepoint in htmlentitydefs.name2codepoint.items():
    entity_map[name] = unichr(codepoint)

I don't find that too unusable, although having yet another dictionary
name2char might be more convenient, for use with ElementTree.
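
As a rough sketch of what such a name2char dictionary could look like
(name2char is only the hypothetical name suggested above, not something
the standard library provides), here is a version for Python 3, where
htmlentitydefs was renamed to html.entities and chr() takes the place
of unichr():

```python
# Python 3 spelling of the htmlentitydefs module.
from html.entities import name2codepoint

# name2char is a hypothetical convenience mapping, not a stdlib name:
# entity name -> the single Unicode character it stands for.
name2char = {name: chr(codepoint)
             for name, codepoint in name2codepoint.items()}

print(name2char["amp"])     # &
print(name2char["eacute"])  # é
```

Every value is a one-character string, so no len(entity) check is
needed as with entitydefs.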

(side note: I think it would be better if ElementTree treated the
.entity mapping similar to DTD ENTITY declarations, assuming internal
entities. Then the OP's code might have worked out of the box. That
would be an incompatible change also, of course.)

Regards,
Martin


