Should HTML entity translation accept "&amp"?

Ben Finney bignose+hates-spam at benfinney.id.au
Sun Jan 6 20:25:07 EST 2008


John Nagle <nagle at animats.com> writes:

> For our own purposes, I rewrote "htmldecode" to require a sequence
> ending in ";", which means some bogus HTML escapes won't be
> recognized, but correct HTML will be processed correctly. What's the
> general opinion of this behavior? Too strict, or OK?

I think it's fine. In the face of ambiguity (and deviation from the
published standards), refuse the temptation to guess.

More specifically, I don't see any reason to contort your code to
understand some non-entity sequence that would be flagged as invalid
by HTML validator tools.
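
For illustration only, here is a minimal sketch of what such a strict
decoder might look like. This is not John's actual "htmldecode"; the
name strict_htmldecode and the regular expression are assumptions made
up for this example, and it is written for Python 3, where the entity
table is html.entities.name2codepoint.

```python
import re
from html.entities import name2codepoint

# Hypothetical pattern: match only references terminated by ";", i.e.
#   &name;   &#123;   &#x7B;
_ENTITY_RE = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def strict_htmldecode(text):
    """Decode only ";"-terminated references; leave everything else as-is."""

    def replace(match):
        body = match.group(1)
        try:
            if body[0] == "#":
                if body[1] in "xX":
                    # Hexadecimal numeric reference, e.g. &#x42;
                    return chr(int(body[2:], 16))
                # Decimal numeric reference, e.g. &#65;
                return chr(int(body[1:]))
            # Named reference, e.g. &amp;
            return chr(name2codepoint[body])
        except (KeyError, ValueError, OverflowError):
            # Unknown name or out-of-range code point: refuse to guess
            # and return the original text unchanged.
            return match.group(0)

    return _ENTITY_RE.sub(replace, text)
```

With this sketch, "a &amp; b" is decoded, while "a &amp b" (no
terminating ";") passes through untouched, as does an unknown name such
as "&bogus;".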

-- 
 \          "Those who write software only for pay should go hurt some |
  `\              other field."  -- Erik Naggum, in _gnu.misc.discuss_ |
_o__)                                                                  |
Ben Finney


