[issue25258] HtmlParser doesn't handle void element tags correctly

Ezio Melotti report at bugs.python.org
Fri Oct 2 21:42:35 CEST 2015


Ezio Melotti added the comment:

Note that HTMLParser tries to follow the HTML5 specs, and for this case they say [0]:
"Set the self-closing flag of the current tag token. Switch to the data state. Emit the current tag token."

So it seems that for <img />, only the <img> start tag (and not a closing </img>) should be emitted.  HTMLParser has no way to set the self-closing flag, so calling handle_startendtag seems the most reasonable thing to do, since it allows tree-builders to set the flag themselves.  That said, the default implementation of handle_startendtag should indeed just call handle_starttag; however, this would be a backward-incompatible change.

[0]: http://www.w3.org/TR/html5/syntax.html#self-closing-start-tag-state
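
To illustrate the behavior discussed above, here is a minimal sketch. The class names EventLogger and StartOnlyLogger and the helper parse_events are hypothetical, made up for this example; only html.parser.HTMLParser and its handle_* methods come from the standard library. The first class shows what the default handle_startendtag does today for <img />, and the second shows how a subclass could opt in to the "start tag only" behavior without any change to the library.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Hypothetical helper: records which handlers the parser calls."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


class StartOnlyLogger(EventLogger):
    """Hypothetical subclass: treat <tag /> as a start tag only."""

    def handle_startendtag(self, tag, attrs):
        # Assumption for illustration: skip the synthetic end tag that
        # the default implementation would emit after the start tag.
        self.handle_starttag(tag, attrs)


def parse_events(parser_cls, markup):
    """Feed markup to a fresh parser and return the recorded events."""
    parser = parser_cls()
    parser.feed(markup)
    parser.close()
    return parser.events


if __name__ == "__main__":
    # Default: handle_startendtag calls handle_starttag, then handle_endtag.
    print(parse_events(EventLogger, "<img />"))
    # Overridden: only the start tag is reported.
    print(parse_events(StartOnlyLogger, "<img />"))
```

A tree-builder could use the same override point to record its own self-closing flag instead of simply forwarding to handle_starttag.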

----------
type:  -> behavior

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25258>
_______________________________________

