[issue25258] HtmlParser doesn't handle void element tags correctly
Ezio Melotti
report at bugs.python.org
Fri Oct 2 21:42:35 CEST 2015
Ezio Melotti added the comment:
Note that HTMLParser tries to follow the HTML5 specs, and for this case they say [0]:
"Set the self-closing flag of the current tag token. Switch to the data state. Emit the current tag token."
So it seems that for <img />, only the <img> (and not the closing </img>) should be emitted. HTMLParser has no way to set the self-closing flag, so calling handle_startendtag seems the most reasonable thing to do, since it allows tree-builders to set the flag themselves. That said, the default implementation of handle_startendtag should indeed just call handle_starttag; however, this would be a backward-incompatible change.
[0]: http://www.w3.org/TR/html5/syntax.html#self-closing-start-tag-state
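
To make the difference concrete, here is a minimal sketch comparing the current default (handle_startendtag calls handle_starttag followed by handle_endtag) with an override that forwards only to handle_starttag. The class names Recorder and StartOnlyRecorder and the helper events_for are hypothetical, made up for this illustration; only html.parser.HTMLParser and its handler methods come from the standard library.

```python
from html.parser import HTMLParser


class Recorder(HTMLParser):
    """Hypothetical helper: records handler calls, keeps the default handle_startendtag."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


class StartOnlyRecorder(Recorder):
    """Hypothetical variant: treats <img /> as a start tag only, as the spec text suggests."""

    def handle_startendtag(self, tag, attrs):
        # A tree-builder could also set its own self-closing flag here.
        self.handle_starttag(tag, attrs)


def events_for(parser_cls, markup):
    """Feed markup to a fresh parser instance and return the recorded events."""
    parser = parser_cls()
    parser.feed(markup)
    parser.close()
    return parser.events


if __name__ == "__main__":
    print(events_for(Recorder, "<img />"))
    print(events_for(StartOnlyRecorder, "<img />"))
```

With the default implementation the parser reports both a start and an end event for <img />, while the override reports only the start event. Existing subclasses that rely on the extra handle_endtag call are the reason changing the default would be backward-incompatible.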
----------
type: -> behavior
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25258>
_______________________________________