[issue25258] HtmlParser doesn't handle void element tags correctly

Martin Panter report at bugs.python.org
Thu Oct 1 04:05:15 CEST 2015


Martin Panter added the comment:

My thinking is that the knowledge that <img> does not have a closing tag is at a higher level than the current HTMLParser class. It is similar to knowing where the following HTML implicitly closes the <li> elements:

<ul><li>Item A<li>Item B</ul>

In both cases I would not expect the HTMLParser to report “virtual” empty or closing tags. I don’t think it should report an empty <img/> or closing </img> tag just because that is easy to do; doing so would be inconsistent with how other implied HTML tags are handled. But maybe see what other people say.
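
To illustrate the point, here is a small sketch that records the events html.parser.HTMLParser reports for the list above. The EventRecorder class and record_events() helper are names made up for this example; only HTMLParser and its handle_* methods come from the standard library.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: records every start tag, end tag and text chunk."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def record_events(html):
    """Feed the whole document at once and return the recorded events."""
    parser = EventRecorder()
    parser.feed(html)
    parser.close()
    return parser.events


list_events = record_events("<ul><li>Item A<li>Item B</ul>")
img_events = record_events('<p><img src="a.png"></p>')

# Neither the implied </li> nor a "virtual" </img> shows up as an end tag.
print(list_events)
print(img_events)
```

Only the tags that literally appear in the markup are reported, so the <li> and <img> cases behave the same way.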

I don’t know your particular use case, but if you need to parse non-XML HTML <img> tags, I would suggest using the handle_starttag() method and not relying on the end tag :)
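
A minimal sketch of that suggestion, assuming the goal is simply to collect the src attribute of each image. ImageCollector and collect_image_sources() are hypothetical names for this example. It relies on the default handle_startendtag(), which calls handle_starttag() for XML-style <img ... /> tags as well, so both spellings are handled in one place.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Hypothetical helper: gathers the src of every <img> start tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; do all the work here
        # instead of waiting for an end tag that may never be reported.
        if tag == "img":
            src = dict(attrs).get("src")
            if src is not None:
                self.sources.append(src)


def collect_image_sources(html):
    """Parse the document and return the collected src values in order."""
    parser = ImageCollector()
    parser.feed(html)
    parser.close()
    return parser.sources


sources = collect_image_sources(
    '<p><img src="a.png"> and <img src="b.png" /></p>'
)
print(sources)
```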

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25258>
_______________________________________

