[issue29276] HTMLParser in Python 2.7 doesn't recognize image tags wrapped up in link tags

Ari report at bugs.python.org
Sat Jan 14 10:52:38 EST 2017


New submission from Ari:

The following code produces incorrect results under Python 2.7.13. It should print two lines, "Encountered a start tag: a" and "Encountered a start tag: img", but it prints only "Encountered a start tag: a".

from HTMLParser import HTMLParser

class MyHTMLParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        print 'Encountered a start tag: %s' % tag

parser = MyHTMLParser()
parser.feed('<a href="http://somesite.com/large_image.jpg"><img src="http://somesite.com/small_image.jpg" width="800px" /></a>')


Python 3.5.2 produces correct results on the same input and prints the expected "Encountered a start tag: a" and "Encountered a start tag: img".

from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        print("Encountered a start tag:", tag)

parser = MyHTMLParser()
parser.feed('<a href="http://somesite.com/large_image.jpg"><img src="http://somesite.com/small_image.jpg" width="800px" /></a>')
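
To compare the two interpreters with a single script, the two snippets can be folded into one that runs unchanged on both. This is only a minimal sketch: the names TagCollector, collect_start_tags and SAMPLE are made up for illustration and are not part of either module. It collects the start tags in a list instead of printing them, so the results from 2.7 and 3.x can be compared directly.

```python
try:
    from html.parser import HTMLParser  # Python 3 module name
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7 module name


class TagCollector(HTMLParser):
    """Hypothetical helper: records every start tag the parser reports."""

    def __init__(self):
        # HTMLParser is an old-style class on Python 2, so call the
        # base initializer directly instead of using super().
        HTMLParser.__init__(self)
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def collect_start_tags(markup):
    """Feed markup to a fresh parser and return the start tags seen."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()  # flush anything the parser is still buffering
    return parser.start_tags


# Same input as in the report above.
SAMPLE = ('<a href="http://somesite.com/large_image.jpg">'
          '<img src="http://somesite.com/small_image.jpg" width="800px" /></a>')

if __name__ == "__main__":
    print(collect_start_tags(SAMPLE))
```

Running this under each interpreter and comparing the printed lists shows whether the img tag is reported; the expected list contains both a and img.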

----------
components: Library (Lib)
messages: 285490
nosy: Ari
priority: normal
severity: normal
status: open
title: HTMLParser in Python 2.7 doesn't recognize image tags wrapped up in link tags
type: behavior
versions: Python 2.7

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue29276>
_______________________________________

