[issue670664] HTMLParser.py - more robust SCRIPT tag parsing

Chris Palmer report at bugs.python.org
Tue Dec 2 03:06:04 CET 2008


Chris Palmer <chris at isecpartners.com> added the comment:

Here is an additional test case. I have a super simple HTML "minifier"
that burps when given this test file:

========
$ cat test.html 
'foo <sc'+'ript>'
========

The explosion is:

========
$ ./minify.py test.html 
Warning: malformed start tag
'foo Traceback (most recent call last):
  File "./minify.py", line 84, in <module>
    m.feed(f.read())
  File "/usr/local/lib/python2.5/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/local/lib/python2.5/HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "/usr/local/lib/python2.5/HTMLParser.py", line 226, in parse_starttag
    endpos = self.check_for_whole_start_tag(i)
  File "/usr/local/lib/python2.5/HTMLParser.py", line 302, in check_for_whole_start_tag
    raise AssertionError("we should not get here!")
AssertionError: we should not get here!
========
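
For anyone who wants to try the input without the attached minify.py, a
minimal sketch follows. It is not the attached script: EventRecorder and
try_parse are hypothetical names made up for illustration, and the import
is the Python 3 spelling (the traceback above is from Python 2.5, where
the module is named HTMLParser). It feeds the same text to the parser and
reports either the events seen or the name of the exception raised.
Current Python 3 parsers are expected to get through this input without
the AssertionError, though the exact events may differ between versions.

```python
from html.parser import HTMLParser  # on Python 2.x: from HTMLParser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical stand-in for minify.py: just records parser events."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def try_parse(text):
    """Feed text to the parser; return (events, exception name or None)."""
    parser = EventRecorder()
    try:
        parser.feed(text)
        parser.close()
    except Exception as exc:
        # Keep whatever was parsed before the failure, plus the error type.
        return parser.events, type(exc).__name__
    return parser.events, None


# Same content as test.html above.
events, error = try_parse("'foo <sc'+'ript>'\n")
print(events, error)
```

The leading text 'foo is reported as data before the parser reaches the
"<sc" that it takes for a start tag, which matches the partial output
shown before the traceback.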

----------
nosy: +cpalmer
versions: +Python 2.5 -Python 2.3
Added file: http://bugs.python.org/file12183/minify.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue670664>
_______________________________________

