[Python-checkins] r81503 - in python/branches/release26-maint: Lib/HTMLParser.py Lib/test/test_htmlparser.py Misc/ACKS Misc/NEWS

victor.stinner python-checkins at python.org
Mon May 24 23:42:59 CEST 2010


Author: victor.stinner
Date: Mon May 24 23:42:59 2010
New Revision: 81503

Log:
Merged revisions 81500-81501 via svnmerge from 
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81500 | victor.stinner | 2010-05-24 23:33:24 +0200 (Mon, 24 May 2010) | 2 lines
  
  Issue #6662: Fix parsing of malformatted charref (&#bad;)
........
  r81501 | victor.stinner | 2010-05-24 23:37:28 +0200 (Mon, 24 May 2010) | 2 lines
  
  Add the author of the last fix (Issue #6662)
........


Modified:
   python/branches/release26-maint/   (props changed)
   python/branches/release26-maint/Lib/HTMLParser.py
   python/branches/release26-maint/Lib/test/test_htmlparser.py
   python/branches/release26-maint/Misc/ACKS
   python/branches/release26-maint/Misc/NEWS

Modified: python/branches/release26-maint/Lib/HTMLParser.py
==============================================================================
--- python/branches/release26-maint/Lib/HTMLParser.py	(original)
+++ python/branches/release26-maint/Lib/HTMLParser.py	Mon May 24 23:42:59 2010
@@ -175,6 +175,9 @@
                     i = self.updatepos(i, k)
                     continue
                 else:
+                    if ";" in rawdata[i:]: #bail by consuming &#
+                        self.handle_data(rawdata[0:2])
+                        i = self.updatepos(i, 2)
                     break
             elif startswith('&', i):
                 match = entityref.match(rawdata, i)

Modified: python/branches/release26-maint/Lib/test/test_htmlparser.py
==============================================================================
--- python/branches/release26-maint/Lib/test/test_htmlparser.py	(original)
+++ python/branches/release26-maint/Lib/test/test_htmlparser.py	Mon May 24 23:42:59 2010
@@ -313,6 +313,13 @@
                 ("starttag", "html", [("foo", u"\u20AC&aa&unsupported;")])
                 ])
 
+    def test_malformatted_charref(self):
+        self._run_check("<p>&#bad;</p>", [
+            ("starttag", "p", []),
+            ("data", "&#bad;"),
+            ("endtag", "p"),
+        ])
+
 
 def test_main():
     test_support.run_unittest(HTMLParserTestCase)

Modified: python/branches/release26-maint/Misc/ACKS
==============================================================================
--- python/branches/release26-maint/Misc/ACKS	(original)
+++ python/branches/release26-maint/Misc/ACKS	Mon May 24 23:42:59 2010
@@ -191,7 +191,7 @@
 Andy Dustman
 Gary Duzan
 Eugene Dvurechenski
-Josip Dzolonga 
+Josip Dzolonga
 Maxim Dzumanenko
 Walter Dörwald
 Hans Eckardt
@@ -812,3 +812,4 @@
 Tarek Ziadé
 Peter Åstrand
 Jesse Noller
+Fredrik Håård

Modified: python/branches/release26-maint/Misc/NEWS
==============================================================================
--- python/branches/release26-maint/Misc/NEWS	(original)
+++ python/branches/release26-maint/Misc/NEWS	Mon May 24 23:42:59 2010
@@ -55,6 +55,9 @@
 Library
 -------
 
+- Issue #6662: Fix parsing of malformatted charref (&#bad;), patch written by
+  Fredrik Håård
+
 - Issue #1628205: Socket file objects returned by socket.socket.makefile() now
   properly handles EINTR within the read, readline, write & flush methods.
   The socket.sendall() method now properly handles interrupted system calls.
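
For readers who want to reproduce the behaviour exercised by test_malformatted_charref, here is a minimal sketch. It assumes Python 3, where the module patched above lives on as html.parser, so it illustrates the expected parsing result rather than the 2.6 code path itself. The names EventCollector and collect_events are hypothetical helpers written for this illustration (loosely modelled on the test suite's collector); they are not part of the commit. Adjacent data chunks are merged because the parser may deliver text in more than one handle_data call.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: record callbacks as tuples, merging adjacent data."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("starttag", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("endtag", tag))

    def handle_data(self, data):
        # Text can arrive in several pieces; join consecutive pieces so the
        # result does not depend on how the parser happened to chunk it.
        if self.events and self.events[-1][0] == "data":
            self.events[-1] = ("data", self.events[-1][1] + data)
        else:
            self.events.append(("data", data))


def collect_events(source):
    """Feed the whole source, flush the parser, and return recorded events."""
    parser = EventCollector()
    parser.feed(source)
    parser.close()  # flush anything still buffered, e.g. after a bad charref
    return parser.events


events = collect_events("<p>&#bad;</p>")
for event in events:
    print(event)
```

The malformed reference should come through as literal text between the start and end tags, which is the same event sequence the new test expects.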

