[Python-bugs-list] [ python-Bugs-741029 ] HTMLParser -- possible bug in handle_comment

Wed, 21 May 2003 13:04:01 -0700

Bugs item #741029, was opened at 2003-05-21 05:35
Message generated for change (Comment added) made by logistix
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=741029&group_id=5470

Category: Python Library
Group: Python 2.2.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Scott Israel (scott_israel)
Assigned to: Nobody/Anonymous (nobody)
Summary: HTMLParser -- possible bug in handle_comment

Initial Comment:
>>> import HTMLParser
>>> class Parser(HTMLParser.HTMLParser):
	def __init__(self):
		HTMLParser.HTMLParser.__init__(self)
	def handle_data(self,data):
		print 'DATA: %s' % data
	def handle_comment(self,comment):
		print 'COMMENT: %s' % comment

>>> test3='<STYLE><!-- This is a comment --> </STYLE>'
>>> p=Parser()
>>> p.feed(test3)
DATA: <!-- This is a comment -->

Is this a bug?

----------------------------------------------------------------------

Comment By: logistix (logistix)
Date: 2003-05-21 15:04

Message:
No, this is not a bug.  <style> is one of the elements whose 
content is treated as CDATA, so comment markers inside it are 
not parsed as comments and are passed through as data.  This 
was done to 'enable legacy support' by allowing authors to 
write:

<style>
<!--
body{dd:00;}
-->
</style>

Without the comment markers, legacy browsers that do not 
recognize <style> would display the text "body{dd:00;}" on 
the rendered webpage.

HTML Spec reference is here:
http://www.w3.org/TR/html4/present/styles.html#h-14.5
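
For comparison, here is a minimal sketch of the same behaviour, assuming 
a current Python 3 interpreter, where the module is html.parser rather 
than the HTMLParser module used in the report.  EventRecorder and 
record() are hypothetical names made up for this illustration.  It feeds 
the same comment once inside <style> and once inside <p>, and records 
which handler receives the text:

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: records which handler received each piece of text."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_comment(self, comment):
        self.events.append(("comment", comment))


def record(markup):
    """Parse markup and return the list of (handler, text) events."""
    parser = EventRecorder()
    parser.feed(markup)
    parser.close()
    return parser.events


# Inside <style> the content is CDATA, so the comment markers arrive as data.
style_events = record("<style><!-- This is a comment --></style>")

# Inside an ordinary element the same text is reported as a comment.
para_events = record("<p><!-- This is a comment --></p>")

print(style_events)
print(para_events)
```

In the <style> case the text should reach handle_data with the comment 
markers still attached, while in the <p> case it should reach 
handle_comment with the markers stripped.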

----------------------------------------------------------------------
