[Python-bugs-list] [ python-Bugs-741029 ] HTMLParser -- possible bug in handle_comment
SourceForge.net
noreply@sourceforge.net
Wed, 21 May 2003 13:04:01 -0700
Bugs item #741029, was opened at 2003-05-21 05:35
Message generated for change (Comment added) made by logistix
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=741029&group_id=5470
Category: Python Library
Group: Python 2.2.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Scott Israel (scott_israel)
Assigned to: Nobody/Anonymous (nobody)
Summary: HTMLParser -- possible bug in handle_comment
Initial Comment:
>>> import HTMLParser
>>> class Parser(HTMLParser.HTMLParser):
        def __init__(self):
            HTMLParser.HTMLParser.__init__(self)
        def handle_data(self,data):
            print 'DATA: %s' % data
        def handle_comment(self,comment):
            print 'COMMENT: %s' % comment

>>> test3='<STYLE><!-- This is a comment --></STYLE>'
>>> p=Parser()
>>> p.feed(test3)
DATA: <!-- This is a comment -->
Is this a bug?
----------------------------------------------------------------------
Comment By: logistix (logistix)
Date: 2003-05-21 15:04
Message:
Logged In: YES
user_id=699438
No. <style> is one of the elements whose content is CDATA, so
comment markup inside it is not parsed as a comment and is
passed through as data. This was done to 'enable legacy
support' by allowing authors to write:
<style>
<!--
body{dd:00;}
-->
</style>
Without the comments, most legacy browsers would display
the text "body{dd:00;}" on the rendered webpage.
HTML Spec reference is here:
http://www.w3.org/TR/html4/present/styles.html#h-14.5
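For anyone who wants to check this locally, here is a rough sketch of the same test. It assumes Python 3, where the module lives at html.parser rather than HTMLParser. The Recorder class and the parse() helper are made-up names for illustration only. It feeds the same comment once inside <STYLE> and once inside <P>, and records what the parser reports as data and what it reports as comments.

```python
from html.parser import HTMLParser  # assumption: Python 3 location of the module


class Recorder(HTMLParser):
    """Hypothetical helper: records what is reported as data and as comments."""

    def __init__(self):
        super().__init__()
        self.data = []
        self.comments = []

    def handle_data(self, data):
        self.data.append(data)

    def handle_comment(self, comment):
        self.comments.append(comment)


def parse(markup):
    """Parse markup and return (all data joined together, list of comments)."""
    parser = Recorder()
    parser.feed(markup)
    parser.close()
    return "".join(parser.data), parser.comments


# Same comment text, first inside a CDATA element, then inside an ordinary one.
inside_style = parse("<STYLE><!-- This is a comment --></STYLE>")
outside_style = parse("<P><!-- This is a comment --></P>")

print("inside <STYLE>:", inside_style)
print("inside <P>:    ", outside_style)
```

Inside <STYLE> the comment markup should come back through handle_data with its delimiters intact, while inside <P> it should go to handle_comment with the delimiters stripped.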
----------------------------------------------------------------------