sgmllib.SGMLParseError: unexpected ':' char in declaration

Carl Banks imbosol-1045933914 at aerojockey.com
Sat Feb 22 12:20:34 EST 2003


Alessio Pace wrote:
> I can't figure out how to solve this error raised in my application. I'm
> trying to use htmllib.HTMLParser; it usually works fine, but in some
> cases (I don't know why; I am processing hundreds of HTML texts and will
> debug them case by case later) it raises this:
> 
> Traceback (most recent call last):
> [......... ]
>  File "Html2Txt.py", line 45, in convertToTxt          # my class Html2Txt
>    parser.close()                  # close the htmllib.HTMLParser
>  File "/usr/lib/python2.2/sgmllib.py", line 99, in close
>    self.goahead(1)
>  File "/usr/lib/python2.2/sgmllib.py", line 161, in goahead
>    k = self.parse_declaration(i)
>  File "/usr/lib/python2.2/markupbase.py", line 96, in parse_declaration
>    self.error(
>  File "/usr/lib/python2.2/sgmllib.py", line 102, in error
>    raise SGMLParseError(message)
> sgmllib.SGMLParseError: unexpected ':' char in declaration
> 
> Thanks if someone can help me, I am a newbie to Python.


I'm guessing there's a comment in your HTML files that is spelled like
this:

<! --  blah blah blah : blah blah blah -- >

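I'm not an expert in SGML, but I do know that its definition of a
comment is often misunderstood.  I think the above is a valid comment,
but sgmllib doesn't parse it as one.  Your traceback ends in
parse_declaration, so the parser is treating something as a
declaration and choking on the colon, resulting in your error.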
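
If you want to hunt for the culprit before touching the parser, a
rough sketch like the following might help.  It just uses a regular
expression to list every "<! ... >" span that contains a colon, along
with its line number.  The function name find_suspect_declarations is
made up for this example, and the regex is deliberately naive (it stops
at the first ">"), so treat the hits as candidates to look at, not as
proof.  A DOCTYPE with a URL in it will show up too.

```python
import re

# Naive match for anything that starts with "<!" and runs to the next ">".
# This is a diagnostic aid only; it is not a real SGML tokenizer.
DECL_RE = re.compile(r"<!.*?>", re.DOTALL)

def find_suspect_declarations(html):
    """Return (line_number, text) for each '<!...>' span containing ':'."""
    hits = []
    for m in DECL_RE.finditer(html):
        if ":" in m.group(0):
            # Line numbers are 1-based, counted from newlines before the match.
            line_no = html.count("\n", 0, m.start()) + 1
            hits.append((line_no, m.group(0)))
    return hits

sample = "<html>\n<! --  blah : blah -- >\n<p>a: b</p>\n<!-- fine -->\n"
for line_no, text in find_suspect_declarations(sample):
    print(line_no, text)
```

Running it over each of your HTML texts should narrow things down to
the handful that contain such a span.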

The other possibility is that your file has one of those silly
<!DOCTYPE "blah blah blah"> things at the top that no one knows what
the hell they are.  Maybe it has a colon in it that sgmllib considers
an error.
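
To check the DOCTYPE theory, here is a minimal sketch that looks for a
colon in the DOCTYPE outside of quoted strings.  That a colon inside
quotes is harmless to the parser is my assumption, not something I've
verified against your Python version, and doctype_has_bare_colon is a
hypothetical helper name.

```python
import re

# First <!DOCTYPE ...> in the text; assumes no ">" inside the declaration.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\b[^>]*>", re.IGNORECASE)
# A double-quoted or single-quoted string literal.
QUOTED_RE = re.compile(r"\"[^\"]*\"|'[^']*'")

def doctype_has_bare_colon(html):
    """True if the first DOCTYPE contains ':' outside quoted strings."""
    m = DOCTYPE_RE.search(html)
    if m is None:
        return False
    # Drop the quoted parts (assumed to be safe), then look for a colon.
    unquoted = QUOTED_RE.sub("", m.group(0))
    return ":" in unquoted

quoted = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">'
bare = "<!DOCTYPE HTML PUBLIC -//W3C//DTD HTML 4.01//EN http://www.w3.org/TR/html4/strict.dtd>"
print(doctype_has_bare_colon(quoted))
print(doctype_has_bare_colon(bare))
```

If this flags a file, fixing or stripping its DOCTYPE before feeding
the text to the parser would be the thing to try first.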


-- 
CARL BANKS



