"<!" in SGMLParser - an error ?

Martin von Loewis loewis at informatik.hu-berlin.de
Sun Nov 11 08:01:57 EST 2001


Amit Weisman <weismann at netvision.net.il> writes:

> But with a string that contains "< bla bla bla>" or "<! bla bla> "
> I get an error message ->

Please, PLEASE, report the actual content of the document you are
trying to parse (exact URL, if possible), instead of making up an
example.

sgmllib has no problems with parsing "< bla bla bla>", since this is
treated as opening the "bla" element, with two bla attributes with no
value. I don't believe you when you say that you get an error in this
case.

It *does* have a problem with "<! bla bla> ", since it does not allow
a space character between the exclamation mark and the first character
of the directive. I believe SGML itself does not permit a space there
either, so this is not a problem with Python, but an error in the
document.

If you still want to process this very document, you will likely have
to redefine parse_declaration in a subclass. I recommend studying the
source of sgmllib.py to find out how to do this, since parsing
non-SGML documents with sgmllib is not supported.
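
If redefining parse_declaration turns out to be too involved, a
different workaround (not the one described above) is to repair the
text before the parser ever sees it. The following is only a minimal
sketch under that assumption: the helper name strip_declaration_space
and its regular expression are made up for illustration and are not
part of sgmllib. It drops the whitespace between "<!" and a following
letter, and leaves comments and well-formed declarations alone.

```python
import re

# Hypothetical helper, not part of sgmllib.
# Matches "<!" followed by whitespace and then a letter, i.e. a malformed
# declaration such as "<! bla bla>". The lookahead keeps the letter itself
# out of the match, so only "<!" plus the whitespace is replaced.
_DECL_SPACE = re.compile(r"<!\s+(?=[A-Za-z])")


def strip_declaration_space(text):
    """Return text with whitespace removed between "<!" and the name."""
    return _DECL_SPACE.sub("<!", text)


if __name__ == "__main__":
    # The malformed declaration from the original question.
    print(repr(strip_declaration_space("<! bla bla> ")))
    # A well-formed declaration and a comment pass through unchanged.
    print(repr(strip_declaration_space("<!DOCTYPE html>")))
    print(repr(strip_declaration_space("<!-- a comment -->")))
```

The cleaned string can then be passed to the parser's feed() method as
usual. Note that this changes the input rather than the parser, so it
only makes sense if you are sure the space is a mistake in the
document.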

Regards,
Martin


