"<!" in SGMLParser - an error ?

David Bolen db3l at fitlinxx.com
Mon Nov 12 17:06:37 EST 2001


David Eppstein <eppstein at ics.uci.edu> writes:

> In article <j4pu6p796i.fsf at informatik.hu-berlin.de>,
>  Martin von Loewis <loewis at informatik.hu-berlin.de> wrote:
> 
> > It *does* have a problem with "<! bla bla> ", since it does not allow
> > a space character between the exclamation mark and the first character
> > of the directive. I believe you must not have a space in there, in
> > SGML, either, so this is not a problem with Python, but an error in
> > the document.
> 
> He is parsing HTML.  Of course he is going to have errors in the documents.

I don't believe that's true of "HTML" as such, which is by definition
an SGML application: a valid HTML document is a proper SGML document
(if you accept the SGML DTD for HTML with its associated declarations,
e.g., optional end tags and so on).

However, if you mean that today's browsers are often capable of
parsing malformed HTML, and thus people get away with invalid HTML in
web pages, I'd agree.  But that's not the same as saying that HTML
implies errors in the documents from an SGML perspective.  A document
with such errors wasn't valid HTML in the first place.
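
To illustrate the "tolerant parser" side of this, here is a minimal
sketch using Python 3's html.parser.HTMLParser. Note the assumptions:
this is not the SGMLParser discussed in this thread, and the
DeclarationLogger class and classify() function are hypothetical names
made up for the example. As far as I can tell, this parser does not
reject "<! bla bla>" but reports it through handle_comment(), much as
a browser would quietly skip it. That makes the input recoverable, not
valid.

```python
from html.parser import HTMLParser


class DeclarationLogger(HTMLParser):
    """Hypothetical helper: records how the parser classifies each piece."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_comment(self, data):
        # Malformed "<! ...>" constructs are expected to land here.
        self.events.append(("comment", data))

    def handle_decl(self, decl):
        # Well-formed declarations such as <!DOCTYPE html> land here.
        self.events.append(("decl", decl))


def classify(markup):
    """Feed markup to the parser and return the recorded events."""
    parser = DeclarationLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# The space after "<!" is the construct questioned in this thread.
events = classify("<p>before<! bla bla>after</p>")
for kind, value in events:
    print(kind, repr(value))
```

The design point is only that accepting the input and the input being
valid are separate questions: a strict SGML parser is entitled to
complain about the same markup.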

--
-- David
-- 
/-----------------------------------------------------------------------\
 \               David Bolen            \   E-mail: db3l at fitlinxx.com  /
  |             FitLinxx, Inc.            \  Phone: (203) 708-5192    |
 /  860 Canal Street, Stamford, CT  06902   \  Fax: (203) 316-5150     \
\-----------------------------------------------------------------------/


