[XML-SIG] yet another question.

Andrew M. Kuchling akuchlin@cnri.reston.va.us
Tue, 8 Sep 1998 15:33:24 -0400 (EDT)


Michael Sobolev writes:
>    <foo>
>        <bar>1</bar>
>        <bar>2</bar>
>    </foo>
>This is fine since expat is not a validating parser.  What should I
>expect from a validating one?  After the declaration, foo cannot have
>any pcdata at all. 

Consult the annotated XML spec at www.xml.com.  Section 2.10 discusses
this:

	An XML processor must always pass all characters in a document
	that are not markup through to the application. A validating
	XML processor must also inform the application which of these
	characters constitute white space appearing in element
	content.

"Element content" is defined in Section 3.2.1 as:

	An element type has element content when elements of that type must
	contain only child elements (no character data), optionally
	separated by white space (characters matching the nonterminal
	S).

So a validating parser must still tell the application that this
whitespace is present, though it might not use the same mechanism it
uses for #PCDATA content.  In the SAX interface, for example, the
parser would report it through a separate method called
ignorableWhitespace instead of the usual characters method.  I'd
imagine that few applications will care about this, since few will treat
<bar>1</bar>\n<bar>2</bar> differently from <bar>1</bar><bar>2</bar>.
XML editors are probably the big exception to this, since an editor
would want to preserve whitespace when editing a document.
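
To make the two SAX callbacks concrete, here is a minimal sketch using
the xml.sax module from the Python standard library.  The class name
WhitespaceLogger and the DOC string are made up for illustration; the
handler only records which callback delivered each chunk of text.
The default parser behind xml.sax is expat-based and non-validating,
so this shows the non-validating side of the story only.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class WhitespaceLogger(ContentHandler):
    """Hypothetical handler: records which callback delivered each text chunk."""

    def __init__(self):
        super().__init__()
        self.chars = []      # chunks delivered via characters()
        self.ignorable = []  # chunks delivered via ignorableWhitespace()

    def characters(self, content):
        # A parser may split one run of text into several calls.
        self.chars.append(content)

    def ignorableWhitespace(self, whitespace):
        # A validating parser would use this for whitespace in element content.
        self.ignorable.append(whitespace)


# The document from the question, without a DTD (illustrative only).
DOC = b"<foo>\n    <bar>1</bar>\n    <bar>2</bar>\n</foo>"

handler = WhitespaceLogger()
xml.sax.parseString(DOC, handler)

print("characters():", handler.chars)
print("ignorableWhitespace():", handler.ignorable)
```

With the expat-based reader I'd expect every chunk, including the
newlines and indentation between the <bar> elements, to arrive through
characters().  A validating SAX parser given a DTD that declares foo
with element content would be the one to route that whitespace to
ignorableWhitespace() instead.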

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
Most of my ideas were rejected and I got used to it. One can get fond of
almost anything, even rejection.
    -- Tom Baker, in his autobiography