[XML-SIG] xml.dom.minidom.toprettyxml whitespace question
Malcolm Tredinnick
malcolm at commsecure.com.au
Tue Jan 4 04:23:31 CET 2005
On Mon, 2005-01-03 at 13:29 -0500, Andy Meyer wrote:
> Hello all,
>
> I have a question about the xml.dom.minidom.toprettyxml method's
> insertion of whitespace into text elements, e.g.
> '<foo><bar>Hello!</bar></foo>' getting transformed by toprettyxml to:
>
> <foo>
>     <bar>
>         Hello!
>     </bar>
> </foo>
>
> with the addition of tabs and newlines around 'Hello!', instead of:
>
> <foo>
>     <bar>Hello!</bar>
> </foo>
>
> Since a SAX-style parser would read the second example as identical to
> the raw XML, to me the second way is more correct than the first, but
> I'm new to XML and handling whitespace seems to be an unresolved issue.
> Is this behavior by design?
Rich has already answered your question, but I thought I would just
point out that, in fact, the second example would not generally produce
the same SAX events as the raw XML. For the raw XML, you would see
(as a rough summary of the SAX events):
- start "foo" element
- start "bar" element
- characters ("Hello!")
- end "bar" element
- end "foo" element
whereas the second layout will produce:
- start "foo" element
- characters (newline + tabs or spaces)
- start "bar" element
- characters ("Hello!")
- end "bar" element
- characters (newline)
- end "foo" element
In other words, the SAX parser will not normally discard whitespace that
your eye will gloss over; it cannot tell that it is not significant.
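The difference between the two event streams can be checked with a small
script. This is only a sketch using Python's standard xml.sax module; the
EventRecorder class and sax_events function are hypothetical helpers written
for this illustration, and the "pretty" string assumes a single tab of
indentation. Consecutive character events are merged because a parser is
free to deliver one run of text in several chunks.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Record a simplified list of SAX events (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # A parser may split one run of text into several callbacks,
        # so merge consecutive character events into a single entry.
        if self.events and self.events[-1][0] == "characters":
            previous = self.events[-1][1]
            self.events[-1] = ("characters", previous + content)
        else:
            self.events.append(("characters", content))


def sax_events(xml_text):
    """Parse xml_text and return the recorded event list."""
    handler = EventRecorder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


raw = "<foo><bar>Hello!</bar></foo>"
pretty = "<foo>\n\t<bar>Hello!</bar>\n</foo>"  # assumed layout: one tab

if __name__ == "__main__":
    for label, text in (("raw", raw), ("pretty", pretty)):
        print(label)
        for event in sax_events(text):
            print("  ", event)
```

With no DTD in play, the second document should show two extra whitespace-only
"characters" entries, one before the start of "bar" and one after its end.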
Cheers,
Malcolm