toprettyxml messes up with whitespaces
Marc 'BlackJack' Rintsch
bj_666 at gmx.net
Wed Oct 3 08:49:51 EDT 2007
On Wed, 03 Oct 2007 12:18:45 +0200, Jorgen Bodde wrote:
>> Which part of the standard is this? Here's the XML 1.0 specification's
>> section on whitespace:
>>
>> http://www.w3.org/TR/2006/REC-xml-20060816/#sec-white-space
>
> Well 2.10 if I quote:
>
> <quote>
> Such white space is typically not intended for inclusion in the
> delivered version of the document. On the other hand, "significant"
> white space that should be preserved in the delivered version is
> common, for example in poetry and source code.
> </quote>
>
> I interpret "significant" whitespaces as the ones between the words,
> if whitespaces occur at the beginning of a line due to an indent like
Significant whitespace is all whitespace in nodes that may contain text.
You need a DTD or schema to decide this, which is why all pretty printing
without a DTD or schema is broken IMHO: without one you simply don't
know whether it is safe to strip or add whitespace.
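To make the point concrete, here is a minimal sketch using xml.dom.minidom. The sample document and the helper text_content() are made up for illustration, and the exact whitespace that toprettyxml() emits differs between Python versions, so the sketch only compares the text before and after a round trip rather than relying on a particular layout.

```python
from xml.dom import minidom


def text_content(node):
    """Concatenate the data of all text nodes below `node`, in document order."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_content(child) for child in node.childNodes)


# Mixed content: text and child elements side by side (hypothetical sample).
source = "<p>Hello <b>world</b>!</p>"

original = minidom.parseString(source)
pretty = original.toprettyxml(indent="  ")
reparsed = minidom.parseString(pretty)

before = text_content(original.documentElement)
after = text_content(reparsed.documentElement)

print(repr(before))
print(repr(after))
print("text changed by pretty printing:", before != after)
```

The pretty printer has no DTD or schema to consult, so it cannot know that <p> holds mixed content. It indents the children anyway, and the added newlines and spaces become part of the text once the output is parsed again.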
> <value>
> This is indented text
> </value>
>
> We can assume that the spaces in front of it are not significant
> whitespaces.
I can't. You are just guessing.
> Because when I read the text node in python and it is not
> included, I see no reason why it should be preserved.
But it should be included.
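A quick check with xml.dom.minidom, using the snippet quoted above with an assumed indent of four spaces, shows what a parser hands over when no DTD or schema says otherwise:

```python
from xml.dom import minidom

# The quoted example; the four-space indent is an assumption for illustration.
source = "<value>\n    This is indented text\n</value>"

doc = minidom.parseString(source)
text = doc.documentElement.firstChild.data

# repr() makes the leading newline and spaces visible.
print(repr(text))
print("indent preserved:", text.startswith("\n    "))
```

The newline and the indent are part of the text node. If they are missing on your side, something between the parser and your code stripped them.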
Ciao,
Marc 'BlackJack' Rintsch