parsing XML - getting lots of white space
Robert Roy
rjroy at takingcontrol.com
Sun Nov 5 09:12:49 EST 2000
On Wed, 1 Nov 2000 19:22:30 -0000, "Tom wright"
<thomas.wright1 at ntlworld.REMOVETHIS.com> wrote:
>hi all,
>
>when parsing the following message
>
><?xml version="1.0"?>
><ServerMessage>
> <Command fromDirection="North">AddUser</Command>
> <User>
> <UserName>$userName</UserName>
> <UserId>$userId</UserId>
> </User>
></ServerMessage>
>
[snip]
>Is there a way to lose the '\012' strings and the space-filled string?
>And why is everything in unicode?
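The XML standard specifies that all whitespace is to be passed on to
the processing application. This differs from SGML, where extraneous
whitespace is not passed on. The '\012' strings and the runs of spaces
you are seeing are the line breaks and indentation between your
elements.
From section 2.10 of the XML spec: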
"An XML processor must always pass all characters in a document that
are not markup through to the application. A validating XML processor
must also inform the application which of these characters constitute
white space appearing in element content."
In the Annotated XML spec, Tim Bray discusses this.
http://www.xml.com/axml/testaxml.htm
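Since the parser is required to hand the whitespace over, getting rid
of it is the application's job. Here is a minimal sketch of one way to
do that after parsing. It assumes xml.dom.minidom from the standard
library (the original post does not say which parser is in use), and
strip_whitespace_nodes is a hypothetical helper name, not part of any
library.

```python
from xml.dom import minidom

# The message from the original post, with its indentation intact.
MESSAGE = """<?xml version="1.0"?>
<ServerMessage>
 <Command fromDirection="North">AddUser</Command>
 <User>
  <UserName>$userName</UserName>
  <UserId>$userId</UserId>
 </User>
</ServerMessage>
"""


def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace.

    Hypothetical helper for illustration; it modifies the tree in place.
    """
    # Iterate over a copy, because removeChild mutates node.childNodes.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE:
            if not child.data.strip():
                node.removeChild(child)
        else:
            strip_whitespace_nodes(child)


doc = minidom.parseString(MESSAGE)
strip_whitespace_nodes(doc)

root = doc.documentElement
# Only element children remain under the root once the
# newline-and-indent text nodes are gone.
child_names = [child.nodeName for child in root.childNodes]
user_name = root.getElementsByTagName("UserName")[0].firstChild.data
```

This is only safe where whitespace between elements carries no
meaning, as in the message above. In mixed content (text and elements
interleaved) a whitespace-only node can be significant, so you would
not want to strip it blindly.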