XML and newlines

Hans Nowak wurmy at earthlink.net
Sat Feb 16 18:02:27 EST 2002


"Martin v. Loewis" wrote:
> 
> Hans Nowak <wurmy at earthlink.net> writes:
> 
> > It was part of a previous problem; the XML supposedly only contained \n,
> > where \r\n was desirable. While investigating it, I stumbled upon a
> > different, more serious problem-- no newlines at all.
> 
> This is all intentional. In content, the XML processor *must*
> normalize newline characters, replacing different variants of newline
> with #xA. In *attribute values* only, newlines are further normalized
> to space characters.  So if you have content where newline characters
> are relevant, don't put that into attributes. If you have data which
> are that sensitive to their binary representation, you may consider
> not using XML at all, or choosing a binary-safe representation, such
> as base64-encoded strings.

I don't have much choice, I'm afraid... since I don't have an SQL
Server 2000 client, SQLXML is the only way I can talk to this
database. On top of that, I'm dependent on the XML returned by 
the server. If you have a table X with fields A, B and C, you'll
get the contents of a record like this:

  <X A="foo" B="bar" C="baz" />

and since this string-with-newlines is a value of a field in a
table, it is returned as an attribute, whether I want that or
not. :-(
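
For anyone who wants to see the effect, here is a minimal sketch of the
normalization described above. It assumes a Python that ships
xml.etree.ElementTree (my choice for illustration only, not the parser
I was actually using), and the record contents are made up. A literal
newline inside an attribute comes back as a space, while the same
newline written as the character reference &#10; survives:

```python
import xml.etree.ElementTree as ET

# Hypothetical record: A holds a literal newline, B holds the same
# newline written as a character reference.
doc = '<X A="line one\nline two" B="line one&#10;line two" />'
record = ET.fromstring(doc)

a_value = record.get("A")   # literal newline is normalized to a space
b_value = record.get("B")   # character reference is kept as a newline

# Element content, by contrast, keeps its line breaks; \r\n is only
# normalized to a single \n.
content = ET.fromstring("<X>line one\r\nline two</X>").text

print(repr(a_value))
print(repr(b_value))
print(repr(content))
```

So the newlines would only survive if the server wrote them as
character references, which is not something I control.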

I guess I'll use regular expressions to extract the data. That
does work, although it's not pretty.
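
Roughly what I have in mind is sketched below. The function name, the
pattern, and the sample record are placeholders of my own, and it
assumes the values are always double-quoted. It works on the raw text
before any XML parser touches it, so the newlines inside the attribute
values are left alone:

```python
import re
from xml.sax.saxutils import unescape

# Matches name="value". The [^"]* part also spans line breaks, so
# newlines inside a value are preserved exactly as the server sent them.
ATTR_RE = re.compile(r'([\w:.-]+)\s*=\s*"([^"]*)"')

def extract_attributes(raw_xml):
    """Return {name: value} for the double-quoted attributes in raw_xml."""
    # unescape() handles &amp;, &lt; and &gt; itself; the quote entities
    # have to be passed in explicitly.
    extra = {"&quot;": '"', "&apos;": "'"}
    return {name: unescape(value, extra)
            for name, value in ATTR_RE.findall(raw_xml)}

raw = '<X A="foo" B="first line\r\nsecond line" C="a &amp; b" />'
fields = extract_attributes(raw)
print(repr(fields["B"]))
```

This ignores single-quoted values and numeric character references,
and it expects to be handed one element at a time, so it is only good
enough for output whose shape you already know.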

Thanks for the clarifications,

-- 
Hans (base64.decodestring('d3VybXlAZWFydGhsaW5rLm5ldA==')) 
# decode for email address ;-)
The Pythonic Quarter:: http://www.awaretek.com/nowak/


