XML and newlines

Hans Nowak wurmy at earthlink.net
Fri Feb 15 20:51:28 EST 2002


Howdy y'all,

Recently I've been dabbling with XML and the xml.dom.minidom
module, with good results so far, except for one problem that
has me stumped.

For my current project I need to store the values of an
instance's attributes in XML. We're using SQLXML to talk to
an SQL Server 2000 database, and changes in tables are made
by sending updategrams in XML format. To store an instance
of an object in the database, all its attributes need to be
captured in an XML "expression". This works well for all
attributes (mostly just strings and numbers) except one, a
long multiline string containing newlines.
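
To make the setup concrete, here is a minimal sketch of capturing an
instance's attributes as XML with xml.dom.minidom. The Note class, the
"record" tag and the instance_to_xml helper are hypothetical names made
up for illustration; this is plain XML, not the actual updategram
layout that SQLXML expects.

```python
from xml.dom import minidom


class Note:
    """Hypothetical object whose attributes should end up in XML."""

    def __init__(self, title, priority, body):
        self.title = title
        self.priority = priority
        self.body = body  # the long multiline string


def instance_to_xml(obj, tag="record"):
    """Serialize each instance attribute as a child element of <tag>."""
    doc = minidom.Document()
    root = doc.createElement(tag)
    doc.appendChild(root)
    for name, value in sorted(vars(obj).items()):
        child = doc.createElement(name)
        # createTextNode stores the raw string; special characters
        # such as "&" are escaped when the tree is written out.
        child.appendChild(doc.createTextNode(str(value)))
        root.appendChild(child)
    return root.toxml()


note = Note("fish & chips", 3, "one\ntwo\nthree")
xml_text = instance_to_xml(note)
print(xml_text)
```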

I can store that string just fine, but when I retrieve the
record, all the newlines are gone. Since I wasn't sure
whether the newlines were being sent in the first place, I
used the <?lb?> processing instruction, as seen in the book
"Learning XML". :-) So that part of the XML may look like this:

  one
  <?lb?>
  two
  <?lb?>
  three

This at least stores the newlines, and when I get back the
raw XML from the database (when retrieving a record), the
newlines are there. There are also a lot of unwanted tabs,
but those are easy to get rid of. But as soon as I parse the
XML with xml.dom.minidom.parseString, all these newlines are
gone.
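
For what it's worth, minidom exposes a processing instruction such as
<?lb?> as a node of its own rather than as text, so code that reads
only the text nodes never sees it. The following is a minimal sketch,
not a definitive fix: the <notes> element and the text_with_linebreaks
helper are hypothetical, and stripping each text chunk is an assumption
that the surrounding tabs and newlines are just indentation.

```python
from xml.dom import minidom, Node


def text_with_linebreaks(element):
    """Join an element's text, turning each <?lb?> into a newline."""
    parts = []
    for child in element.childNodes:
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
            # Assumption: leading/trailing whitespace is indentation only.
            parts.append(child.data.strip())
        elif (child.nodeType == Node.PROCESSING_INSTRUCTION_NODE
              and child.target == "lb"):
            parts.append("\n")
    return "".join(parts)


raw = "<notes>\n\tone\n\t<?lb?>\n\ttwo\n\t<?lb?>\n\tthree\n</notes>"
doc = minidom.parseString(raw)
result = text_with_linebreaks(doc.documentElement)
print(repr(result))
```

Entities such as "&amp;" are already decoded by the parser here, so
no manual conversion is needed on this path.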

Right now I have a temporary solution that extracts the
desired data (including the newlines) from the raw XML string
with a regular expression. This is not very elegant, though,
and I still have to manually convert things like "&amp;" and
friends. Is there a better way to do this? What am I
missing?
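
Roughly, the regex workaround looks like the sketch below. The
extract_field name, the <notes> tag and the sample data are
hypothetical stand-ins; xml.sax.saxutils.unescape is used here in
place of hand-written entity replacement, and it only covers
"&amp;", "&lt;" and "&gt;" unless given extra entities.

```python
import re
from xml.sax.saxutils import unescape


def extract_field(raw_xml, tag):
    """Pull the content of the first <tag> element out of raw XML text."""
    name = re.escape(tag)
    pattern = re.compile(r"<%s\b[^>]*>(.*?)</%s>" % (name, name), re.DOTALL)
    match = pattern.search(raw_xml)
    if match is None:
        return None
    # Each <?lb?> marks a line break; strip the tabs and newlines
    # around each piece, then decode the entities.
    pieces = match.group(1).split("<?lb?>")
    return "\n".join(unescape(piece.strip()) for piece in pieces)


raw = "<row><notes>\n\tfish &amp; chips\n\t<?lb?>\n\ttwo\n</notes></row>"
field = extract_field(raw, "notes")
print(repr(field))
```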

TIA,

-- 
Hans (base64.decodestring('d3VybXlAZWFydGhsaW5rLm5ldA==')) 
# decode for email address ;-)
The Pythonic Quarter:: http://www.awaretek.com/nowak/


