Idempotent XML processing

Michael Ekstrand mekstran at scl.ameslab.gov
Fri Aug 19 12:21:54 EDT 2005


Hello all,

In my current project, I am working with XML data in a protocol that
verifies a checksum/signature over a portion of the document. There is
an envelope with a header element containing the signature data,
followed by a body. The signatures are computed as cryptographic
checksums of the entire Body element, including start and end tags,
exactly as it appears in the data transmission.
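
To make the byte-exactness requirement concrete, here is a minimal sketch
of the verification step. The hash algorithm (SHA-1), the hex encoding,
and the name body_matches are assumptions for illustration only; the
actual protocol may use something different.

```python
import hashlib
import hmac


def body_matches(raw_body: bytes, expected_hex: str) -> bool:
    """Compare the digest of the raw Body bytes with the one from the header."""
    # Assumption: SHA-1 with a hex-encoded digest; substitute the real algorithm.
    actual = hashlib.sha1(raw_body).hexdigest()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_hex.lower())


# The bytes as transmitted, and a logically equivalent reserialization.
sent = b"<Body><Item></Item></Body>"
reserialized = b"<Body><Item/></Body>"
digest = hashlib.sha1(sent).hexdigest()

print(body_matches(sent, digest))          # True
print(body_matches(reserialized, digest))  # False
```

Both byte strings describe the same XML content, but only the one that
matches the transmission byte for byte passes the check.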

Therefore, I need to extract the exact original text of an element of
an XML document, byte for byte. I have a function that scans the XML
string and does this, but it feels like a rather clumsy approach. I've
been playing with xml.dom.minidom and its toxml() method, but to no
avail: the server sends me XML with empty elements as full open/close
tags (<Element></Element>), whereas toxml() serializes them as XML
empty-element tags (<Element/>), so the checksum winds up not matching.
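
The mismatch can be reproduced with a short sketch. The sample document
below is made up for illustration; only the minidom calls are real.

```python
from xml.dom import minidom

# Hypothetical sample: an empty element written with full open/close tags.
received = "<Body><Item></Item></Body>"

doc = minidom.parseString(received)
roundtrip = doc.documentElement.toxml()

print(roundtrip)              # <Body><Item/></Body>
print(roundtrip == received)  # False
```

The parsed tree has no record of how the empty element was originally
spelled, so the serializer picks its own form.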

Is there some parsing mechanism (PyXML or any other freely usable
third-party library is an option) that will let me get at the original
text of an element? Or am I best off sticking with my little string
scanning function?
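
One possible direction, sketched under assumptions: let a real parser
find the element boundaries and then slice the original bytes, rather
than reserializing. The sketch uses the CurrentByteIndex attribute of
xml.parsers.expat; the function name extract_raw_element is
hypothetical. It assumes an ASCII-compatible encoding such as UTF-8 and
that the target element either has a separate end tag or has no '>'
inside its attribute values.

```python
import xml.parsers.expat


def extract_raw_element(data: bytes, name: str) -> bytes:
    """Return the untouched bytes of the first element called `name`."""
    parser = xml.parsers.expat.ParserCreate()
    span = {"start": None, "end": None, "depth": 0}

    def on_start(tag, attrs):
        if span["end"] is not None:
            return  # target already captured
        if span["start"] is not None:
            span["depth"] += 1  # element nested inside the target
        elif tag == name:
            # Byte offset of the '<' that opens the target's start tag.
            span["start"] = parser.CurrentByteIndex
            span["depth"] = 1

    def on_end(tag):
        if span["start"] is None or span["end"] is not None:
            return
        span["depth"] -= 1
        if span["depth"] == 0:
            # Byte offset of the '<' of the tag that closes the target.
            # An end tag cannot contain '>', so the next '>' terminates it.
            close = parser.CurrentByteIndex
            span["end"] = data.index(b">", close) + 1

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(data, True)

    if span["end"] is None:
        raise ValueError("element %r not found" % name)
    return data[span["start"]:span["end"]]


# Hypothetical message layout for illustration.
message = (b"<Envelope><Header><Sig>abc</Sig></Header>"
           b"<Body><Item></Item></Body></Envelope>")
print(extract_raw_element(message, "Body"))
# b'<Body><Item></Item></Body>'
```

Because the result is a slice of the input, empty elements, whitespace,
and attribute quoting come back exactly as transmitted. The parser is
created without namespace processing, so a prefixed element must be
requested by its qualified name as written in the document.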

TIA,
Michael



