[XML-SIG] Re: Working with non-compliant XML utilities

Fredrik Lundh fredrik at pythonware.com
Thu Dec 11 02:31:52 EST 2003


Lowell Alleman wrote:

> I'm working with an application that is very picky about the XML it accepts
> (basically it's non-compliant).  The company's support team isn't giving me
> many options.  Certain things that the XML spec say the parser shouldn't
> care about, this utility cares about.  Things like the order of attributes
> and whether an empty element is written as "<a></a>" or "</a>" need to be
> presented in a specific way.
>
> Any ideas on how to work around some of these issues.  Python XML tools
> would be preferred, but at this point all ideas and/or tools are welcome.
> All I need is to be able to dictate the order in which the attributes appear
> and whether or not empty elements should be written using the shortcut
> ('<a/>') form.

sounds like you need a custom XML writer.

a quick solution is to take a copy of the writexml() method from the
minidom's Element class and make it into a function (i.e. operate on
element nodes instead of self, change the recursive writexml method
call to a recursive function call, and use the _write_data from the
minidom module).

from xml.dom import minidom
from xml.dom import Node

def writexml(node, writer, indent="", addindent="", newl=""):

    if node.nodeType != Node.ELEMENT_NODE:
        # use standard serializer for everything but elements
        node.writexml(writer, indent, addindent, newl)
        return

    writer.write(indent+"<" + node.tagName)

    attrs = node._get_attributes()
    a_names = attrs.keys()
    a_names.sort()

    for a_name in a_names:
        writer.write(" %s=\"" % a_name)
        minidom._write_data(writer, attrs[a_name].value)
        writer.write("\"")
    if node.childNodes:
        writer.write(">%s"%(newl))
        for node in node.childNodes:
            writexml(node,writer,indent+addindent,addindent,newl)
        writer.write("%s</%s>%s" % (indent,node.tagName,newl))
    else:
        writer.write("/>%s"%(newl))

usage example:

    import sys

    node = minidom.parseString("<foo><bar/>hello</foo>")
    writexml(node, sys.stdout)

when this works, tweak the code (it's trivial) until it does exactly
what you want.

hope this helps!

</F>






More information about the XML-SIG mailing list