Problem round-tripping with xml.dom.minidom pretty-printer

Ben Butler-Cole ben.butlercole at gmail.com
Fri Feb 29 11:12:47 EST 2008


Hello

I have run into a problem using minidom. I have an HTML file that I
want to make occasional, automated changes to (adding new links). My
strategy is to parse it with minidom, add a node, pretty-print it and
write it back to disk.

However, every time I do a round trip, minidom's pretty-printer puts
extra blank lines around every element, so my file grows without
limit. Calling normalize() on the document makes no difference.
Obviously I can avoid the problem by skipping the pretty-printing, but
I would rather not produce HTML that isn't human-readable.

Here is some code that shows the behaviour:

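    import xml.dom.minidom as dom

    def p(t, rounds=3):
        # Parse, pretty-print and feed the output back in a few times;
        # each pass prints a longer document than the one before.
        for _ in range(rounds):
            d = dom.parseString(t)
            d.normalize()  # makes no difference to the growth
            t = d.toprettyxml()
            print(t)

    p('<a><b><c/></b></a>')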
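
One stopgap I can sketch, on the assumption that the extra lines come from the whitespace-only text nodes left behind by the previous pretty-print, is to strip those nodes before printing again. The names strip_ws and roundtrip are my own helpers, not part of minidom, and this is only a rough sketch:

```python
import xml.dom.minidom as dom

def strip_ws(node):
    # Hypothetical helper: drop text nodes that contain only whitespace.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_ws(child)

def roundtrip(t):
    # Parse, remove the indentation left by an earlier pretty-print,
    # then pretty-print again.
    d = dom.parseString(t)
    strip_ws(d)
    return d.toprettyxml()

once = roundtrip('<a><b><c/></b></a>')
twice = roundtrip(once)
print(once == twice)
```

With this the element-only example above seems to stay stable across round trips. It does not help where text and elements are mixed, since whitespace is still added around non-empty text, and it would also throw away whitespace that matters in real HTML.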

Does anyone know how to fix this behaviour? If not, can anyone
recommend an alternative XML tool for simple tasks like this?

Thanks
Ben


