Python script to optimize XML text

Robert Dailey rcdailey at gmail.com
Mon Sep 24 16:36:05 EDT 2007


Hi,

I'm looking for a Python script that strips unnecessary characters from an
XML document in order to minimize the size of the file. For example, take
the following XML document:

<root>
    <Test></Test>
    <!-- <CommentedOutElement/> -->

    <!-- Do Something Else -->
</root>

After running this through an XML optimizer, the file would look like this:

<root><Test/></root>

Note that the following changes were made:
- All comments are stripped from the XML
- All spaces, tabs, carriage returns, and other forms of insignificant
whitespace are removed
- Elements that contain no text or children and are written in the form
<Empty></Empty> use the shorthand empty-element form instead:
<Empty/>

Does anyone know of a tool or Python script that can perform optimizations
like those described above? I realize I could probably do this with regular
expressions in Python, but I was hoping someone had already done this work.
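
To make the desired behaviour concrete, here is a rough sketch using only
the standard library's xml.dom.minidom rather than regular expressions. The
names minify_xml and _strip_insignificant are made up for illustration, and
the sketch assumes that every whitespace-only text node is insignificant,
which would not hold for mixed content or for xml:space="preserve".

```python
from xml.dom import minidom, Node


def _strip_insignificant(node):
    """Recursively remove comments and whitespace-only text nodes."""
    # Iterate over a copy, since children are removed during the walk.
    for child in list(node.childNodes):
        if child.nodeType == Node.COMMENT_NODE:
            node.removeChild(child)
        elif child.nodeType == Node.TEXT_NODE and not child.data.strip():
            # Assumption: whitespace-only text carries no meaning.
            node.removeChild(child)
        else:
            _strip_insignificant(child)


def minify_xml(text):
    """Return the root element of `text` serialized without comments or
    indentation. Elements left with no children are written by minidom
    in the short form, e.g. <Test/>."""
    doc = minidom.parseString(text)
    _strip_insignificant(doc)
    # Serializing the root element (not the document) omits the
    # <?xml ...?> declaration that minidom would otherwise prepend.
    return doc.documentElement.toxml()


SAMPLE = """<root>
    <Test></Test>
    <!-- <CommentedOutElement/> -->

    <!-- Do Something Else -->
</root>"""

print(minify_xml(SAMPLE))  # <root><Test/></root>
```

Text that contains anything other than whitespace is left untouched, so
content such as <a> hello </a> keeps its surrounding spaces.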