Python script to optimize XML text
Gabriel Genellina
gagsl-py2 at yahoo.com.ar
Mon Sep 24 22:19:13 EDT 2007
En Mon, 24 Sep 2007 17:36:05 -0300, Robert Dailey <rcdailey at gmail.com>
escribi�:
> I'm currently seeking a python script that provides a way of optimizing
> out
> useless characters in an XML document to provide the optimal size for the
> file. For example, assume the following XML script:
>
> <root>
> <Test></Test>
> <!-- <CommentedOutElement/> -->
>
> <!-- Do Something Else -->
> </root>
>
> By running this through an XML optimizer, the file would appear as:
>
> <root><Test/></root>
ElementTree does that almost for free. I've just posted an example.
source = """<root>
<Test></Test>
<!-- <CommentedOutElement/> -->
<!-- Do Something Else -->
</root>"""
import xml.etree.ElementTree as ET
tree = ET.XML(source)
print ET.tostring(tree)
Output:
<root>
<Test />
</root>
If you still want to remove all whitespace:
def stripws(node):
if node.text:
node.text = node.text.strip()
if node.tail:
node.tail = node.tail.strip()
for child in node.getchildren():
stripws(child)
stripws(tree)
print ET.tostring(tree)
Output:
<root><Test /></root>
--
Gabriel Genellina
More information about the Python-list
mailing list