minidom questions

xtian xtian at toysinabag.com
Thu Nov 20 04:27:45 EST 2003


Hi -

I'm doing some data conversion with minidom (turning a csv file into a
specific xml format), and I've hit a couple of small problems.

1: The output format has a header with some xml that looks something
like this:
<item xmlns="" xmlns:thing="http://www.blah.com">
    <thing:child name="smith"/>
</item>

As I understand it, this is a valid use of namespaces.
If I add this to the start of the document, when I do a .toxml(), I
get an exception. Here's a small example:

>>> from xml.dom import minidom
>>> s = """<item xmlns=""
xmlns:thing="http://www.blah.com"><thing:child
name="smith"/></item>"""
>>> doc = minidom.parseString(s)
>>> print doc.toxml()

Traceback (most recent call last):
  File "<pyshell#26>", line 1, in -toplevel-
    print doc.toxml()
  File "C:\PYTHON23\lib\xml\dom\minidom.py", line 47, in toxml
    return self.toprettyxml("", "", encoding)
  File "C:\PYTHON23\lib\xml\dom\minidom.py", line 59, in toprettyxml
    self.writexml(writer, "", indent, newl, encoding)
  File "C:\PYTHON23\lib\xml\dom\minidom.py", line 1746, in writexml
    node.writexml(writer, indent, addindent, newl)
  File "C:\PYTHON23\lib\xml\dom\minidom.py", line 811, in writexml
    _write_data(writer, attrs[a_name].value)
  File "C:\PYTHON23\lib\xml\dom\minidom.py", line 301, in _write_data
    data = data.replace("&", "&amp;").replace("<", "&lt;")
AttributeError: 'NoneType' object has no attribute 'replace'

After some debugging, I found that the xmlns attribute (is it really an
attribute?) has a value of None, rather than "".
I can work around this by replacing the implementation of
Element.writexml with one including:

value = attrs[a_name].value
if value is None:
    value = ""

Is this a bug? Am I doing something wrong?
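
A variation on the workaround that leaves Element.writexml alone is to normalise the tree before serialising it. The sketch below is only an illustration: blank_none_attributes is a made-up helper name, not part of minidom, and it assumes that setting an attribute's value to "" is acceptable for your output. Newer Python releases may not raise here at all, in which case the helper is harmless.

```python
from xml.dom import minidom


def blank_none_attributes(node):
    """Replace attribute values of None with "" on node and its descendants.

    Hypothetical helper for illustration; not part of minidom.
    """
    if node.nodeType == node.ELEMENT_NODE:
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            if attr.value is None:
                attr.value = ""
    for child in node.childNodes:
        blank_none_attributes(child)


s = ('<item xmlns="" xmlns:thing="http://www.blah.com">'
     '<thing:child name="smith"/></item>')
doc = minidom.parseString(s)
blank_none_attributes(doc)   # walk the whole document once
xml_text = doc.toxml()       # serialises without hitting a None value
```

Doing the fix-up on the tree keeps the change local to your own code, so nothing in the standard library has to be replaced.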

2: Formatting - I'd like the output xml not to put extra line breaks
inside elements that contain only text nodes (which is what
.toprettyxml does by default) - the tool that uses the xml treats the
line breaks as significant. The .toxml method works, but I'd like to
have the output be prettier than this (while not being as pretty as
the output of .toprettyxml :). I can see how to get what I want by
replacing Element.writexml with one that checks to see whether all the
childNodes are text. Is there a better way to do this?
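
One alternative to replacing Element.writexml is a small standalone serialiser, roughly along these lines. This is a sketch under assumptions: is_text_only and pretty are hypothetical names, whitespace-only text between elements is dropped, and mixed content is handled only crudely (each non-element child goes on its own indented line).

```python
from xml.dom import minidom
from xml.sax.saxutils import escape, quoteattr


def is_text_only(element):
    """True if the element has children and every one is a text node."""
    children = element.childNodes
    return bool(children) and all(
        c.nodeType == c.TEXT_NODE for c in children)


def pretty(element, indent="", addindent="  ", newl="\n"):
    """Indent nested elements, but keep text-only elements on one line."""
    attrs = element.attributes
    attr_text = "".join(
        " %s=%s" % (attrs.item(i).name, quoteattr(attrs.item(i).value or ""))
        for i in range(attrs.length))
    start = "%s<%s%s" % (indent, element.tagName, attr_text)
    if not element.childNodes:
        return start + "/>" + newl
    if is_text_only(element):
        # No line breaks are added around the text content.
        text = "".join(escape(c.data) for c in element.childNodes)
        return "%s>%s</%s>%s" % (start, text, element.tagName, newl)
    parts = [start + ">" + newl]
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            parts.append(pretty(child, indent + addindent, addindent, newl))
        elif child.nodeType == child.TEXT_NODE and not child.data.strip():
            continue  # drop whitespace-only text between elements
        else:
            parts.append(indent + addindent + child.toxml() + newl)
    parts.append("%s</%s>%s" % (indent, element.tagName, newl))
    return "".join(parts)


doc = minidom.parseString("<root><row><name>smith</name><note/></row></root>")
result = pretty(doc.documentElement)
```

Here result puts root and row on their own indented lines while <name>smith</name> stays on a single line, so the consuming tool sees no extra line breaks inside the text.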

Thanks,
xtian



