Python to XML to Python conversion

Jeremy Bowers newfroups at jerf.org
Thu Jul 11 23:01:51 EDT 2002


thehaas at binary.net wrote:
> I'd do the Python -> XML like this:
> 
> 	outfile = file("out.xml")
> 
> 	outfile.write("<pydict>")
> 	for key in dict.keys():
> 		outfile.write("<%s>%s</%s>\n" %(key, dict[key], key) )
> 
> 	outfile.write("</pydict>")
> 	outfile.close()
> 
> How's that??  Well-formed XML, without any DOM-overhead.

This is common and incorrect; the XML is not going to be well formed for 
any number of reasons. The keys of the dict are not required to be valid 
XML tag names (consider a key "1 2", wrong for starting with a number 
AND having a space in it). The keys of the dict may not be strings. The 
values of the dict may not be strings either. The values of the dict may 
contain any of several XML chars which must be encoded, such as &. 
Goodness help your XML parser if the text happens to include XML or XML 
fragments.

For each key in the dict, the odds become increasingly stacked against you.
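
To make those failure modes concrete, here is a minimal sketch, assuming a 
current Python 3 and the standard xml.dom.minidom parser. naive_dump() and 
is_well_formed() are hypothetical helper names of mine; naive_dump() just 
mirrors the string interpolation in the quoted snippet.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def naive_dump(d):
    """Build XML by plain string interpolation, as in the quoted snippet."""
    parts = ["<pydict>"]
    for key in d:
        parts.append("<%s>%s</%s>\n" % (key, d[key], key))
    parts.append("</pydict>")
    return "".join(parts)


def is_well_formed(text):
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        xml.dom.minidom.parseString(text)
        return True
    except ExpatError:
        return False


print(is_well_formed(naive_dump({"name": "spam"})))         # True
print(is_well_formed(naive_dump({"1 2": "spam"})))          # False: bad tag name
print(is_well_formed(naive_dump({"name": "eggs & spam"})))  # False: bare &
print(is_well_formed(naive_dump({"name": "<b>"})))          # False: stray markup
```

Only the first dict survives, and only because its key and value happen to 
be harmless.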

If you __know__ you have string keys and string vals, you can do 
something like

from xml.sax.saxutils import quoteattr

...
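	# quoteattr() escapes the text and supplies the surrounding quotes;
	# the "/>" closes the empty element so the output stays well formed
	outfile.write('<item name=%s value=%s/>\n' % (quoteattr(key),
		quoteattr(dict[key])))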
...

(untested)

but it is still better to go with the XML marshaler or standard Pickle 
module if at all possible.
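
For what it's worth, the quoteattr idea above can be sketched end to end 
like this, assuming a current Python 3 and string keys and values only. 
dump_items() and load_items() are hypothetical names, and reading the file 
back with minidom is my choice for the sketch, not something the approach 
requires.

```python
import io
import xml.dom.minidom
from xml.sax.saxutils import quoteattr


def dump_items(d, outfile):
    """Write a str -> str dict as <item name="..." value="..."/> elements."""
    outfile.write("<pydict>\n")
    for key, value in d.items():
        # quoteattr() escapes &, < and > and picks a safe quote character.
        outfile.write("<item name=%s value=%s/>\n"
                      % (quoteattr(key), quoteattr(value)))
    outfile.write("</pydict>\n")


def load_items(text):
    """Parse the XML written by dump_items() back into a dict."""
    doc = xml.dom.minidom.parseString(text)
    return {
        item.getAttribute("name"): item.getAttribute("value")
        for item in doc.getElementsByTagName("item")
    }


data = {"1 2": "eggs & <spam>", "quote": 'say "hi"'}
buf = io.StringIO()
dump_items(data, buf)
assert load_items(buf.getvalue()) == data
```

Because the keys live in attribute values rather than tag names, a key like 
"1 2" is no longer a problem. Characters that XML 1.0 forbids outright (most 
control characters) would still break this.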

Also, part of being a good programmer is learning how to elicit good 
requirements. Do you understand why you need XML? XML is a good transfer 
language between programs and language boundaries. If you just need to 
save some data for the same program to retrieve later, you actively 
*don't* want XML. Use pickle. (Or 'shelve', which I like for quick 
projects.) If you *are* going to transfer this data to another program, 
then what do those other programs take naturally? If they have a native 
format and you can match it, you can save yourself that much trouble.
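
As a rough illustration of the "same program saves and reloads" case, here 
is a minimal sketch using pickle and shelve, assuming a current Python 3. 
The file names and the sample dict are made up for the example.

```python
import os
import pickle
import shelve
import tempfile

# Non-string keys and nested values: exactly what the hand-rolled XML chokes on.
data = {1: [1, 2, 3], ("a", "b"): {"nested": None}, "name": "eggs & <spam>"}

with tempfile.TemporaryDirectory() as tmpdir:
    # pickle: dump the whole object to a file and load it back.
    path = os.path.join(tmpdir, "data.pickle")
    with open(path, "wb") as f:
        pickle.dump(data, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

    # shelve: a persistent dict-like object; its own keys must be strings,
    # but the stored values can be any picklable object.
    shelf_path = os.path.join(tmpdir, "data.shelf")
    with shelve.open(shelf_path) as shelf:
        shelf["settings"] = data
    with shelve.open(shelf_path) as shelf:
        from_shelf = shelf["settings"]

assert restored == data
assert from_shelf == data
```

No escaping, no tag names, and the types come back as they went in.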

Understand the motivation. If XML is being used as a bullet point, you 
may consider politely suggesting better, cheaper, faster, 
faster-to-*develop* alternatives (cPickle). Failing that, and if you 
never intend to transfer the data anywhere, use the XML marshaler 
for buzzword compliance with pickle-like ease of use.

(Thought: XML should never be your *first* choice of file format. It is 
the choice of *last* resort, when you absolutely *need* easy parsing in 
multiple languages or environments and can't get it any other way. It is 
then a much better choice than other formats, but only under those 
limited, albeit extremely popular, conditions.)



