Pickle vs XML for file I/O

Tue Aug 1 05:43:45 EDT 2006

I've recently gone through a similar evaluation of my options for
persisting data. Object serialization to pickles or XML is a very easy,
quick way of persisting data but it does have drawbacks. I'm not a
professional developer, so if there are errors in my analysis, I'd love
to be corrected.

Suppose you make some changes to the object format - adding some new
attributes or properties. Suddenly your existing test data is useless.
At least with XML you can edit the files by hand to add dummy data, but
with pickles that's not a practical option, and even with XML it's a
painful and error-prone process.
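
A minimal sketch of the problem, assuming a made-up Contact class that
gains an email attribute after some data has already been pickled.
Unpickling restores the saved attributes without calling __init__, so
the old object comes back without the new attribute:

```python
import pickle


class Contact:
    """First version of a hypothetical data class."""

    def __init__(self, name):
        self.name = name


# Data saved while the first version of the class was in use.
old_bytes = pickle.dumps(Contact("Ada"))


class Contact:
    """Later version of the same class, with a new attribute."""

    def __init__(self, name, email="unknown"):
        self.name = name
        self.email = email


# Loading restores the stored attributes directly; __init__ is not run,
# so the default for the new attribute is never applied.
loaded = pickle.loads(old_bytes)
print(loaded.name)
print(hasattr(loaded, "email"))
```

Any code that then reads loaded.email fails with an AttributeError,
which is how previously saved test data stops being usable.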

Ideally you need to be able to browse and edit your saved data outside
the main program, to scan for errors, fix them manually and easily
update your data structure as the application data model grows and
changes.

There are good reasons why relational databases are the default data
store for many professional applications: you can inspect and edit the
data very easily using external tools. Personally I'd go with SQLite.
It's soon to be a part of the Python standard library with 2.5 and is
very compact. It can be a lot more work than just serializing
automatically, but there are toolkits such as SQLObject and SQLAlchemy
that can automate this as well.
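
As a rough sketch of what this looks like with the sqlite3 module, the
example below grows a table by one column without invalidating the rows
already stored. The contact table, its columns and the sample data are
invented for illustration, and an in-memory database is used so nothing
is written to disk:

```python
import sqlite3

# ":memory:" gives a throwaway database; a real program would pass a filename.
conn = sqlite3.connect(":memory:")

# Original data model: contacts only have a name.
conn.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO contact (name) VALUES (?)", [("Ada",), ("Grace",)])

# The data model grows: add a column with a default for existing rows.
conn.execute("ALTER TABLE contact ADD COLUMN email TEXT DEFAULT 'unknown'")

# Individual records can then be corrected with plain SQL.
conn.execute(
    "UPDATE contact SET email = ? WHERE name = ?",
    ("ada@example.org", "Ada"),
)
conn.commit()

rows = conn.execute("SELECT name, email FROM contact ORDER BY id").fetchall()
print(rows)
conn.close()
```

With a file-based database the same statements could be run from an
external SQLite tool, so the data can be inspected and fixed without
starting the main program.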

Best regards,

Simon Hibbs