stable object serialization to text file

Terry Reedy tjreedy at udel.edu
Thu Jan 12 15:15:54 EST 2012


On 1/12/2012 7:24 AM, Peter Otten wrote:
> Máté Koch wrote:
>
>> I'm developing an app which stores the data in file system database. The
>> data in my case consists of large python objects, mostly dicts, containing
>> texts and numbers. The easiest way to dump and load them would be pickle,
>> but I have a problem with it: I want to keep the data in version control,
>> and I would like to use it as efficiently as possible. Is it possible to
>> force pickle to store the otherwise unordered (e.g. dictionary) data in a
>> kind of ordered way, so that if I dump a large dict, then change 1 tiny
>> thing in it and dump again, the diff of the former and the new file will
>> be minimal?
>>
>> If pickle is not the best choice for me, can you suggest anything else?
>> (If there isn't any solution for it so far, I will write the module of
>> course, but first I'd like to look around and make sure it hasn't been
>> created yet.)
>
> Have you considered json?
>
> http://docs.python.org/library/json.html
>
> The encoder features a sort_keys flag which might help.

If that does not do it for you, consider that a dict is a two-column
table, with arbitrary structures in each column. Convert it to a list with
sorted(somedict.items()). This is basically what json should do. Then
write to a text stream, one line per (key, value) pair. Whether you put the
text into an OS file in a directory (a hierarchical database ;-) or a
text field in another database is up to you. Either way, diffs are easy.
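
A minimal sketch of the one-line-per-pair idea, not a definitive
implementation. The function names (to_lines, from_lines, dump_dict,
load_dict) are made up for illustration. It assumes the top-level keys are
all strings (so they sort against each other and survive the round trip)
and that the values are JSON-serializable; tuples would come back as lists.
Each line is encoded with json.dumps(..., sort_keys=True) so that nested
dicts are written in a stable order too.

```python
import json


def to_lines(somedict):
    """Return one JSON-encoded [key, value] line per item, sorted by key."""
    return [
        # sort_keys=True also orders any dicts nested inside the value.
        json.dumps([key, value], sort_keys=True, ensure_ascii=False)
        for key, value in sorted(somedict.items())
    ]


def from_lines(lines):
    """Rebuild the dict from lines produced by to_lines()."""
    return dict(json.loads(line) for line in lines if line.strip())


def dump_dict(somedict, path):
    """Write the dict to a text file, one key/value pair per line."""
    with open(path, "w", encoding="utf-8") as f:
        for line in to_lines(somedict):
            f.write(line + "\n")


def load_dict(path):
    """Read back a file written by dump_dict()."""
    with open(path, encoding="utf-8") as f:
        return from_lines(f)
```

With this layout, changing one value changes exactly one line of the file,
so a line-based diff in version control shows only that pair.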

-- 
Terry Jan Reedy




