[Tutor] in a pickle about pickle

Alan Gauld alan.gauld at yahoo.co.uk
Thu Aug 13 13:01:40 EDT 2020


On 13/08/2020 13:29, nathan tech wrote:

> 1. does pickle.dump(obj, file, protocol) thread itself in any way?

I doubt it. If you need threading, that would be up to you. I would hope
that it is thread safe, but you should check.
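
If you do end up doing the threading yourself, something like the
following is a minimal sketch. The helper name save_in_background and
the module-level lock are my own assumptions for illustration, not
anything from your program; the lock just ensures two saves never write
the same file at the same time.

```python
import pickle
import threading

# Hypothetical lock: serialises writes so two saves cannot interleave.
_save_lock = threading.Lock()

def save_in_background(obj, path, protocol=2):
    """Pickle obj to path on a worker thread and return that thread."""
    def worker():
        with _save_lock:
            with open(path, "wb") as f:
                pickle.dump(obj, f, protocol)

    t = threading.Thread(target=worker)
    t.start()
    return t
```

The caller should join() the returned thread before the program exits,
and should not modify obj while the save is still running.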

> 2. When I do something like: pickle.dump(obj, file, 2) the file that is 
> put out is still, mostly, human readable. Is this normal? It talks about 
> bytes in the docs which I took to mean it would not be human readable.

Bytes are just 8-bit patterns. Most of the common text characters are
encoded as bytes, so if you can read the text you can read it as bytes
too. Some binary data will be written as a group of bytes that may or
may not be readable. But if you have significant amounts of text in your
data then it will be visible as such in your dump.
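
You can see this for yourself with a small experiment. The dictionary
below is made-up sample data; the point is only that the strings survive
intact inside the pickled bytes, surrounded by a few non-text opcode
bytes.

```python
import pickle

# Made-up sample data, purely for illustration.
data = {"title": "in a pickle about pickle", "count": 3}

raw = pickle.dumps(data, protocol=2)

print(type(raw))  # pickle.dumps returns a bytes object
print(raw)        # the text strings appear among the opcode bytes

# The string is stored as its UTF-8 bytes, so it is found verbatim.
print(b"in a pickle about pickle" in raw)
```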

> Related to this I also wanted to ask is there a way to see what function 
> is hanging a python process?

Probably, by using a debugger to attach to the process.
But beware: hanging does not mean going slow, it means stopped without
terminating. Very different.
If the code is just taking a while to finish then probably
the answer is no.
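
As an alternative to attaching a debugger (so this is a different
technique from the one above), the standard library's faulthandler
module can print the current stack of every thread, which at least
shows which function each thread is sitting in. A minimal sketch, where
the helper name current_stacks is my own invention:

```python
import faulthandler
import tempfile

def current_stacks():
    """Return the traceback of every running thread as text."""
    # faulthandler writes straight to a file descriptor, so it needs a
    # real file rather than an in-memory buffer.
    with tempfile.TemporaryFile(mode="w+b") as f:
        faulthandler.dump_traceback(file=f, all_threads=True)
        f.seek(0)
        return f.read().decode("utf-8", "replace")

print(current_stacks())
```

For a process that only stalls occasionally,
faulthandler.dump_traceback_later(timeout) can be called up front so
that the stacks are dumped automatically if the program is still
running after the timeout.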

> As part of this program it saves data to files using pickle, because 
> they're dictionaries. I could use xml2dict and dict2xml, but cPickle is 
> faster (I think).

Surely shelves would be even better than pickle since they are
effectively dictionaries in a file? In fact you could possibly do away
with the dictionaries and access the shelve directly instead...
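
Here is a minimal sketch of what that looks like. The file location and
the keys are invented for illustration; a shelf behaves like a
dictionary with string keys, and each value is pickled separately when
it is assigned.

```python
import os
import shelve
import tempfile

# Hypothetical location; any writable path will do.
path = os.path.join(tempfile.mkdtemp(), "settings")

# Writing: only the entries you assign are pickled and stored.
with shelve.open(path) as db:
    db["window"] = {"width": 800, "height": 600}
    db["recent"] = ["a.txt", "b.txt"]

# Reading back later: individual entries are loaded on demand.
with shelve.open(path) as db:
    print(db["window"]["width"])
    print(sorted(db.keys()))
```

One thing to watch: changing a nested value in place (for example
db["window"]["width"] = 1024) is not saved unless you reassign the whole
entry or open the shelf with writeback=True.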


> It executes the saving when you click a button. This occurs, the python 
> program freezes for a second, then goes back to the GUI. That's fine, 
> expected behaviour.
> 
> If I then click exit it returns to the console (expected), and then 
> sometimes will just sit there for about 10 to 15 minutes before finally 
> finishing.

That's a very long time to write a file. Are they many GB in size?
If not, something else is going on, possibly a race condition
or deadlock within your code. Do you have a lot of complex relationships
between your objects, for example?
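
One quick way to rule pickle in or out is to time the dump itself and
look at the size of the file it produces. This is only a rough sketch;
the helper name timed_dump is hypothetical, and you would call it in
place of your existing save code.

```python
import os
import pickle
import time

def timed_dump(obj, path, protocol=2):
    """Pickle obj to path; return (seconds taken, file size in bytes)."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol)
    elapsed = time.perf_counter() - start
    return elapsed, os.path.getsize(path)
```

If this reports a second or two and a modest file size, the long wait
at exit is coming from somewhere other than the pickling.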

> Is this an issue that comes up often with pickle?

Nope, certainly not minutes, maybe a second or two.

Are you sure you shouldn't be using a database instead?

> I work mostly with python dictionaries when dumping to files, which is 
> why I figured pickle would be the best option.

Have you looked at shelve? That potentially avoids the need to write a
huge amount of data out at once.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



