pickle.load() extremely slow performance
Jim Garrison
jhg at acm.org
Fri Mar 20 20:26:22 EDT 2009
John Machin wrote:
> On Mar 21, 9:25 am, Jim Garrison <j... at acm.org> wrote:
>> I'm converting a Perl system to Python, and have run into a severe
>> performance problem with pickle.
>>
>> One facet of the system involves scanning and loading into memory a
>> couple of parallel directory trees containing on the order of 10^4 files. The
>> trees don't change during development/testing and the scan takes 30-40
>> seconds, so to save time I cache the loaded tree structure to disk, in
>> Perl with module Storable, and in Python with pickle.
>>
>> In Perl, the save operation produces a file of about 3MB, and both
>> save and restore take a second or two. In Python, pickle.dump()
>> produces a similar-size file but takes 20 seconds, and pickle.load()
>> takes 45 seconds, which is actually LONGER than the time required to
>> scan the directory trees.
>>
>> Is there anything I can do to speed up pickle.load() to get
>> performance comparable to Perl's Storable?
>
> Have you read this:
> http://www.python.org/doc/2.6/library/pickle.html
> ?
> Have you considered using cPickle instead of pickle?
> Have you considered using *ickle.dump(..., protocol=-1) ?
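
For reference, the protocol suggestion can be tried with a rough sketch like
the one below. The data is a made-up placeholder standing in for my cached
tree, and roundtrip() is just a hypothetical helper name; protocol=-1 asks
pickle for the highest protocol it supports.

```python
import pickle
import time

# Placeholder data standing in for the cached directory tree (an assumption,
# not the real structure).
tree = {"dir%d" % i: ["file%d" % j for j in range(100)] for i in range(1000)}

def roundtrip(obj, protocol):
    """Pickle and unpickle obj in memory; return (size, seconds, copy)."""
    start = time.perf_counter()
    blob = pickle.dumps(obj, protocol=protocol)
    restored = pickle.loads(blob)
    elapsed = time.perf_counter() - start
    return len(blob), elapsed, restored

# Compare the default text protocol 0 with the highest available one (-1).
for protocol in (0, -1):
    size, elapsed, _ = roundtrip(tree, protocol)
    print("protocol %2d: %d bytes, %.3f s" % (protocol, size, elapsed))
```

The timings will depend on the machine and on the shape of the data, so this
only shows how to compare protocols, not what numbers to expect.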
I'm using Python 3 on Windows (Server 2003). According to the docs
"The pickle module has an transparent optimizer (_pickle) written
in C. It is used whenever available. Otherwise the pure Python
implementation is used."
How can I tell if _pickle is being used?
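
One check I can think of, as a sketch that assumes CPython and that the
docs' description is accurate: if the C module was imported, pickle.Pickler
should be the C type and report '_pickle' as its module, while the pure
Python class would report 'pickle'. The helper name using_c_pickle() is my
own.

```python
import pickle

def using_c_pickle():
    """Return True if pickle.Pickler comes from the C accelerator module."""
    # With the accelerator loaded, pickle.Pickler is the type defined in
    # _pickle; otherwise it is the class defined in pickle.py itself.
    return pickle.Pickler.__module__ == "_pickle"

print("C accelerator in use:", using_c_pickle())
```

If this prints False, the pure Python implementation is in use, which could
explain the load times I'm seeing.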