cPickle.load vs. file.read+cPickle.loads on large binary files

andrea.gavana at gmail.com
Tue Nov 17 10:31:50 EST 2015


Hi Chris,

On Tuesday, November 17, 2015 at 4:20:34 PM UTC+1, Chris Angelico wrote:
> On Wed, Nov 18, 2015 at 1:20 AM,  Andrea Gavana wrote:
> > Thank you for your answer. I do get similar timings when I swap the two functions, and specifically still 15 seconds to read the file via file.read() and 2.4 seconds (more or less as before) via cPickle.load(fid).
> >
> > I thought that the order of operations might be an issue but apparently that was not the case...
> 
> What if you call one of them twice and then the other? Just trying to
> rule out any possibility that it's a caching problem.
> 
> On my Linux box, running 2.7.9 64-bit, the two operations take roughly
> the same amount of time (1.8 seconds for load vs 1s to read and 0.8 to
> loads). Are you able to run this off a RAM disk or something?
> 
> Most curious.


Thank you for taking the time to run my little script. I have now run it with several combinations of calls (the first function twice and then the second, then vice versa, then alternating between the two several times, then the second three times and the first once, ...), with no luck at all.

The file.read() line always takes at least 14 seconds (in every trial I have run), while the cPickle.load call takes between 2.3 and 2.5 seconds.
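
For reference, the comparison boils down to something like the sketch below. It is not my exact script: the sample file, its contents, and the helper names (time_load, time_read_loads, make_sample) are made up for illustration, and the import fallback is only there so the same code also runs on Python 3.

```python
import os
import tempfile
import time

try:
    import cPickle as pickle  # Python 2, as used in this thread
except ImportError:
    import pickle  # Python 3 fallback (assumption, not part of the original script)


def time_load(path):
    """Time pickle.load() reading directly from the open file object."""
    start = time.time()
    with open(path, 'rb') as fid:
        obj = pickle.load(fid)
    return time.time() - start, obj


def time_read_loads(path):
    """Time file.read() and pickle.loads() as two separate steps."""
    start = time.time()
    with open(path, 'rb') as fid:
        data = fid.read()
    read_time = time.time() - start
    start = time.time()
    obj = pickle.loads(data)
    return read_time, time.time() - start, obj


def make_sample(path, n=100000):
    """Write a small stand-in pickle (hypothetical data; the real file is far larger)."""
    with open(path, 'wb') as fid:
        pickle.dump(list(range(n)), fid, pickle.HIGHEST_PROTOCOL)


if __name__ == '__main__':
    path = os.path.join(tempfile.mkdtemp(), 'sample.pkl')
    make_sample(path)
    # Alternate the two approaches to rule out ordering and caching effects.
    for _ in range(3):
        t_load, _obj = time_load(path)
        t_read, t_loads, _obj = time_read_loads(path)
        print('load: %.3fs  read: %.3fs  loads: %.3fs' % (t_load, t_read, t_loads))
```

Timing the read and the loads steps separately is what shows that the time goes into file.read() itself and not into unpickling.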

I am puzzled to no end... Might there be something funny with my C libraries that use fread? I'm just shooting in the dark. I have a standard Python installation on Windows, nothing fancy :-(
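
One way to probe the fread guess might be to read the same file in fixed-size chunks and compare that against a single read() call. A rough sketch, where the 1 MB chunk size and the function names (read_whole, read_chunked) are arbitrary choices for illustration:

```python
import time


def read_whole(path):
    """Read the entire file with one read() call and time it."""
    start = time.time()
    with open(path, 'rb') as fid:
        data = fid.read()
    return time.time() - start, data


def read_chunked(path, chunk_size=1024 * 1024):
    """Read the file in chunk_size pieces, join them, and time the whole thing."""
    start = time.time()
    chunks = []
    with open(path, 'rb') as fid:
        while True:
            chunk = fid.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            chunks.append(chunk)
    data = b''.join(chunks)
    return time.time() - start, data
```

If the chunked version turned out much faster, that would point at how one very large read is handled rather than at the disk, though this is only a guess.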

Andrea.


