profiling and performance of shelves

Eric S. Johansson esj at harvee.org
Fri Jun 25 08:55:51 EDT 2004


Eric S. Johansson wrote:

> was profiling some of my code trying to figure out just why it was 
> running slow and I discovered that shelves containing dictionaries were 
> significantly slower than those containing simple tuples.  for example, 

I simplified the code further, and the results suggest that shelve is not a good fit for certain data structures: it's really pretty slow. Slow enough that I'm considering writing my own data-dependent system for preserving information.

What's interesting is that there is roughly an 8-to-1 difference in performance between saving values and retrieving values (retrieving is about 8 times slower). I don't know enough yet to know whether this is real or an artifact of my test.
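One possible way to check whether the ratio is an artifact is to time the two phases with a plain wall clock instead of the profiler, since the profiler adds overhead to every call it records. The following is only a rough sketch in modern Python 3, not the script used for the numbers in this post; the helper name time_shelf, the temporary directory, and the default of 500 entries are assumptions made for illustration.

```python
import os
import shelve
import tempfile
import time


def time_shelf(n=500):
    """Return (write_seconds, read_seconds, items_read) for n dict-valued entries."""
    # Same shape of value as in the test script: seven constant strings.
    record = {"x%d" % i: "const string%d" % i for i in range(1, 8)}

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "slowpoke")
        with shelve.open(path) as shelf:
            # Phase 1: store n dictionaries.
            start = time.perf_counter()
            for x in range(n):
                shelf[str(x)] = record
            write_seconds = time.perf_counter() - start

            # Phase 2: read every stored value back.
            start = time.perf_counter()
            items_read = 0
            for key in shelf.keys():
                token = shelf[key]
                items_read += 1
            read_seconds = time.perf_counter() - start

    return write_seconds, read_seconds, items_read


if __name__ == "__main__":
    w, r, count = time_shelf()
    print("write: %.4fs  read: %.4fs  items: %d" % (w, r, count))
```

If the read phase is still much slower than the write phase without the profiler attached, the difference is probably real; if the gap shrinks, a good part of it was likely measurement overhead.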

here's the code:

#!/usr/bin/python

import sys
import shelve

def main():
    dictionary_test = shelve.open("slowpoke")

    cnt = 0

    # write 5000 entries, each a small dictionary of constant strings
    for x in range(5000):
        dictionary_test[str(x)] = {
            "x1": "const string1",
            "x2": "const string2",
            "x3": "const string3",
            "x4": "const string4",
            "x5": "const string5",
            "x6": "const string6",
            "x7": "const string7",
        }

    # read every entry back, counting as we go
    for i in dictionary_test.keys():
        token = dictionary_test[i]
        cnt = cnt + 1

    print(cnt)
    print(token)
    
import profile
profile.run('main()', '/tmp/speed')

import pstats

stats = pstats.Stats('/tmp/speed')
stats.strip_dirs().sort_stats().print_stats()
stats.print_callees()
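
A side note on the listing order: sort_stats() is called above without a sort key, which is why pstats reports a random listing order. A minimal sketch of a sorted report follows, written for current Python 3. It swaps in cProfile for profile, and the names work and profile_sorted are made up for this illustration; they are not part of the test script.

```python
import cProfile
import io
import pstats


def work():
    """Stand-in workload; any callable could be profiled instead."""
    return sum(i * i for i in range(10000))


def profile_sorted(func, sort_key="cumulative", limit=5):
    """Profile func and return a text report sorted by sort_key."""
    profiler = cProfile.Profile()
    profiler.runcall(func)

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Passing a key to sort_stats() gives an ordered listing
    # instead of the random one.
    stats.strip_dirs().sort_stats(sort_key).print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_sorted(work))
```

Sorting by "cumulative" puts the functions that account for the most total time, including their callees, at the top, which is usually the view wanted when hunting for a slow call.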


------  here are the results from the test run
[root@redweb esj]# python speed2.py
{'x2': 'const string2', 'x3': 'const string3', 'x1': 'const string1', 'x6': 'const string6', 'x7': 'const string7', 'x4': 'const string4', 'x5': 'const string5'}
Fri Jun 25 08:35:21 2004    /tmp/speed

Fri Jun 25 08:35:21 2004    /tmp/speed

         85032 function calls in 9.830 CPU seconds

   Random listing order was used

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.010    0.010 anydbm.py:43(?)
        1    0.000    0.000    0.010    0.010 shelve.py:146(__init__)
        1    0.430    0.430    9.830    9.830 speed2.py:6(main)
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.000    0.000 shelve.py:82(close)
        1    0.000    0.000    0.000    0.000 shelve.py:89(__del__)
        1    0.000    0.000    0.000    0.000 whichdb.py:5(whichdb)
     5001    7.640    0.002    8.360    0.002 shelve.py:69(__getitem__)
        1    0.000    0.000    0.000    0.000 anydbm.py:69(open)
        1    0.000    0.000    0.010    0.010 shelve.py:151(open)
        1    0.000    0.000    0.000    0.000 dumbdbm.py:33(_Database)
        1    0.000    0.000    0.000    0.000 anydbm.py:46(error)
        1    0.010    0.010    0.010    0.010 dbhash.py:1(?)
        1    0.000    0.000    0.000    0.000 dumbdbm.py:22(?)
        1    0.000    0.000    0.000    0.000 whichdb.py:1(?)
        1    0.000    0.000    0.000    0.000 shelve.py:52(__init__)
        1    0.040    0.040    0.040    0.040 shelve.py:55(keys)
     5000    0.990    0.000    0.990    0.000 shelve.py:73(__setitem__)
        1    0.000    0.000    9.830    9.830 <string>:1(?)
    75014    0.720    0.000    0.720    0.000 <string>:0(?)
        1    0.000    0.000    9.830    9.830 profile:0(main())


   Random listing order was used

Function                   called...
anydbm.py:43(?)             anydbm.py:46(error)(1)    0.000
                            dbhash.py:1(?)(1)    0.010
                            dumbdbm.py:22(?)(1)    0.000
shelve.py:146(__init__)     anydbm.py:43(?)(1)    0.010
                            anydbm.py:69(open)(1)    0.000
                            shelve.py:52(__init__)(1)    0.000
speed2.py:6(main)           shelve.py:55(keys)(1)    0.040
                            shelve.py:69(__getitem__)(5001)    8.360
                            shelve.py:73(__setitem__)(5000)    0.990
                            shelve.py:151(open)(1)    0.010
profile:0(profiler)         profile:0(main())(1)    9.830
shelve.py:82(close)         --
shelve.py:89(__del__)       shelve.py:82(close)(1)    0.000
whichdb.py:5(whichdb)       --
shelve.py:69(__getitem__)   <string>:0(?)(75014)    0.720
anydbm.py:69(open)          whichdb.py:1(?)(1)    0.000
                            whichdb.py:5(whichdb)(1)    0.000
shelve.py:151(open)         shelve.py:146(__init__)(1)    0.010
dumbdbm.py:33(_Database)    --
anydbm.py:46(error)         --
dbhash.py:1(?)              --
dumbdbm.py:22(?)            dumbdbm.py:33(_Database)(1)    0.000
whichdb.py:1(?)             --
shelve.py:52(__init__)      --
shelve.py:55(keys)          --
shelve.py:73(__setitem__)   --
<string>:1(?)               shelve.py:89(__del__)(1)    0.000
                            speed2.py:6(main)(1)    9.830
<string>:0(?)               --
profile:0(main())           <string>:1(?)(1)    9.830
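
For reference, the 8-to-1 figure can be recomputed from the cumulative times in the listing above. This is just arithmetic on the reported numbers, written as a short Python 3 snippet; the variable names are mine, and it says nothing about where the time actually goes inside __getitem__.

```python
# Figures copied from the profile listing above.
getitem_calls, getitem_cumtime = 5001, 8.360   # shelve.py:69(__getitem__)
setitem_calls, setitem_cumtime = 5000, 0.990   # shelve.py:73(__setitem__)
subcalls_from_getitem = 75014                  # <string>:0(?) called by __getitem__

# Read time relative to write time.
ratio = getitem_cumtime / setitem_cumtime

# Average cost per call, in milliseconds.
per_get_ms = 1000.0 * getitem_cumtime / getitem_calls
per_set_ms = 1000.0 * setitem_cumtime / setitem_calls

# How many profiled sub-calls each read triggers on average.
subcalls_per_get = subcalls_from_getitem / getitem_calls

print("read/write ratio: %.2f" % ratio)
print("per read: %.3f ms, per write: %.3f ms" % (per_get_ms, per_set_ms))
print("sub-calls per read: %.1f" % subcalls_per_get)
```

Each read makes about 15 recorded sub-calls while each write makes none, so the profiler's per-call overhead lands much more heavily on the read path. That may account for some of the gap, though this arithmetic alone can't say how much.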



    




