profiling and performance of shelves
Eric S. Johansson
esj at harvee.org
Fri Jun 25 08:55:51 EDT 2004
Eric S. Johansson wrote:
> was profiling some of my code trying to figure out just why it was
> running slow and I discovered that shelves containing dictionaries were
> significantly slower than those containing simple tuples. for example,
I simplified the code further, and the results suggest that using a shelf is not a good idea for certain data structures. It's really pretty slow, slow enough that I'm considering writing my own data-dependent system for preserving information.
What's interesting is that there is an 8-to-1 difference in performance between saving values and retrieving values (retrieving is 8 times slower). I don't know enough yet to know whether this is real or an artifact of my test.
Here's the code:
#!/usr/bin/python

import sys
import shelve

def main():
    # spamtrap_cache = camram_utils.message_cache( configuration_data = config_data)
    dictionary_test = shelve.open("slowpoke")

    cnt = 0
    x = 0
    # write phase: store 5000 small dictionaries under string keys
    while x < 5000:
        dictionary_test[str(x)] = {
            "x1": "const string1",
            "x2": "const string2",
            "x3": "const string3",
            "x4": "const string4",
            "x5": "const string5",
            "x6": "const string6",
            "x7": "const string7",
            }
        x = x + 1

    # read phase: fetch every stored value back
    for i in dictionary_test.keys():
        token = dictionary_test[i]
        cnt = cnt + 1

    print cnt
    print token

# profile main() and dump the raw statistics to /tmp/speed
import profile
profile.run('main()', '/tmp/speed')

# load the statistics and print the per-function and callee tables
import pstats
stats = pstats.Stats('/tmp/speed')
stats.strip_dirs().sort_stats().print_stats()
stats.print_callees()
------ here are the results from the test run
[root at redweb esj]# python speed2.py
{'x2': 'const string2', 'x3': 'const string3', 'x1': 'const string1', 'x6': 'const string6', 'x7': 'const string7', 'x4': 'const string4', 'x5': 'const string5'}
Fri Jun 25 08:35:21 2004 /tmp/speed
Fri Jun 25 08:35:21 2004 /tmp/speed
85032 function calls in 9.830 CPU seconds
Random listing order was used
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.010    0.010 anydbm.py:43(?)
        1    0.000    0.000    0.010    0.010 shelve.py:146(__init__)
        1    0.430    0.430    9.830    9.830 speed2.py:6(main)
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.000    0.000 shelve.py:82(close)
        1    0.000    0.000    0.000    0.000 shelve.py:89(__del__)
        1    0.000    0.000    0.000    0.000 whichdb.py:5(whichdb)
     5001    7.640    0.002    8.360    0.002 shelve.py:69(__getitem__)
        1    0.000    0.000    0.000    0.000 anydbm.py:69(open)
        1    0.000    0.000    0.010    0.010 shelve.py:151(open)
        1    0.000    0.000    0.000    0.000 dumbdbm.py:33(_Database)
        1    0.000    0.000    0.000    0.000 anydbm.py:46(error)
        1    0.010    0.010    0.010    0.010 dbhash.py:1(?)
        1    0.000    0.000    0.000    0.000 dumbdbm.py:22(?)
        1    0.000    0.000    0.000    0.000 whichdb.py:1(?)
        1    0.000    0.000    0.000    0.000 shelve.py:52(__init__)
        1    0.040    0.040    0.040    0.040 shelve.py:55(keys)
     5000    0.990    0.000    0.990    0.000 shelve.py:73(__setitem__)
        1    0.000    0.000    9.830    9.830 <string>:1(?)
    75014    0.720    0.000    0.720    0.000 <string>:0(?)
        1    0.000    0.000    9.830    9.830 profile:0(main())
Random listing order was used
Function                   called...
anydbm.py:43(?)            anydbm.py:46(error)(1)    0.000
                           dbhash.py:1(?)(1)    0.010
                           dumbdbm.py:22(?)(1)    0.000
shelve.py:146(__init__)    anydbm.py:43(?)(1)    0.010
                           anydbm.py:69(open)(1)    0.000
                           shelve.py:52(__init__)(1)    0.000
speed2.py:6(main)          shelve.py:55(keys)(1)    0.040
                           shelve.py:69(__getitem__)(5001)    8.360
                           shelve.py:73(__setitem__)(5000)    0.990
                           shelve.py:151(open)(1)    0.010
profile:0(profiler)        profile:0(main())(1)    9.830
shelve.py:82(close)        --
shelve.py:89(__del__)      shelve.py:82(close)(1)    0.000
whichdb.py:5(whichdb)      --
shelve.py:69(__getitem__)  <string>:0(?)(75014)    0.720
anydbm.py:69(open)         whichdb.py:1(?)(1)    0.000
                           whichdb.py:5(whichdb)(1)    0.000
shelve.py:151(open)        shelve.py:146(__init__)(1)    0.010
dumbdbm.py:33(_Database)   --
anydbm.py:46(error)        --
dbhash.py:1(?)             --
dumbdbm.py:22(?)           dumbdbm.py:33(_Database)(1)    0.000
whichdb.py:1(?)            --
shelve.py:52(__init__)     --
shelve.py:55(keys)         --
shelve.py:73(__setitem__)  --
<string>:1(?)              shelve.py:89(__del__)(1)    0.000
                           speed2.py:6(main)(1)    9.830
<string>:0(?)              --
profile:0(main())          <string>:1(?)(1)    9.830
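
One way to check whether the 8-to-1 gap is real or an artifact of the test: in the callee listing, shelve.py:69(__getitem__) makes 75014 calls into <string>:0(?) while __setitem__ shows no callees at all, so the per-call overhead of the profile module may land mostly on the read side. Timing the two phases with a plain clock and no profiler should show how much of the gap remains. The following is only a sketch under assumptions: it is written for Python 3 (where shelve sits on dbm instead of anydbm), it measures wall-clock time with time.perf_counter rather than CPU seconds, the helper name time_shelf is made up for illustration, and the tuple case is my stand-in for the "simple tuples" mentioned in the quoted message.

```python
import os
import shelve
import tempfile
import time


def time_shelf(value, count=5000):
    """Return (write_seconds, read_seconds) for `count` copies of `value`.

    time_shelf is a hypothetical helper, not part of shelve.
    """
    # Use a throwaway directory so earlier runs cannot leave stale entries.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "slowpoke")
        with shelve.open(path) as shelf:
            # Write phase: store the same value under `count` string keys.
            start = time.perf_counter()
            for x in range(count):
                shelf[str(x)] = value
            write_seconds = time.perf_counter() - start

            # Read phase: fetch every stored value back by key.
            start = time.perf_counter()
            for key in shelf.keys():
                token = shelf[key]
            read_seconds = time.perf_counter() - start
    return write_seconds, read_seconds


if __name__ == "__main__":
    # Same seven-entry dictionary as in the original test.
    dict_value = {"x%d" % i: "const string%d" % i for i in range(1, 8)}
    # Assumed comparison case: the same strings stored as a flat tuple.
    tuple_value = tuple(dict_value.values())
    for label, value in (("dict", dict_value), ("tuple", tuple_value)):
        w, r = time_shelf(value)
        print("%-5s write %.3fs  read %.3fs" % (label, w, r))
```

If the read and write times come out close to each other here, the profiler is the likely culprit. If reads stay several times slower, the cost is probably in unpickling or in the underlying database module, and the numbers will depend on which dbm backend the interpreter picks.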