[Python-Dev] Simple dicts

damien morton dmorton@bitfurnace.com
Mon, 19 May 2003 04:32:32 -0400


Well, I implemented the chained dict, and against stock 2.3b1 I am
seeing about a 4-5% speedup on pystone, and about a 10-15% speedup on a
simplistic largedict benchmark which inserts 200K strings, retrieves
them once, and then removes them one at a time. Suggestions for more
appropriate benchmarks are, as usual, always welcome. (raymond - if you
have a suite of benchmarks specifically for dicts, I would love to have
access to them). Move-to-front chains are implemented, and a benchmark
that excerised skewed access patterns would be great.

Memory usage is more than the current implementation, but is highly
tunable. You can adjust the ratio of dictentry structs to first
pointers. My simple largedict benchmark performed best with an 8:16
ratio, while the pystone benchmark performed best with a 6:16 ratio.
Moving up to a 100:16 ratio on the largedict benchmark adversly affected
performance by about 10%. It may pay off to schedule the sparsity ratio
according to the size of the dict. Also, because performance and memory
usage varies roughly linearly with sparsity, it may be a less dangerous
candidate for being user settable. Where fail-fast characteristics are
required, sparsity may be highly desireable. 

I need to address Tim's concerns about the poor hash function used for
python integers, but I think this can be addressed easily enough. I
would welcome some guidance about what hash functions need to be
addressed though. Is it just integers? (theres a great article on
integer hash functions at www.cris.com/~Ttwang/tech/inthash.htm)

If anyone wants to try out the code, please download
www.bitfurnace.com/python/dict.zip

Im still trying to get the above code to pass the regression tests. Most
things go smoothly, but some tests throw out this kind of error:
"unknown scope for self in test_len(103) in C:\Documents and
Settings\Administrator\Desktop\python\Python-2.3b1\lib\test\test_builtin
.py
symbols: {}
locals: {}
globals: {}
"
Still trying to track down the source of this error. No idea why
symbols, locals and globals would all be empty at this point though.

Comments, suggestions, etc welcome.

- Damien Morton