[Python-Dev] New dictionaries patch on SF

M.-A. Lemburg mal@lemburg.com
Sat, 26 Aug 2000 11:56:20 +0200


"Fred L. Drake, Jr." wrote:
> 
>   I've been playing with dictionaries lately trying to stamp out a
> bug:
> 
> http://sourceforge.net/bugs/?func=detailbug&bug_id=112558&group_id=5470
> 
>   It looks like any fix that really works risks a fair bit of
> performance, and that's not good.  My best-effort fix so far is on
> SourceForge:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=101277&group_id=5470
> 
> but doesn't quite work, according to Guido (I've not yet received
> instructions from him about how to reproduce the observed failure).

The solution to all this is not easy, since dictionaries can
effectively also be used *after* interpreter finalization (no
thread state). The current PyErr_* APIs all rely on having the
thread state available, so the dictionary implementation would
have to add an extra check for the thread state.

All this will considerably slow down the interpreter and then
only to solve a rare problem... perhaps we should reenable
passing back exceptions viy PyDict_GetItem() instead ?!
This will slow down the interpreter too, but it'll at least
not cause the troubles with hacking the dictionary implementation
to handle exceptions during compares.

>   None the less, performance is an issue for dictionaries, so I came
> up with the idea to use a specialized version for string keys.  When I
> saw how few of the dictionaries created by the regression test ever
> had anything else, I tried to simply make all dictionaries the
> specialized variety (they can degrade themselves as needed).  What I
> found was that just over 2% of the dictionaries created by running the
> regression test ever held any non-string keys; this may be very
> different for "real" programs, but I'm curious about how different.
>   I've also done *no* performance testing on my patch for this yet,
> and don't expect it to be a big boost without something like the bug
> fix I mentioned above, but I could be wrong.  If anyone would like to
> play with the idea, I've posted my current patch at:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=101309&group_id=5470

I very much like the idea of having a customizable lookup
method for builtin dicts.

This would allow using more specific lookup function for
different tasks (it would even be possible switching the
lookup functions at run-time via a new dict method), e.g.
one could think of optimizing string lookups using a
predefined set of slots or by assuring that the stored
keys map 1-1 by using an additional hash value modifier
which is automatically tuned to assure this feature. This
would probably greatly speed up lookups for both successful and
failing searches.

We could add also add special lookup functions for keys
which are known not to raise exceptions during compares
(which is probably what motivated your patch, right ?)
and then fall back to a complicated and slow variant
for the general case.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/