[Python-Dev] Nested scopes core dump

Michael Hudson mwh21@cam.ac.uk
20 Mar 2001 15:13:35 +0000


Does anyone know how to reply to two messages gracefully in gnus?

Guido van Rossum <guido@digicool.com> writes:

> > Maybe you could do the check for resize *after* the call to
> > insertdict?  I think that would work, but I wouldn't like to go
> > messing with such a performance critical bit of code without some
> > careful thinking.
>
> No, that could still decide to resize, couldn't it?

Yes, but not when you're inserting on a key that is already in the
dictionary - because the resize would have happened when the key was
inserted into the dictionary, and thus the problem we're seeing here
wouldn't happen.

What's happening in Ping's test case is that the dict is in some sense
being prepped to resize when an item is added but not actually
resizing until PyDict_SetItem is called again, which is unfortunately
inside a PyDict_Next loop.

Guido van Rossum <guido@digicool.com> writes:

> Ah, the solution is simple.  Check for identical keys only when about
> to resize:
> 
> 	/* if fill >= 2/3 size, double in size */
> 	if (mp->ma_fill*3 >= mp->ma_size*2) {
> 		***** test here *****
> 		if (dictresize(mp, mp->ma_used*2) != 0) {
> 			if (mp->ma_fill+1 > mp->ma_size)
> 				return -1;
> 		}
> 	}

This might also do nasty things to performance - this code path gets
travelled fairly often for small dicts.

Does anybody know the average (mean/mode/median) size for dicts in
a "typical" python program?

  -------

Using mal's pybench with and without the patch I posted shows a 0.30%
slowdown, including these interesting lines:

                  DictCreation:    1662.80 ms   11.09 us  +34.23%
        SimpleDictManipulation:     764.50 ms    2.55 us  -15.67%

DictCreation repeatedly creates dicts of size 0 and 3.
SimpleDictManipulation repeatedly adds six elements to a dict and then
deletes them again.

Dicts of size 3 are likely to be the worst case wrt. my patch; without
it, they will have a ma_fill of 3 and a ma_size of 4 (but calling
PyDict_SetItem again will trigger a resize - this is what happens in
Ping's example), but with my patch they will always have an ma_fill of
3 and a ma_size of 8.  Hence why the DictCreation is so much worse,
and why I asked the question about average dict sizes.

Mind you, 6 is a similar edge case, so I don't know why
SimpleDictManipulation does better.  Maybe something to do with
collisions or memory behaviour.

Cheers,
M.

-- 
  I don't remember any dirty green trousers.
                                             -- Ian Jackson, ucam.chat