Thread-safety of dict

Rhamphoryncus rhamph at gmail.com
Fri Jun 1 14:43:19 EDT 2007


On Jun 1, 3:51 am, Duncan Booth <duncan.bo... at invalid.invalid> wrote:
> "Adam Olsen" <rha... at gmail.com> wrote:
> > So there you have it: if you're using a dict with custom classes (or
> > anything other than str) across multiple threads, and without locking
> > it, it's possible (though presumably extremely rare) for a lookup to
> > fail even through the key was there the entire time.
>
> Nice work.
>
> It would be an interesting exercise to demonstrate this in practice, and I
> think it should be possible without resorting to threads (by putting
> something to simulate what the other thread would do into the __cmp__
> method).

I had attempted to do so, but I lost interest when I realized I'd have
to manipulate the memory allocator at the same time. ;)


> I don't understand your reasoning which says it cannot stay in
> ma_smalltable: PyDict_SetItem only calls dictresize when at least 2/3 of
> the slots are filled. You can have 5 items in the small (8 slot) table and
> the dictionary will resize to 32 slots on adding the 6th,the next resize
> comes when you add the 22nd item.

What you're missing is that a slot can be in any of 3 states:
1) Active.  A key is here.  If the current slot doesn't match
lookdict() will try the next one.
2) Dummy.  Used to be a key here, but now it's gone.  lookdict() will
try the next one.
3) NULL.  Never was a key here.  lookdict() will stop.

lookdict() needs the NULL slots to stop searching.  As keys are added
and removed from the table it will get filled with dummy slots, so
when the total number of active+dummy slots exceeds 2/3rds it will
trigger a resize (to the same size or even a smaller size!) so as to
clear out all the dummy slots (letting lookups finish sooner).

--
Adam Olsen, aka Rhamphoryncus




More information about the Python-list mailing list