[Python-Dev] Issue 14417: consequences of new dict runtime error

R. David Murray rdmurray at bitdance.com
Fri Mar 30 00:07:54 CEST 2012


On Thu, 29 Mar 2012 23:00:20 +0200, Stefan Behnel <stefan_ml at behnel.de> wrote:
> R. David Murray, 29.03.2012 22:31:
> > On Thu, 29 Mar 2012 13:09:17 -0700, Guido van Rossum wrote:
> >> On Thu, Mar 29, 2012 at 12:58 PM, R. David Murray wrote:
> >>> Some of us have expressed uneasiness about the consequences of dict
> >>> raising an error on lookup if the dict has been modified, the fix Victor
> >>> made to solve one of the crashers.
> >>>
> >>> I don't know if I speak for the others, but (assuming that I understand
> >>> the change correctly) my concern is that there is probably a significant
> >>> amount of threading code out there that assumes that dict *lookup* is
> >>> a thread-safe operation.  Much of that code will, if moved to Python
> >>> 3.3, now be subject to random runtime errors for which it will not
> >>> be prepared.  Further, code which appears safe can suddenly become
> >>> unsafe if a refactoring of the code causes an object to be stored in
> >>> the dictionary that has a Python equality method.
> >>
> >> My original assessment was that this only affects dicts whose keys
> >> have a user-implemented __hash__ or __eq__ implementation, and that
> >> the number of apps that use this *and* assume the threadsafe property
> >> would be pretty small. This is just intuition, I don't have hard
> >> facts. But I do want to stress that not all dict lookups automatically
> >> become thread-unsafe, only those that need to run user code as part of
> >> the key lookup.
> > 
> > You are probably correct, but the thing is that one still has to do the
> > code audit to be sure...and then make sure that no one later introduces
> > such an object type as a dict key.
> 
> The thing is: the assumption that arbitrary dict lookups are GIL-atomic has
> *always* been false. Only those that do not involve Python code execution
> for the hash key calculation or the object comparison are. That includes
> the built-in strings and numbers (and tuples of them), which are by far the
> most common dict keys. Looking up arbitrary user provided objects is
> definitely not guaranteed to be atomic.

Well, I'm afraid I was using the term 'thread safety' rather too loosely
there.  What I mean is that if you do a dict lookup, the lookup either
returns a value or a KeyError, and that if you get back an object that
object has internally consistent state.  The problem this fix introduces
is that the lookup may fail with a RuntimeError rather than a KeyError,
which it has never done before.
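
A minimal sketch of the concern, for illustration only. The class
SimulatedMutatedDict and the helper get_or_default are hypothetical names,
and the RuntimeError is raised artificially here to stand in for the
behavior under discussion; it is not a reproduction of the actual change.

```python
class SimulatedMutatedDict(dict):
    """Hypothetical stand-in: every lookup fails with RuntimeError,
    imitating a lookup that detected a concurrent modification."""

    def __getitem__(self, key):
        # Assumption: the message text is invented for this sketch.
        raise RuntimeError("simulated: dict modified during lookup")


def get_or_default(mapping, key, default=None):
    """Typical existing code: only KeyError is anticipated."""
    try:
        return mapping[key]
    except KeyError:
        return default


# Ordinary dicts behave as the caller expects.
found = get_or_default({"a": 1}, "a")
missing = get_or_default({}, "a", 0)

# With the simulated failure, the RuntimeError escapes the handler.
try:
    get_or_default(SimulatedMutatedDict(), "a")
    escaped = False
except RuntimeError:
    escaped = True
```

Code written this way has no handler for the new exception, so it would
propagate to a caller that was never written to expect it.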

I think that is what Guido means by code that uses objects with Python
__eq__/__hash__ *and* assumes threadsafe lookup.  If mutation of the objects
or dict during the lookup is a concern, then the code would use locks
and wouldn't have the problem.  But there are certainly situations
where it doesn't matter if the dictionary mutates during the lookup,
as long as you get either an object or a KeyError, and thus no locks are
(currently) needed.
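
To make the distinction concrete, here is a small sketch showing that a
lookup with a built-in string key runs no Python-level comparison, while a
lookup with a key that defines __eq__ in Python does.  NameKey and its
eq_calls counter are hypothetical names introduced only for this example.

```python
class NameKey:
    """Hypothetical key type with Python-level __hash__ and __eq__."""

    eq_calls = 0  # counts how often __eq__ runs

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        NameKey.eq_calls += 1
        return isinstance(other, NameKey) and self.name == other.name


plain = {"spam": 1}
custom = {NameKey("spam"): 1}

# Lookup with a built-in str key: no user code is involved.
value_plain = plain["spam"]
calls_after_plain = NameKey.eq_calls

# Lookup with an equal but distinct NameKey: the hashes match, the
# objects are not identical, so the dict has to call __eq__.
value_custom = custom[NameKey("spam")]
calls_after_custom = NameKey.eq_calls
```

Only the second kind of lookup gives another thread a window in which to
modify the dict, because only it executes Python code partway through.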

Maybe I'm being paranoid about breakage here, but as with most backward
compatibility concerns, there are probably more bits of code that will
be affected than our intuition indicates.

--David

