Thread-safe way to add a key to a dict only if it isn't already there?

Sun Jul 8 09:38:17 EDT 2018

On Sun, 08 Jul 2018 14:11:58 +0300, Marko Rauhamaa wrote:

> Steven D'Aprano <steve+comp.lang.python at pearwood.info>:
>> Changing implementations from one which is thread safe to one which is
>> not can break people's code, and shouldn't be done on a whim.
>> Especially since such breakage could be subtle, hard to notice, harder
>> to track down, and even harder still to fix.
> 
> Java's HotSpot does it all the time, and it did result in code breakage
> -- although the code was broken to begin with.

I said "shouldn't be done". I wasn't claiming that no compiler does it 
right now.

But I'm willing to give a little bit of slack to aggressively optimizing 
compilers, provided they come with a warning.

>> So there is no coherent way to get a result of "impossible" from just
>> adding 1 to 1 in any coherent implementation of Python.
> 
> Back to Java, there was a real case of 64-bit integer operations not
> being atomic on 32-bit machines. Mixing up upper and lower halves
> between threads could result in really weird evaluations.

Oh don't get me wrong, I agree with you that threading can result in 
strange, unpredictable errors.

That's why I try not to use threading. I have no illusions about my 
ability to debug those sorts of problems.

> More importantly, this loop may never finish:
> 
>     # Initially
>     quit = False
> 
>     # Thread 1
>     global quit
>     while not quit:
>         time.sleep(1)
> 
>     # Thread 2
>     global quit
>     quit = True

Assuming that thread 2 actually runs *at some point*, I don't see how 
that loop can fail to terminate. Neither thread ever sets quit back to 
False, so provided thread 2 runs at all, thread 1 has to see quit become 
True and leave the loop.

I suppose if the threading implementation *could* fall back to sequential 
code (thread 2 doesn't run until thread 1 finishes, which it never 
does...) that outcome is possible. But it would have to be a pretty poor 
implementation.

Now if you said there were fifty threads (aside from the main thread, 
which is guaranteed to run) all reading quit, and only thread 50 ever 
assigns to it, I'd believe that perhaps thread 50 never gets a chance to 
run. But with just two threads? Explain please.
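
For what it's worth, here is a rough runnable sketch of the two-thread 
case. The names quit_flag, waiter and setter are mine, not Marko's (I 
renamed the flag so it doesn't shadow the builtin quit), and I shortened 
the sleep and added a join timeout so the script cannot hang. It only 
shows what I would expect from CPython; it is not a statement about what 
the language guarantees.

```python
import threading
import time

# Shared flag, renamed from "quit" to avoid shadowing the builtin.
quit_flag = False


def waiter():
    # Thread 1: poll the module-level flag until it becomes true.
    while not quit_flag:
        time.sleep(0.01)


def setter():
    # Thread 2: set the flag exactly once.
    global quit_flag
    quit_flag = True


# daemon=True so a stuck waiter could not keep the interpreter alive.
t1 = threading.Thread(target=waiter, daemon=True)
t2 = threading.Thread(target=setter)
t1.start()
t2.start()

t2.join()
t1.join(timeout=5)  # give up waiting after 5 seconds

# True if the polling loop saw the update and exited.
terminated = not t1.is_alive()
print("thread 1 terminated:", terminated)
```

If thread 1 never noticed the assignment, terminated would come out 
False after the timeout.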

> That's the reality in Java and C. I see no reason why that wouldn't be
> the reality in Python as well -- unless the language specification said
> otherwise.

Because the Python core developers care more about correctness than speed.

> Marko
> 
> PS My example with "impossible" being the result of a racy integer
> operation is of course unlikely but could be the outcome if the Python
> runtime reorganized its object cache on the fly (in a hypothetical
> implementation).

That would be a cache bug :-)

Not every interpreter bug should be considered the caller's fault :-)

-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson