[Python-ideas] thread safe dictionary initialisation from mappings: dict(non_dict_mapping)

Anselm Kruis a.kruis at science-computing.de
Tue Nov 20 10:33:53 CET 2012


On 19.11.2012 22:09, Terry Reedy wrote:
> On 11/19/2012 12:24 PM, Anselm Kruis wrote:
>> Hello,
>>
>> I found the following annoying behaviour of dict(non_dict_mapping) and
>> dict.update(non_dict_mapping), if non_dict_mapping implements
>> collections.abc.Mapping but is not an instance of dict. In this case the
>> implementations of dict() and dict.update() use PyDict_Merge(PyObject
>> *a, PyObject *b, int override).
>>
>> The essential part of PyDict_Merge(a,b, override) is
>>
>> # update dict a with the content of mapping b.
>> keys = b.keys()
>> for key in keys:
>>     ...
>>     a[key] = b.__getitem__(key)
>>
>> This algorithm is susceptible to race conditions, if a second thread
>> modifies the source mapping b between "b.keys()" and b.__getitem__(key):
>> - If the second thread deletes an item from b, PyDict_Merge fails with a
>> KeyError exception.
>> - If the second thread inserts a new value and then modifies an existing
>> value, a contains the modified value but not the new value.
>
> It is well-known that mutating a collection while iterating over it can
> lead to unexpected or undesired behavior, including exceptions. This is
> not limited to updating a dict from a non-dict source. The generic answer
> is Don't Do That.

Actually that's not the case here: the implementation of dict does not 
iterate over the collection while another thread mutates the collection. 
It iterates over a list of the keys and this list does not change.
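
A minimal sketch of both failure modes, without real threads so that it is
reproducible. RacyMapping and its "interfere" hook are hypothetical names
made up for this illustration; the hook stands in for the second thread and
runs once, after dict() has already fetched the key list:

```python
from collections.abc import Mapping


class RacyMapping(Mapping):
    """Hypothetical non-dict mapping whose keys() returns a snapshot list."""

    def __init__(self, data):
        self._data = dict(data)
        # Stand-in for the second thread: runs once, inside the first
        # __getitem__ call, i.e. after keys() has already been taken.
        self.interfere = None

    def __getitem__(self, key):
        hook, self.interfere = self.interfere, None
        if hook is not None:
            hook(self._data)
        return self._data[key]

    def __iter__(self):
        return iter(list(self._data))

    def __len__(self):
        return len(self._data)

    def keys(self):
        return list(self._data)


def delete_b(data):
    del data["b"]


def insert_c_then_modify_b(data):
    data["c"] = 3
    data["b"] = 20


# Case 1: an item is deleted after the key list was taken.
src = RacyMapping({"a": 1, "b": 2})
src.interfere = delete_b
try:
    dict(src)
    failed_key = None
except KeyError as exc:
    failed_key = exc.args[0]

# Case 2: a key is inserted, then an existing value is modified.
src = RacyMapping({"a": 1, "b": 2})
src.interfere = insert_c_then_modify_b
merged = dict(src)
```

In case 1 the copy fails on the deleted key "b". In case 2 the copy
contains the modified value of "b" but not the new key "c", a state the
source mapping never had.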

>
>> Of course the current behaviour is the best you can get with a "minimum
>> mapping interface".
>
> To me, if you know that the source in d.update(source) is managed (and
> mutated) in another thread, the obvious solution (to Not Do That) is to
> lock the source. This should work for any source and for any similar
> operation. What am I missing?

> Instead, you propose to add a specialized, convoluted method that only
> works for updates of dicts by non_dict_mappings that happen to have a
> new and very specialized magic method that automatically does the lock.
> Sorry, I don't see the point. It is not at all a generic solution to a
> generic problem.

What is missing is automatic locking. For list- and set-like collections
it is already possible to implement this kind of automatic locking, because
iterating over them returns the complete information. Mappings are
special because their content consists of key-value items.
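
To illustrate the difference, here is a rough sketch under my own
assumptions. LockedList and LockedMapping are hypothetical classes, not
existing library types. The list-like class can hand out a consistent
snapshot from __iter__ alone. The mapping can lock each call, but dict(m)
still performs keys() and __getitem__ as separate locked steps, so the
caller has to ask for the items explicitly to get a consistent copy:

```python
import threading
from collections.abc import Mapping


class LockedList:
    """Hypothetical list-like container with automatic locking."""

    def __init__(self, items=()):
        self._lock = threading.Lock()
        self._items = list(items)

    def append(self, item):
        with self._lock:
            self._items.append(item)

    def __iter__(self):
        # One locked copy carries everything a consumer needs.
        with self._lock:
            snapshot = list(self._items)
        return iter(snapshot)


class LockedMapping(Mapping):
    """Hypothetical mapping that locks every individual operation."""

    def __init__(self, data=()):
        self._lock = threading.RLock()
        self._data = dict(data)

    def __getitem__(self, key):
        with self._lock:
            return self._data[key]

    def __iter__(self):
        with self._lock:
            return iter(list(self._data))

    def __len__(self):
        with self._lock:
            return len(self._data)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def items(self):
        # Assumption of this sketch: return a plain list of pairs taken
        # under a single lock instead of the usual ItemsView.
        with self._lock:
            return list(self._data.items())


numbers = LockedList([1, 2])
numbers.append(3)
numbers_copy = list(numbers)       # consistent: one locked snapshot

table = LockedMapping({"x": 1})
table.set("y", 2)
table_copy = dict(table.items())   # consistent: pairs taken under one lock
```

Note that dict(table) would not use items(); it goes through keys() and
__getitem__ and therefore stays open to the race described above.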

Whether automatic locking of a collection is the right solution to a
particular problem depends on the problem. There are problems where
automatic locking is the best choice. I think Python should support it.

(Whether my particular application belongs to this class of problems is
another question and not relevant here.)

Regards
   Anselm

-- 
  Dipl. Phys. Anselm Kruis                       science + computing ag
  Senior Solution Architect                      Ingolstädter Str. 22
  email A.Kruis at science-computing.de             80807 München, Germany
  phone +49 89 356386 874  fax 737               www.science-computing.de



