[Python-ideas] Dict joining using + and +=

Steven D'Aprano steve at pearwood.info
Mon Mar 4 17:56:32 EST 2019


On Sat, Mar 02, 2019 at 11:14:18AM -0800, Raymond Hettinger wrote:

> If the existing code were in the form of "d=e.copy(); d.update(f); 
> d.update(g); d.update(h)", converting it to "d = e + f + g + h" would 
> be a tempting but algorithmically poor thing to do (because the 
> behavior is quadratic).

I mention this in the PEP. For dicts, as for lists and tuples but unlike 
strings, I don't expect that this will be a problem in practice:

- it's easy to put repeated string concatenation in a tight loop;
  it is harder to think of circumstances where one needs to
  concatenate lists or tuples, or merge dicts, in a tight loop;

- it's easy to have situations where one is concatenating thousands
  of strings; it's harder to imagine circumstances where one would be
  merging more than three or four dicts;

- concatenation s1 + s2 + ... for strings, lists or tuples results
  in a new object of length equal to the sum of the lengths of each
  of the inputs, so the output is constantly growing; but merging
  dicts d1 + d2 + ... results in an object of length equal to the
  number of unique keys, which is typically smaller than that sum.
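
The last point can be checked with a small sketch. It assumes a
hypothetical helper merge() standing in for the proposed d1 + d2 + ...
(the operator itself is not assumed to exist), and the sample dicts
e, f and g are made up for illustration:

```python
def merge(*dicts):
    """Hypothetical stand-in for d1 + d2 + ...: build a new dict by
    updating it with each input in turn, so the last seen value wins."""
    result = {}
    for d in dicts:
        result.update(d)
    return result


# Made-up sample data with overlapping keys.
e = {"a": 1, "b": 2}
f = {"b": 20, "c": 3}
g = {"a": 100, "c": 30}

merged = merge(e, f, g)

# List concatenation grows with the sum of the input lengths ...
concatenated = list(e) + list(f) + list(g)
total_input_length = len(e) + len(f) + len(g)

# ... while the merged dict has exactly one entry per unique key.
unique_keys = set(e) | set(f) | set(g)
```

Here the concatenated list has six items, but the merged dict has only
three, one for each unique key.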


> Most likely, the right thing to do would be 
> "d = ChainMap(e, f, g, h)" for a zero-copy solution or "d = 
> dict(ChainMap(e, f, g, h))" to flatten the result without incurring 
> quadratic costs.  Both of those are short and clear.

And both result in the opposite behaviour of what you probably intended 
if you were trying to match e + f + g + h. Dict merging/updating 
operates on "last seen wins", but ChainMap is "first seen wins". To get 
the same behaviour, we have to write the dicts in the opposite order 
compared to update, from most to least specific:

    # least specific to most specific
    prefs = site_defaults + user_defaults + document_prefs

    # most specific to least
    prefs = dict(ChainMap(document_prefs, user_defaults, site_defaults))

To me, the latter feels backwards: I'm applying document prefs first, and 
then trusting that the ChainMap doesn't overwrite them with the 
defaults. I know that's guaranteed behaviour, but every time I read it 
I'll feel the need to check :-)
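
For anyone who does want to check, here is a minimal sketch of the
ordering difference. The contents of the three dicts are invented for
illustration, and copy() plus update() stands in for the proposed +
operator:

```python
from collections import ChainMap

# Made-up preference dicts, from least to most specific.
site_defaults = {"font": "serif", "size": 10, "colour": "black"}
user_defaults = {"size": 12, "colour": "blue"}
document_prefs = {"colour": "red"}

# Update-style merge: least specific to most specific, last seen wins.
prefs_update = site_defaults.copy()
prefs_update.update(user_defaults)
prefs_update.update(document_prefs)

# ChainMap: most specific to least specific, first seen wins.
prefs_chain = dict(ChainMap(document_prefs, user_defaults, site_defaults))

# Passing the dicts to ChainMap in update order reverses the precedence,
# so the site defaults win over everything else.
prefs_reversed = dict(ChainMap(site_defaults, user_defaults, document_prefs))
```

The first two results compare equal, with colour "red" and size 12,
while the third keeps the site defaults for every key.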


> Lastly, I'm still bugged by use of the + operator for replace-logic 
> instead of additive-logic.  With numbers and lists and Counters, the 
> plus operator creates a new object where all the contents of each 
> operand contribute to the result.  With dicts, some of the contents 
> for the left operand get thrown-away.  This doesn't seem like addition 
> to me (IIRC that is also why sets have "|" instead of "+").

I'm on the fence here. Addition seems to be the most popular operator 
(it often gets requested) but you might be right that this is more like 
a union operation than a concatenation or addition operation. MRAB also 
suggested this earlier.
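
The distinction Raymond draws can be illustrated with a short sketch
using only existing operations. The sample values are made up, and
copy() plus update() again stands in for the proposed dict operator:

```python
from collections import Counter

# Additive logic: every element of each operand contributes to the result.
joined = [1, 2] + [2, 3]
counts = Counter(a=1, b=2) + Counter(b=3, c=4)

# Replace logic: for the shared key "b", the left value is thrown away.
left = {"a": 1, "b": 2}
right = {"b": 3, "c": 4}
merged = left.copy()
merged.update(right)

# The keys of the merged dict are the set union of the operands' keys.
keys = set(left) | set(right)
```

The Counter result has b == 5 (2 + 3), whereas the merged dict has
b == 3: the right operand's value replaces the left one.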

One point in favour of + is that it goes nicely with -, but on the other 
hand, sets have | and - with no + and that isn't a problem.


-- 
Steven

