[Python-ideas] Dict joining using + and +=
Steven D'Aprano
steve at pearwood.info
Mon Mar 4 17:56:32 EST 2019
On Sat, Mar 02, 2019 at 11:14:18AM -0800, Raymond Hettinger wrote:
> If the existing code were in the form of "d=e.copy(); d.update(f);
> d.update(g); d.update(h)", converting it to "d = e + f + g + h" would
> be a tempting but algorithmically poor thing to do (because the
> behavior is quadratic).
I mention this in the PEP. As with lists and tuples, and unlike strings,
I don't expect that this will be a problem in practice:
- it's easy to put repeated string concatenation in a tight loop;
  it is harder to think of circumstances where one needs to
  concatenate lists or tuples, or merge dicts, in a tight loop;
- it's easy to have situations where one is concatenating thousands
  of strings; it's harder to imagine circumstances where one would be
  merging more than three or four dicts;
- concatenation s1 + s2 + ... for strings, lists or tuples results
  in a new object of length equal to the sum of the lengths of each
  of the inputs, so the output is constantly growing; but merging
  dicts d1 + d2 + ... typically results in a smaller object of
  length equal to the number of unique keys.
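
A rough sketch of that last point, and of the two spellings Raymond compares. Since + on dicts is only being proposed in this thread, merged() below is a hypothetical stand-in for it, and the dict contents are made up for illustration:

```python
# Hypothetical stand-in for the proposed "left + right":
# build a new dict, with the right-hand operand winning on shared keys.
def merged(left, right):
    result = left.copy()
    result.update(right)
    return result

# Made-up sample data with overlapping keys.
e = {"a": 1, "b": 2}
f = {"b": 20, "c": 3}
g = {"a": 100, "c": 30}
h = {"d": 4}

# What "e + f + g + h" would do: every step copies the
# intermediate result again, which is where the quadratic cost comes from.
chained = merged(merged(merged(e, f), g), h)

# The copy-once spelling from the quoted message.
d = e.copy()
d.update(f)
d.update(g)
d.update(h)

total_items = len(e) + len(f) + len(g) + len(h)        # 7 items went in
unique_keys = len(set(e) | set(f) | set(g) | set(h))   # 4 distinct keys

print(chained == d)                 # True: same result either way
print(len(chained), total_items)    # 4 7
```

Both spellings give the same dict; only the amount of copying differs, and with a handful of small dicts that difference is unlikely to be measurable.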
> Most likely, the right thing to do would be
> "d = ChainMap(e, f, g, h)" for a zero-copy solution or "d =
> dict(ChainMap(e, f, g, h))" to flatten the result without incurring
> quadratic costs. Both of those are short and clear.
And both result in the opposite behaviour of what you probably intended
if you were trying to match e + f + g + h. Dict merging/updating
operates on "last seen wins", but ChainMap is "first seen wins". To get
the same behaviour, we have to write the dicts in opposite order
compared to update, from most to least specific:
    # least specific to most specific
    prefs = site_defaults + user_defaults + document_prefs

    # most specific to least
    prefs = dict(ChainMap(document_prefs, user_defaults, site_defaults))
To me, the latter feels backwards: I'm applying document prefs first, and
then trusting that the ChainMap doesn't overwrite them with the
defaults. I know that's guaranteed behaviour, but every time I read it
I'll feel the need to check :-)
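
For anyone who wants to check, here is a small sketch of the ordering difference using only what exists today (copy/update and collections.ChainMap). The preference values are invented for the example:

```python
from collections import ChainMap

# Invented sample preferences, from least to most specific.
site_defaults = {"font": "serif", "size": 10, "colour": "black"}
user_defaults = {"size": 12, "colour": "blue"}
document_prefs = {"colour": "red"}

# update() is "last seen wins": list the dicts from least to most specific.
by_update = site_defaults.copy()
by_update.update(user_defaults)
by_update.update(document_prefs)

# ChainMap is "first seen wins": list the dicts from most to least specific.
by_chainmap = dict(ChainMap(document_prefs, user_defaults, site_defaults))

# Keeping the update() order in ChainMap lets the site defaults win instead.
same_order = dict(ChainMap(site_defaults, user_defaults, document_prefs))

print(by_update == by_chainmap)     # True
print(same_order == site_defaults)  # True: nothing was overridden
```

So the ChainMap spelling matches the merge only when the arguments are reversed; passing them in update() order silently gives the least specific values.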
> Lastly, I'm still bugged by use of the + operator for replace-logic
> instead of additive-logic. With numbers and lists and Counters, the
> plus operator creates a new object where all the contents of each
> operand contribute to the result. With dicts, some of the contents
> for the left operand get thrown-away. This doesn't seem like addition
> to me (IIRC that is also why sets have "|" instead of "+").
I'm on the fence here. Addition seems to be the most popular operator
(it often gets requested) but you might be right that this is more like
a union operation than a concatenation or addition operation. MRAB also
suggested this earlier.
One point in its favour is that + goes nicely with - but on the other
hand, sets have | and - with no + and that isn't a problem.
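
To make the additive versus replace distinction concrete, here is a short sketch with existing types. The {**d1, **d2} spelling is used as a stand-in for the proposed dict +, and the sample values are made up:

```python
from collections import Counter

# Counter + is additive: counts for shared keys are summed.
c1 = Counter(a=1, b=2)
c2 = Counter(b=3, c=4)
added = c1 + c2

# Dict merging is replace-logic: for the shared key "b",
# the left operand's value is discarded.
d1 = {"a": 1, "b": 2}
d2 = {"b": 3, "c": 4}
replaced = {**d1, **d2}

# Sets spell the analogous operations as | and -, with no + at all.
s1 = {"a", "b"}
s2 = {"b", "c"}
union = s1 | s2
difference = s1 - s2

print(added["b"])     # 5
print(replaced["b"])  # 3
print(difference)     # {'a'}
```

The value for "b" is where the two behaviours part ways: the Counter keeps a contribution from both operands, while the merged dict keeps only the right-hand one.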
--
Steven