[Python-ideas] Dict joining using + and +=

Mon Mar 4 08:29:06 EST 2019

01.03.19 21:31, Guido van Rossum wrote:
> On Thu, Feb 28, 2019 at 10:30 PM Serhiy Storchaka
> <storchaka at gmail.com> wrote:
>     And this opens a non-easy problem: how to create a mapping of the same
>     type? Not all mappings, and even not all dict subclasses have a copying
>     constructor.
> 
> 
> There's a possible compromise solution for this. We already do this for 
> Sequence and MutableSequence: Sequence does *not* define __add__, but 
> MutableSequence *does* define __iadd__, and the default implementation 
> just calls self.extend(other). I propose the same for Mapping (do 
> nothing) and MutableMapping: make the default __iadd__ implementation 
> call self.update(other).

This LGTM for mappings. But the problem with dict subclasses still 
exists. If we use the copy() method to create a copy, d1 + d2 will 
always return a dict (unless the plus operator or copy() is redefined 
in a subclass). If we use the constructor of the left argument's type, 
there will be problems with subclasses that have incompatible 
constructors (e.g. defaultdict).
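
A minimal sketch of both problems, assuming a hypothetical subclass 
MyDict that overrides neither copy() nor any operator. It only 
illustrates the two options above and is not a proposed implementation:

```python
from collections import defaultdict


class MyDict(dict):
    """Hypothetical dict subclass that does not override copy()."""


d1 = MyDict(a=1)
d2 = {"b": 2}

# Option 1: copy() followed by update(). dict.copy() returns a plain
# dict, so the subclass type of the left operand is lost.
merged = d1.copy()
merged.update(d2)
copy_result_type = type(merged)

# Option 2: call the constructor of the left operand's type.
# defaultdict expects a factory (a callable or None) as its first
# argument, so passing the mapping itself raises TypeError.
dd = defaultdict(list, {"a": [1]})
try:
    type(dd)(dd)
    constructor_error = None
except TypeError as exc:
    constructor_error = exc
```

Here copy_result_type is dict rather than MyDict, and constructor_error 
holds the TypeError raised by defaultdict.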

> Anyways, the main reason to prefer d1+d2 over {**d1, **d2} is that the 
> latter is highly non-obvious except if you've already encountered that 
> pattern before, while d1+d2 is what anybody familiar with other Python 
> collection types would guess or propose. And the default semantics for 
> subclasses of dict that don't override these are settled with the "d = 
> d1.copy(); d.update(d2)" equivalence.

Dicts are not like lists or deques, or even sets. Iterating dicts 
produces keys, but not values. The "in" operator tests a key, but not a 
value.
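
A small illustration of that difference, with made-up sample data:

```python
d = {"a": 1, "b": 2}
lst = [1, 2]

# Iterating a dict yields its keys; iterating a list yields its items.
dict_iteration = list(d)
list_iteration = list(lst)

# "in" tests keys for a dict, but items for a list.
key_found = "a" in d
value_found = 1 in d
item_found = 1 in lst
```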

It is not that I want to add an operator for dict merging, but dicts are 
more like sets than sequences: they cannot contain duplicate keys, and 
the size of the result of merging two dicts can be less than the sum of 
their sizes. Using "|" looks more natural to me than using "+". We 
should look at the discussions about using the "|" operator for sets. If 
the alternative of using "+" was considered there, I think the same 
arguments for preferring "|" for sets apply now to dicts.
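
A short sketch of the size argument, with made-up sample data. The 
merge is spelled with copy() and update() because no operator exists 
for it:

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 3, "c": 4}

# Merging dicts: the shared key "b" appears once, and the right-hand
# value wins.
merged = d1.copy()
merged.update(d2)

# Set union behaves the same way with respect to size.
union = set(d1) | set(d2)

# List concatenation always adds the lengths.
concatenated = list(d1) + list(d2)

sizes = (len(merged), len(union), len(concatenated))
```

With this data the merged dict and the set union both have 3 elements, 
while the concatenated list has 4.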

But is merging two dicts a common enough problem to justify introducing 
an operator for it? I need to merge dicts perhaps once or twice a 
year, and I am fine with using the update() method. Perhaps 
{**d1, **d2} is more appropriate in some cases, but I have not 
encountered such cases yet.
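
For comparison, a sketch of the two existing spellings. The names 
defaults and overrides are hypothetical sample data:

```python
defaults = {"host": "localhost", "port": 8000}
overrides = {"port": 9000}

# update() works in place, so make an explicit copy first to keep
# the original untouched.
config_update = dict(defaults)
config_update.update(overrides)

# {**d1, **d2} builds a new dict in a single expression.
config_unpack = {**defaults, **overrides}

same_result = config_update == config_unpack
```

Both spellings give the same result and leave defaults unchanged. The 
unpacking form is an expression, so it can be used where a statement 
cannot, for example directly as a function argument.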