[Python-ideas] + operator on generators
Steven D'Aprano
steve at pearwood.info
Sun Jun 25 23:23:36 EDT 2017
On Sun, Jun 25, 2017 at 02:06:54PM +0200, lucas via Python-ideas wrote:
> What about providing something like the following:
>
> a = (n for n in range(2))
> b = (n for n in range(2, 4))
> tuple(a + b) # -> 0 1 2 3
As Serhiy points out, this is going to conflict with the existing use of
the + operator for string and sequence concatenation.
I have a counter-proposal: introduce the iterator chaining operator "&":
    iterable & iterable --> itertools.chain(iterable, iterable)
The reason I choose & rather than + is that & is less likely to conflict
with any existing string/sequence types. None of the built-in or std lib
sequences that I can think of support the & operator.
Also, & is used for (string?) concatenation in some languages, such as
VB.Net, some BASIC dialects, Hypertalk, AppleScript, and Ada. Iterator
chaining is more like concatenation than (numeric) addition.
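For reference, here is the original example written with what is available today. This is only a sketch of the intended result: the `a & b` spelling mentioned in the comment is the proposed syntax and does not work in any current Python.

```python
import itertools

a = (n for n in range(2))
b = (n for n in range(2, 4))

# Today's spelling. Under the proposal, `a & b` would build the same chain.
result = tuple(itertools.chain(a, b))
print(result)  # (0, 1, 2, 3)
```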
However, the & operator is already used for bitwise-AND. Under my
proposal that behaviour will continue, and will take priority over
chaining. Currently, the & operator does something similar to (but
significantly more complex than) this:
    # simplified pseudo-code of existing behaviour
    if hasattr(x, '__and__'):
        return x.__and__(y)
    elif hasattr(y, '__rand__'):
        return y.__rand__(x)
    else:
        raise TypeError
The key is to insert the new behaviour after the existing __(r)and__
code, just before TypeError is raised:
    attempt existing __(r)and__ behaviour
    if and only if that fails to apply:
        return itertools.chain(iter(x), iter(y))
So classes that define a __(r)and__ method will keep their existing
behaviour.
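The dispatch order can be approximated in pure Python. This is a rough sketch, not how the interpreter would do it: `chain_and` is a hypothetical helper name, and catching TypeError is a simplification, since it would also swallow a TypeError raised from inside a user's own __and__ method.

```python
import itertools
import operator

def chain_and(x, y):
    """Hypothetical emulation of the proposed `x & y`."""
    try:
        # Existing behaviour: the normal __and__ / __rand__ dispatch.
        return operator.and_(x, y)
    except TypeError:
        # Proposed fall-back, reached only when neither operand supports &.
        # iter() raises TypeError itself if an operand is not iterable.
        return itertools.chain(iter(x), iter(y))

print(chain_and(6, 3))                   # 2, ints keep bitwise-AND
print(chain_and({1, 2}, {2, 3}))         # {2}, sets keep intersection
print(list(chain_and([1, 2], (3, 4))))   # [1, 2, 3, 4], lists fall back to chaining
```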
This implies that we cannot use & to chain sets and frozen sets, since
they already define __(r)and__. This has an easy work-around: just call
iter() on the set first.
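A small illustration of the difference, using only current Python. The chain call stands in for what `iter(a) & iter(b)` would mean under the proposal; that spelling is the proposed syntax and raises TypeError today.

```python
import itertools

a = {1, 2}
b = {2, 3}

# Sets define __and__, so & stays set intersection.
intersection = a & b

# Set iterators do not define __and__, so under the proposal
# `iter(a) & iter(b)` would be equivalent to this chain.
chained = list(itertools.chain(iter(a), iter(b)))

print(intersection)     # {2}
print(sorted(chained))  # [1, 2, 2, 3]
```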
Applying & to objects which don't define __(r)and__ and aren't iterable
will continue to raise TypeError, just as it does now. The only
backwards incompatibility this proposal introduces is for any code which
relies on `iterable & iterable` to raise TypeError. Frankly I can't
imagine that there is any such code, outside of the Python test suite,
but if there is, and people think it is worth it, we could make this a
__future__ import. But I think that's overkill.
The downside to this proposal is that it adds some conceptual complexity
to Python operators. Apart from `is` and `is not`, all Python operators
call one or more dunder methods. This is (as far as I know) the first
operator which has fall-back functionality if the dunder methods aren't
defined.
Up to now, I've talked about & chaining being equivalent to the
itertools.chain function. That glosses over one difference which
needs to be mentioned. The chain function currently doesn't attempt
to iterate over its arguments until needed:
    py> x = itertools.chain("a", 1, "c")
    py> next(x)
    'a'
    py> next(x)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable
Any proposal to change this behaviour for the itertools.chain
function should be kept separate from this one.
But for the & chaining operator, I think that behaviour must change: if
we have an operand that is neither iterable nor defines __(r)and__, the
& operator should fail early:
    [1, 2, 3] & None
should raise TypeError immediately, unlike itertools.chain().
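The contrast between the two behaviours can be sketched like this. `eager_chain` is a hypothetical helper used only for illustration; it stands in for the up-front check the & operator would perform, and is not an existing itertools function.

```python
import itertools

def eager_chain(*iterables):
    """Hypothetical helper: like itertools.chain, but checks operands up front."""
    # iter() raises TypeError immediately for a non-iterable operand.
    iterators = [iter(it) for it in iterables]
    return itertools.chain.from_iterable(iterators)

# itertools.chain accepts the bad operand silently; the error would only
# surface once 1, 2, 3 have been consumed.
lazy = itertools.chain([1, 2, 3], None)

# The eager version fails at construction time, before anything is consumed.
early_error = None
try:
    eager_chain([1, 2, 3], None)
except TypeError as exc:
    early_error = exc
```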
--
Steve