[Python-ideas] Python 4: Concatenation

Chris Angelico rosuav at gmail.com
Fri Jun 30 00:00:22 EDT 2017


On Fri, Jun 30, 2017 at 12:14 PM, Soni L. <fakedme+py at gmail.com> wrote:
> This isn't a *major* backwards incompatibility. Unlike with unicode/strings,
> a dumb static analysis program can trivially replace + with the
> concatenation operator, whatever that may be. Technically, nothing forces us
> to remove + from strings and such and the itertools stuff - we could just
> make them deprecated in python 4, and remove them in python 5.

It wouldn't be quite that trivial though. If all you do is replace "+"
with "&", you've broken anything that uses numeric addition. Since
this is a semantic change, you can't defer it to run-time (the way a
JIT compiler like PyPy could), and you can't afford to have it be
"mostly right but might have edge cases" like something based on type
hints would be. So there'd be some human work involved, as with the
bytes/text distinction. What you're wanting to do is take one operator
("+") and split it into two roles (addition and concatenation). That
means getting into the programmer's head, so it can't be completely
automated.
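
To make the ambiguity concrete, here is a small sketch. The function
name "total" is made up for illustration and isn't from anyone's real
code. The same "+" on the same source line is addition for one caller
and concatenation for another, so a rewriter that only sees the text
can't tell which operator to emit:

```python
def total(items, start):
    """Fold items onto start with '+'. Nothing here says which kind of '+'."""
    result = start
    for item in items:
        # Addition or concatenation? It depends on what the caller passed in.
        result = result + item
    return result

numeric = total([1, 2, 3], 0)         # numeric addition
joined = total(["a", "b", "c"], "")   # string concatenation
merged = total([[1], [2, 3]], [])     # list concatenation
```

Replacing that "+" with a concatenation operator breaks the first call;
leaving it alone breaks the other two once "+" stops concatenating.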

Even if it CAN be fully automated, though, what would you gain? You've
made str+str no longer valid - to what end?

Here's a counter-proposal: Start with your step 2, and create a new
__concat__ magic method and corresponding operator. Then str gets a
special case:

class str(str): # let's pretend
    def __concat__(self, other):
        return self + str(other)

And tuple gets a special case:

class tuple(tuple): # pretend again
    def __concat__(self, other):
        return (*self, *other)

And maybe a few others (list, set, possibly dict). Everything else
falls back to object:

import itertools

class object(object): # mind-bending
    def __concat__(self, other):
        return itertools.chain(iter(self), iter(other))
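
Since the builtins can't actually be patched like that, here is a rough
way to play with the idea today. The function "concat" is a hypothetical
stand-in for the not-yet-chosen operator, and the lookup order is my
assumption about how dispatch would work, not a spec:

```python
import itertools

def concat(left, right):
    """Hypothetical stand-in for the proposed concatenation operator."""
    # A type that defines __concat__ gets first say (looked up on the
    # type, the way Python looks up other special methods).
    method = getattr(type(left), "__concat__", None)
    if method is not None:
        return method(left, right)
    # Special cases mirroring the pretend str and tuple methods.
    if isinstance(left, str):
        return left + str(right)
    if isinstance(left, tuple):
        return (*left, *right)
    # Default: lazily chain any two iterables.
    return itertools.chain(iter(left), iter(right))

greeting = concat("spam", 42)
pair = concat((1, 2), [3])
lazy = list(concat(range(2), "xy"))
```

Note that the str case coerces its right-hand side, which "+" refuses
to do, and that the default case returns a lazy iterator rather than a
new container.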

Since this isn't *changing* the meaning of anything, it's backwards
compatible. You gain an explicit concatenation operator, the default
case is handled by Python's standard mechanisms, the special cases are
handled by Python's standard mechanisms, and it's all exactly what
people would expect. Then the use of '+' to concatenate strings can be
deprecated without removal (or, more likely, kept fully supported by
the language but deprecated in style guides), and you've mostly
achieved what you sought.

Your challenge: Find a suitable operator to use. It wants to be ASCII,
and it has to be illegal syntax in current Python versions. It doesn't
have to be a single character, but it should be short (two is okay,
three is the number thou shalt stop at, four thou shalt not count, and
five is right out) and easily typed, since string concatenation is
incredibly common. It should ideally evoke "concatenation", but that
isn't strictly necessary (the link between "@" and "matrix
multiplication" is tenuous at best).
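
A quick way to screen candidates against the "illegal in current
Python" rule is to ask the compiler. This is only a sketch: the helper
name "is_illegal_today" is invented here, and the answer depends on
which interpreter version you run it under.

```python
def is_illegal_today(op):
    """Return True if 'a <op> b' fails to compile on this interpreter."""
    try:
        compile(f"a {op} b", "<candidate>", "eval")
    except SyntaxError:
        return True
    return False

candidates = ["+", "&", "++", "..", "$"]
usable = [op for op in candidates if is_illegal_today(op)]
```

"++" looks tempting but is out, since "a ++ b" already parses as
"a + (+b)".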

Good luck. :) For my part, I'm -0.5 on my own counter-proposal, but
that's a fair slab better than the -1000 that I am on the version that
breaks backward compatibility for minimal real gain.

ChrisA

