[Python-ideas] Infix functions

Sun Feb 23 22:59:25 CET 2014

From: Nathan Schneider <nathan at cmu.edu>
Sent: Sunday, February 23, 2014 1:06 PM

>On Sat, Feb 22, 2014 at 5:59 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
>
>From: Hannu Krosing <hannu at krosing.net>
>>
>>Sent: Saturday, February 22, 2014 11:34 AM
>>
>>
>>
>>> On 02/22/2014 03:45 AM, Andrew Barnert wrote:
>>
>>>>  Sure, having an infinite number of easily-distinguishable operators that
>>>> happen to have the precedence that fits your use case would be great. But we
>>>> have a very limited set of operators. If you want element-wise multiplication,
>>>> div, truediv, and mod, there aren't any operators left with the right
>>>> precedence for cross-multiplication, dot-product, and back-division. And even if
>>>> that weren't an issue, using % to mean mod sometimes and cross-product other
>>>> times is bound to be confusing to readers.
>>
>>
>>> Why not use multiple "operator characters" for user-defined infix
>>> operators, like postgreSQL does ?
>>>
>>> a *% b ++ c *% (d *. e)
>>
>>First, I'm not sure why everyone is focusing on the mathematical examples. 
>
>I have not seen a compelling use case for infix expressions other than mathematical operators which currently require a method, thereby forcing an order that does not match what we are used to writing. As has been argued, conciseness is a virtue for the readability of complex mathematical expressions.

For the mathematical case, are you a heavy user of NumPy, SymPy, or other such libraries who finds them unreadable because of this problem? Because my understanding is that most such people don't think it's a serious problem in their real work, which is why PEP 225 hasn't been taken up again. I believe the big trick is thinking of even vectors as multi-dimensional: Multiply two row-vectors and you get element-wise multiplication; multiply a row-vector and a column-vector and you get cross-product.

Beyond the mathematical case, let me try to explain again.

Why do people want an except expression, when you could just write it as a function? Because there are two problems with this:

    catch(lambda: 1/n, ZeroDivisionError, lambda e: nan)

First, you have to manually "delay" all the expressions by wrapping them in lambdas. Second, everything reads out of order—the important bit here is 1/n, not the fact that it's catching an error. That's why people want to write this:

    1/n except ZeroDivisionError as e: nan
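
For concreteness, here is a minimal sketch of what the function form of catch could look like in current Python. The signature (a zero-argument body, an exception type, and a one-argument handler) is only inferred from the call shown earlier; it is not an existing library function.

```python
import math


def catch(body, exc_type, handler):
    """Call body(); if it raises exc_type, return handler(exception) instead."""
    try:
        return body()
    except exc_type as e:
        return handler(e)


n = 0
# Both the expression and the fallback must be wrapped in lambdas,
# otherwise 1 / n would be evaluated (and raise) before catch is called.
result = catch(lambda: 1 / n, ZeroDivisionError, lambda e: math.nan)
```

Both complaints are visible here: every piece needs its own lambda, and the interesting expression is buried inside the argument list.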

But what if we had general solutions to both problems? Using Nick's short lambda syntax (actually just PEP 312 in this case) and backticks for infix operators, even though those may not be the best solutions:

    :1/n `catch` (ZeroDivisionError, :nan)

Perfect? No. Good enough that we wouldn't feel compelled to add new syntax to improve it? Maybe.
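
Neither the backtick syntax nor the short lambda exists, but the shape of the idea can be approximated today by overloading an existing operator. The sketch below uses the well-known trick of a wrapper object with `__ror__` and `__or__`, so that `left | op | right` calls a two-argument function. The `Infix` class and this two-argument `catch` are illustrative names of my own, not part of any proposal.

```python
import math


class Infix:
    """Wrap a two-argument function so it can be written as: left | op | right."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # Evaluates `left | op`: remember the left operand.
        return Infix(lambda right: self.func(left, right))

    def __or__(self, right):
        # Evaluates `(left | op) | right`: supply the right operand and call.
        return self.func(right)


@Infix
def catch(body, spec):
    """body is a zero-argument callable; spec is (exception type, fallback callable)."""
    exc_type, fallback = spec
    try:
        return body()
    except exc_type:
        return fallback()


n = 0
x = (lambda: 1 / n) | catch | (ZeroDivisionError, lambda: math.nan)
```

The expression now reads in the intended order, but the full-size lambdas and the extra parentheses they force show why the short-lambda half of the idea matters as much as the infix half.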

Obviously, if these two features together only solved one use case, and imperfectly at that, they wouldn't be worth it. But consider the with expression someone recently proposed:

    data = f.read() with open(path) as f

(Apologies for using a silly example; I can't remember what the supporting use cases were.) You can write that as a function, but it looks like this:

    data = withcontext(lambda f: f.read(), open(path))

But with short lambdas and infix functions, it's a lot better:

    data = :?.read() `withcontext` open(path)
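
For reference, the plain-function version of withcontext is only a few lines in current Python. This is a sketch under the assumption that the function takes a one-argument callable and a context manager, as in the call above; `io.StringIO` stands in for `open(path)` so that the example does not need a file on disk.

```python
import io


def withcontext(body, manager):
    """Enter manager, pass the value it binds to body, and return body's result."""
    with manager as value:
        return body(value)


buf = io.StringIO("hello")  # stand-in for open(path)
data = withcontext(lambda f: f.read(), buf)
```

The context manager is exited (here, the buffer is closed) before the result is returned, just as it would be with a with statement.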

Are there other "missing features" in Python that could be worked around with short lambdas and infix functions, instead of having to add new syntax for each one? (Could we have avoided some of the new features added in the past decade?) This is the part I'm not sure of. It may be that, at least for the kinds of programs people write today, Python has almost all of the constructions almost anyone needs, and we're better off filling in the last one or two gaps than looking for a general solution.

As a side note, Clojure uses a different trick in a similar way. For example, list comprehensions are plain old functions, just as in Racket and other Lisp descendants—but its Smalltalk-style infix parameter names make it look a lot more like Python/Haskell/Miranda's special syntax:

    [p**2 for p in primes while p<1000 if p%2]

    (for [p primes :when (odd? p) :while (p<1000)] p**2)
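
Note that the `while` clause in the Python line above is not valid syntax in current Python. A rough equivalent, sketched here with `itertools.takewhile`, looks like the following; the `primes` generator is a throwaway helper written only for this example.

```python
from itertools import count, takewhile


def primes():
    """Yield primes forever by trial division (slow, but fine for a demo)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


# takewhile plays the role of the `while` clause: it stops consuming the
# infinite generator at the first prime that is not below 1000.
squares = [p**2 for p in takewhile(lambda p: p < 1000, primes()) if p % 2]
```

Once again the stopping condition has to be wrapped in a lambda and moved out of reading order.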

And the same trick could work for a catch function to make it even nicer, and it could have covered ternary expressions such as the conditional:

    x = :1/n :catch ZeroDivisionError :then :NaN
    x = :n :then :1/n :orelse :NaN

… but I have absolutely no idea how to fit that idea into Python.

But back to your take on the mathematical case:

>So I think the best compromise would be to allow current binary operators to be augmented with a special character, such as ~ or : or `, such that (a) new ambiguities would not be introduced into the grammar, and (b) the precedence and dunder method name would be determined by the current operator. Whether the operator has been "modified" with the special character could be passed as an argument to the dunder method.

This is nearly identical to PEP 225, except that you're passing a "tilde-prefixed" flag to the existing dunder methods instead of doubling the number of dunder methods. I think in general it's better to have separate functions for separate purposes than to add flag parameters, but I can see the argument that this is a special case: likely all of the operators would treat that flag the same way, so it could just be a matter of putting, say, "if tilde: self = self.lift_and_transpose()" at the start of each method, or doing the equivalent with a decorator on each method, or dynamically applying that decorator to all of the methods, or …
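
To make the decorator variant concrete, here is a minimal sketch. Since no modified-operator syntax exists, the flag is passed by calling the dunder method explicitly. The `tilde_aware` decorator and the `Matrix` class are hypothetical, and this toy `lift_and_transpose` only transposes; what the flag should really mean would be up to each library.

```python
import functools


def tilde_aware(method):
    """When tilde=True, transform self before running the ordinary method."""
    @functools.wraps(method)
    def wrapper(self, other, tilde=False):
        if tilde:
            self = self.lift_and_transpose()
        return method(self, other)
    return wrapper


class Matrix:
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def lift_and_transpose(self):
        # Toy version: just swap rows and columns.
        return Matrix(zip(*self.rows))

    @tilde_aware
    def __mul__(self, other):
        # Element-wise product; both operands are assumed to have the same shape.
        return Matrix([[x * y for x, y in zip(r, s)]
                       for r, s in zip(self.rows, other.rows)])


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[10, 20], [30, 40]])
plain = a * b                      # ordinary operator, tilde defaults to False
flagged = a.__mul__(b, tilde=True)  # what a hypothetical `a ~* b` might call
```

The ordinary `*` keeps working unchanged because the flag defaults to False, which is the main attraction of the flag approach over adding a second set of dunder methods.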

>Using the special modifier character would tell the user that they have to refer to the library documentation to interpret it. In fact, I think builtins should be prohibited from supporting any of the modified operators, so as to avoid establishing a default interpretation for any of them.

Yes, PEP 225 has an extensive argument for why this is the best interpretation. (The short version is: Matlab vs. R. But there's more to it; it's worth reading.)

>>If you have a whole suite of free-form operators made up of symbol strings, I have absolutely no idea what *% means.

>Agreed, which is why I would be in favor of restricting ourselves to a single modifier character that is not already an operator.

Unfortunately, ~ _is_ already an operator, and can in fact be combined with the unary + and - operators:

    >>> ~3
    -4

    >>> ~-3
    2