[Numpy-discussion] NumPy-Discussion Digest, Vol 90, Issue 45

Colin J. Williams cjwilliams43 at gmail.com
Sun Mar 16 12:37:14 EDT 2014


I would like to see the case made for @.  Yes, I know that Guido has 
accepted the idea, but he has changed his mind before.

The PEP seems neutral on retaining both np.matrix and @.  Nearly ten 
years ago, Tim Peters <http://legacy.python.org/dev/peps/pep-0020/> gave us:

    There should be one-- and preferably only one --obvious way to do it.

We now have:

    C = A * B    where A and B are Matrix instances of shapes (m, n) and
    (n, p) respectively; C becomes an instance of the Matrix class, of
    shape (m, p). (Actually, the rules are a little more general than
    the above.)

The PEP proposes that

    C = A @ B    where the types or classes of A, B and C are not clear.

We also have A.I for the inverse (of a square matrix) and A.T for the 
transpose of a matrix.

The Zen of Python recommends one obvious way. Of the two above, which 
is the obvious one?
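
For concreteness, here is a minimal sketch of the two styles side by
side (the commented @ line assumes the PEP is implemented):

    import numpy as np

    A = np.matrix(np.random.rand(3, 4))
    B = np.matrix(np.random.rand(4, 2))
    C = A * B              # np.matrix: * is matrix multiply; C is (3, 2)
    D = (A * A.T).I        # .I: inverse of a square matrix
    E = A.T                # .T: transpose

    A2, B2 = np.asarray(A), np.asarray(B)
    C2 = np.dot(A2, B2)    # ndarray today: dot() for matrix multiply
    # C2 = A2 @ B2         # ndarray under the PEP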

Colin W.


On 15-Mar-2014 9:25 PM, numpy-discussion-request at scipy.org wrote:
>
> Today's Topics:
>
>     1. Re: [help needed] associativity and precedence	of '@'
>        (josef.pktd at gmail.com)
>     2. Re: [RFC] should we argue for a matrix power operator, @@?
>        (josef.pktd at gmail.com)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sat, 15 Mar 2014 21:20:40 -0400
> From: josef.pktd at gmail.com
> Subject: Re: [Numpy-discussion] [help needed] associativity and
> 	precedence	of '@'
> To: Discussion of Numerical Python <numpy-discussion at scipy.org>
>
> On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith <njs at pobox.com> wrote:
>
>> Hi all,
>>
>> Here's the main blocker for adding a matrix multiply operator '@' to
>> Python: we need to decide what we think its precedence and associativity
>> should be. I'll explain what that means so we're on the same page, and what
>> the choices are, and then we can all argue about it. But even better would
>> be if we could get some data to guide our decision, and this would be a lot
>> easier if some of you all can help; I'll suggest some ways you might be
>> able to do that.
>>
>> So! Precedence and left- versus right-associativity. If you already know
>> what these are you can skim down until you see CAPITAL LETTERS.
>>
>> We all know what precedence is. Code like this:
>>    a + b * c
>> gets evaluated as:
>>    a + (b * c)
>> because * has higher precedence than +. It "binds more tightly", as they
>> say. Python's complete precedence table is here:
>>    http://docs.python.org/3/reference/expressions.html#operator-precedence
>>
>> Associativity, in the parsing sense, is less well known, though it's just
>> as important. It's about deciding how to evaluate code like this:
>>    a * b * c
>> Do we use
>>    a * (b * c)    # * is "right associative"
>> or
>>    (a * b) * c    # * is "left associative"
>> ? Here all the operators have the same precedence (because, uh... they're
>> the same operator), so precedence doesn't help. And mostly we can ignore
>> this in day-to-day life, because both versions give the same answer, so who
>> cares. But a programming language has to pick one (consider what happens if
>> one of those objects has a non-default __mul__ implementation). And of
>> course it matters a lot for non-associative operations like
>>    a - b - c
>> or
>>    a / b / c
>> So when figuring out order of evaluations, what you do first is check the
>> precedence, and then if you have multiple operators next to each other with
>> the same precedence, you check their associativity. Notice that this means
>> that if you have different operators that share the same precedence level
>> (like + and -, or * and /), then they have to all have the same
>> associativity. All else being equal, it's generally considered nice to have
>> fewer precedence levels, because these have to be memorized by users.
>>
>> Right now in Python, every precedence level is left-associative, except
>> for '**'. If you write these formulas without any parentheses, then what
>> the interpreter will actually execute is:
>>    (a * b) * c
>>    (a - b) - c
>>    (a / b) / c
>> but
>>    a ** (b ** c)
>>
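>> For instance, here's a minimal sketch (a hypothetical Tracer class,
>> purely for illustration) that makes the evaluation order visible:
>>
>>     class Tracer(object):
>>         def __init__(self, name):
>>             self.name = name
>>         def __mul__(self, other):
>>             print("evaluating (%s * %s)" % (self.name, other.name))
>>             return Tracer("(%s * %s)" % (self.name, other.name))
>>         def __pow__(self, other):
>>             print("evaluating (%s ** %s)" % (self.name, other.name))
>>             return Tracer("(%s ** %s)" % (self.name, other.name))
>>
>>     a, b, c = Tracer("a"), Tracer("b"), Tracer("c")
>>     a * b * c    # prints (a * b) first: * is left-associative
>>     a ** b ** c  # prints (b ** c) first: ** is right-associative
>>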
>> Okay, that's the background. Here's the question. We need to decide on
>> precedence and associativity for '@'. In particular, there are three
>> different options that are interesting:
>>
>> OPTION 1 FOR @:
>> Precedence: same as *
>> Associativity: left
>> My shorthand name for it: "same-left" (yes, very creative)
>>
>> This means that if you don't use parentheses, you get:
>>     a @ b @ c  ->  (a @ b) @ c
>>     a * b @ c  ->  (a * b) @ c
>>     a @ b * c  ->  (a @ b) * c
>>
>> OPTION 2 FOR @:
>> Precedence: more-weakly-binding than *
>> Associativity: right
>> My shorthand name for it: "weak-right"
>>
>> This means that if you don't use parentheses, you get:
>>     a @ b @ c  ->  a @ (b @ c)
>>     a * b @ c  ->  (a * b) @ c
>>     a @ b * c  ->  a @ (b * c)
>>
>> OPTION 3 FOR @:
>> Precedence: more-tightly-binding than *
>> Associativity: right
>> My shorthand name for it: "tight-right"
>>
>> This means that if you don't use parentheses, you get:
>>     a @ b @ c  ->  a @ (b @ c)
>>     a * b @ c  ->  a * (b @ c)
>>     a @ b * c  ->  (a @ b) * c
>>
>> We need to pick which of these options we think is best, based on whatever
>> reasons we can think of, ideally more than "hmm, weak-right gives me warm
>> fuzzy feelings" ;-). (In principle the other 2 possible options are
>> tight-left and weak-left, but there doesn't seem to be any argument in
>> favor of either, so we'll leave them out of the discussion.)
>>
>> Some things to consider:
>>
>> * and @ are actually not associative (in the math sense) with respect to
>> each other, i.e., (a * b) @ c and a * (b @ c) in general give different
>> results when 'a' is not a scalar. So considering the two expressions 'a * b
>> @ c' and 'a @ b * c', we can see that each of these three options
>> produces different results in some cases.
>>
>> "Same-left" is the easiest to explain and remember, because it's just, "@
>> acts like * and /". So we already have to know the rule in order to
>> understand other non-associative expressions like a / b / c or a - b - c,
>> and it'd be nice if the same rule applied to things like a * b @ c so we
>> only had to memorize *one* rule. (Of course there's ** which uses the
>> opposite rule, but I guess everyone internalized that one in secondary
>> school; that's not true for * versus @.) This is definitely the default we
>> should choose unless we have a good reason to do otherwise.
>>
>> BUT: there might indeed be a good reason to do otherwise, which is the
>> whole reason this has come up. Consider:
>>      Mat1 @ Mat2 @ vec
>> Obviously this will execute much more quickly if we do
>>      Mat1 @ (Mat2 @ vec)
>> because that results in two cheap matrix-vector multiplies, while
>>      (Mat1 @ Mat2) @ vec
>> starts out by doing an expensive matrix-matrix multiply. So: maybe @
>> should be right associative, so that we get the fast behaviour without
>> having to use explicit parentheses! /If/ these kinds of expressions are
>> common enough that having to remember to put explicit parentheses in all
>> the time is more of a programmer burden than having to memorize a special
>> associativity rule for @. Obviously Mat @ Mat @ vec is more common than vec
>> @ Mat @ Mat, but maybe they're both so rare that it doesn't matter in
>> practice -- I don't know.
>>
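>> (To make the cost difference concrete, here's a rough timing sketch using
>> today's np.dot -- exact numbers will vary by machine:)
>>
>>     import numpy as np
>>     from timeit import timeit
>>
>>     n = 1000
>>     Mat1, Mat2 = np.random.rand(n, n), np.random.rand(n, n)
>>     vec = np.random.rand(n)
>>
>>     # (Mat1 @ Mat2) @ vec: one O(n**3) matrix-matrix product
>>     print(timeit(lambda: np.dot(np.dot(Mat1, Mat2), vec), number=10))
>>     # Mat1 @ (Mat2 @ vec): two O(n**2) matrix-vector products -- much faster
>>     print(timeit(lambda: np.dot(Mat1, np.dot(Mat2, vec)), number=10))
>>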
>> Also, if we do want @ to be right associative, then I can't think of any
>> clever reasons to prefer weak-right over tight-right, or vice-versa. For
>> the scalar multiplication case, I believe both options produce the same
>> result in the same amount of time. For the non-scalar case, they give
>> different answers. Do people have strong intuitions about what expressions
>> like
>>    a * b @ c
>>    a @ b * c
>> should do actually? (I'm guessing not, but hey, you never know.)
>>
>> And, while intuition is useful, it would be really *really* nice to be
>> basing these decisions on more than *just* intuition, since whatever we
>> decide will be subtly influencing the experience of writing linear algebra
>> code in Python for the rest of time. So here's where I could use some help.
>> First, of course, if you have any other reasons why one or the other of
>> these options is better, then please share! But second, I think we need to
>> know something about how often the Mat @ Mat @ vec type cases arise in
>> practice. How often do non-scalar * and np.dot show up in the same
>> expression? How often does it look like a * np.dot(b, c), and how often
>> does it look like np.dot(a * b, c)? How often do we see expressions like
>> np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a,
>> np.dot(b, c))? This would really help guide the debate. I don't have this
>> data, and I'm not sure the best way to get it. A super-fancy approach would
>> be to write a little script that uses the 'ast' module to count things
>> automatically. A less fancy approach would be to just pick some code you've
>> written, or a well-known package, grep through for calls to 'dot', and make
>> notes on what you see. (An advantage of the less-fancy approach is that as
>> a human you might be able to tell the difference between scalar and
>> non-scalar *, or check whether it actually matters what order the 'dot'
>> calls are done in.)
>>
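>> (In case it helps anyone get started, here's a rough sketch of the fancy
>> approach -- the DotCounter class and its category names are made up, and a
>> real survey would want finer categories:)
>>
>>     import ast, sys
>>
>>     class DotCounter(ast.NodeVisitor):
>>         # counts np.dot()/a.dot() calls and how they combine
>>         def __init__(self):
>>             self.counts = {"dot": 0, "dot_of_dot": 0, "mul_inside_dot": 0}
>>         def _is_dot(self, node):
>>             return (isinstance(node, ast.Call)
>>                     and isinstance(node.func, ast.Attribute)
>>                     and node.func.attr == "dot")
>>         def visit_Call(self, node):
>>             if self._is_dot(node):
>>                 self.counts["dot"] += 1
>>                 for arg in node.args:
>>                     if self._is_dot(arg):
>>                         self.counts["dot_of_dot"] += 1
>>                     elif (isinstance(arg, ast.BinOp)
>>                           and isinstance(arg.op, ast.Mult)):
>>                         self.counts["mul_inside_dot"] += 1
>>             self.generic_visit(node)
>>
>>     counter = DotCounter()
>>     for fname in sys.argv[1:]:
>>         counter.visit(ast.parse(open(fname).read()))
>>     print(counter.counts)
>>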
>> -n
>>
>> --
>> Nathaniel J. Smith
>> Postdoctoral researcher - Informatics - University of Edinburgh
>> http://vorpus.org
>>
>>
>>
> I'm in favor of same-left because it's the easiest to remember.
> With scalar factors it is how I read formulas.
>
> Both calculating dot @ first and calculating elementwise * first sound
> logical, but I wouldn't know which should go first. (My "feeling" would be
> @ first.)
>
>
> two cases I remembered in statsmodels
> H = np.dot(results.model.pinv_wexog, scale[:,None] *
> results.model.pinv_wexog.T)
> se = (exog * np.dot(covb, exog.T).T).sum(1)
>
> we are mixing * and dot pretty freely in all combinations AFAIR
>
> My guess is that I wouldn't trust any sequence without parentheses for a
> long time.
> (And I don't trust a sequence of dots @ without parentheses either, in our
> applications.)
>
> x @ (W.T @ W) @ x      ( W.shape = (10000, 5) )
> or
> x * (W.T @ W) * x
>
> (w * x) @ x    weighted sum of squares
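>
> (For the record, a rough sketch with np.dot of why I'd keep those
> parentheses -- W.T @ W is only (5, 5), so grouping it first is much cheaper;
> I use y for the data vector in the weighted sum to avoid reusing x:)
>
>     import numpy as np
>
>     W = np.random.rand(10000, 5)
>     x = np.random.rand(5)            # conformable with W.T @ W
>     w, y = np.random.rand(10000), np.random.rand(10000)
>
>     q = np.dot(x, np.dot(np.dot(W.T, W), x))   # x @ (W.T @ W) @ x
>     wss = np.dot(w * y, y)                     # (w * y) @ y, weighted sum of squares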
>
> Josef
>
> ------------------------------
>
> Message: 2
> Date: Sat, 15 Mar 2014 21:31:22 -0400
> From: josef.pktd at gmail.com
> Subject: Re: [Numpy-discussion] [RFC] should we argue for a matrix
> 	power operator, @@?
> To: Discussion of Numerical Python <numpy-discussion at scipy.org>
>
> On Sat, Mar 15, 2014 at 8:47 PM, Warren Weckesser <
> warren.weckesser at gmail.com> wrote:
>
>> On Sat, Mar 15, 2014 at 8:38 PM, <josef.pktd at gmail.com> wrote:
>>
>>> I think I wouldn't use anything like @@ often enough to remember its
>>> meaning. I'd rather see English names for anything that is not **very**
>>> common.
>>>
>>> I find A@@-1 pretty ugly compared to inv(A)
>>> A@@(-0.5)  might be nice   (do we have matrix_sqrt ?)
>>>
>>
>> scipy.linalg.sqrtm:
>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.sqrtm.html
>>
> maybe a good example: I could never figure that one out
>
> M = sqrtm(A)
>
> A = M @ M
>
> but what we use in stats is
>
> A = R.T @ R
> (eigenvectors dot diag(sqrt of eigenvalues))
>
> which sqrt is A@@(0.5) ?
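>
> (a quick sketch of the two "square roots", assuming A is symmetric positive
> definite -- both reconstruct A, but M is symmetric while R is not:)
>
>     import numpy as np
>     from scipy.linalg import sqrtm
>
>     A = np.cov(np.random.rand(5, 100))        # symmetric positive definite
>
>     M = sqrtm(A)                              # M @ M == A
>     evals, evecs = np.linalg.eigh(A)
>     R = np.sqrt(evals)[:, None] * evecs.T     # diag(sqrt(evals)) @ evecs.T
>
>     print(np.allclose(np.dot(M, M), A))       # True
>     print(np.allclose(np.dot(R.T, R), A))     # True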
>
> Josef
>
>
>
>>
>> Warren
>>
>>
>>
>>> Josef
>>>
>>>
>>>
>>> On Sat, Mar 15, 2014 at 5:11 PM, Stephan Hoyer <shoyer at gmail.com> wrote:
>>>
>>>> Speaking only for myself (and as someone who has regularly used matrix
>>>> powers), I would not expect matrix power as @@ to follow from matrix
>>>> multiplication as @. I do agree that matrix power is the only reasonable
>>>> use for @@ (given @), but it's still not something I would be confident
>>>> enough to know without looking up.
>>>>
>>>> We should keep in mind that each new operator imposes some (small)
>>>> cognitive burden on everyone who encounters them for the first time, and,
>>>> in this case, this will include a large fraction of all Python users,
>>>> whether they do numerical computation or not.
>>>>
>>>> Guido has given us a tremendous gift in the form of @. Let's not insist
>>>> on @@ when it is unclear whether, given the burden of figuring out what @@
>>>> means, it would be worth using even in heavily numeric code. I would certainly
>>>> prefer to encounter norm(A), inv(A), matrix_power(A, n),
>>>> fractional_matrix_power(A, n) and expm(A) rather than their infix
>>>> equivalents. It will certainly not be obvious which of these @@ will
>>>> support for objects from any given library.
>>>>
>>>> One useful data point might be to consider whether matrix power is
>>>> available as an infix operator in other languages commonly used for
>>>> numerical work. AFAICT from some quick searches:
>>>> MATLAB: Yes
>>>> R: No
>>>> IDL: No
>>>>
>>>> All of these languages do, of course, implement infix matrix
>>>> multiplication, but it is apparently not at all clear that an infix matrix
>>>> power is useful.
>>>>
>>>> Best,
>>>> Stephan
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Mar 15, 2014 at 9:03 AM, Olivier Delalleau <shish at keba.be>wrote:
>>>>
>>>>> 2014-03-15 11:18 GMT-04:00 Charles R Harris <charlesr.harris at gmail.com>
>>>>> :
>>>>>
>>>>>
>>>>>>
>>>>>> On Fri, Mar 14, 2014 at 10:32 PM, Nathaniel Smith <njs at pobox.com>wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Here's the second thread for discussion about Guido's concerns about
>>>>>>> PEP 465. The issue here is that PEP 465 as currently written proposes
>>>>>>> two new operators, @ for matrix multiplication and @@ for matrix power
>>>>>>> (analogous to * and **):
>>>>>>>    http://legacy.python.org/dev/peps/pep-0465/
>>>>>>>
>>>>>>> The main thing we care about of course is @; I pushed for including @@
>>>>>>> because I thought it was nicer to have than not, and I thought the
>>>>>>> analogy between * and ** might make the overall package more appealing
>>>>>>> to Guido's aesthetic sense.
>>>>>>>
>>>>>>> It turns out I was wrong :-). Guido is -0 on @@, but willing to be
>>>>>>> swayed if we think it's worth the trouble to make a solid case.
>>>>>>>
>>>>>>> Note that question now is *not*, how will @@ affect the reception of
>>>>>>> @. @ itself is AFAICT a done deal, regardless of what happens with @@.
>>>>>>> For this discussion let's assume @ can be taken for granted, and that
>>>>>>> we can freely choose to either add @@ or not add @@ to the language.
>>>>>>> The question is: which do we think makes Python a better language (for
>>>>>>> us and in general)?
>>>>>>>
>>>>>>> Some thoughts to start us off:
>>>>>>>
>>>>>>> Here are the interesting use cases for @@ that I can think of:
>>>>>>> - 'vector @@ 2' gives the squared Euclidean length (because it's the
>>>>>>> same as vector @ vector). Kind of handy.
>>>>>>> - 'matrix @@ n' of course gives the matrix power, which is of marginal
>>>>>>> use but does come in handy sometimes, e.g., when looking at graph
>>>>>>> connectivity.
>>>>>>> - 'matrix @@ -1' provides a very transparent notation for translating
>>>>>>> textbook formulas (with all their inverses) into code. It's a bit
>>>>>>> unhelpful in practice, because (a) usually you should use solve(), and
>>>>>>> (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But
>>>>>>> sometimes transparent notation may be important. (And in some cases,
>>>>>>> like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be
>>>>>>> compiled into a call to solve() anyway.)
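>>>>>>>
>>>>>>> (For reference, a sketch of the same three cases spelled with today's
>>>>>>> functions:)
>>>>>>>
>>>>>>>     import numpy as np
>>>>>>>
>>>>>>>     vec = np.random.rand(5)
>>>>>>>     np.dot(vec, vec)                  # 'vector @@ 2': squared length
>>>>>>>
>>>>>>>     adj = (np.random.rand(6, 6) > 0.5).astype(int)
>>>>>>>     np.linalg.matrix_power(adj, 3)    # 'matrix @@ n': counts length-3 walks
>>>>>>>
>>>>>>>     A, b = np.random.rand(5, 5), np.random.rand(5)
>>>>>>>     x1 = np.dot(np.linalg.inv(A), b)  # 'matrix @@ -1 @ foo', literally
>>>>>>>     x2 = np.linalg.solve(A, b)        # usually better: no explicit inverse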
>>>>>>>
>>>>>>> (Did I miss any?)
>>>>>>>
>>>>>>> In practice it seems to me that the last use case is the one that
>>>>>>> might matter a lot in practice, but then again, it might not -- I'm not
>>>>>>> sure. For example, does anyone who teaches programming with numpy have
>>>>>>> a feeling about whether the existence of '@@ -1' would make a big
>>>>>>> difference to you and your students? (Alan? I know you were worried
>>>>>>> about losing the .I attribute on matrices if switching to ndarrays for
>>>>>>> teaching -- given that ndarray will probably not get a .I attribute,
>>>>>>> how much would the existence of @@ -1 affect you?)
>>>>>>>
>>>>>>> On a more technical level, Guido is worried about how @@'s precedence
>>>>>>> should work (and this is somewhat related to the other thread about
>>>>>>> @'s precedence and associativity, because he feels that if we end up
>>>>>>> giving @ and * different precedence, then that makes it much less
>>>>>>> clear what to do with @@, and reduces the strength of the */**/@/@@
>>>>>>> analogy). In particular, if we want to argue for @@ then we'll need to
>>>>>>> figure out what expressions like
>>>>>>>     a @@ b @@ c
>>>>>>> and
>>>>>>>     a ** b @@ c
>>>>>>> and
>>>>>>>     a @@ b ** c
>>>>>>> should do.
>>>>>>>
>>>>>>> A related question is what @@ should do if given an array as its right
>>>>>>> argument. In the current PEP, only integers are accepted, which rules
>>>>>>> out a bunch of the more complicated cases like a @@ b @@ c (at least
>>>>>>> assuming @@ is right-associative, like **, and I can't see why you'd
>>>>>>> want anything else). OTOH, in the brave new gufunc world, it
>>>>>>> technically would make sense to define @@ as being a gufunc with
>>>>>>> signature (m,m),()->(m,m), and the way gufuncs work this *would* allow
>>>>>>> the "power" to be an array -- for example, we'd have:
>>>>>>>
>>>>>>>     mat = np.random.randn(m, m)
>>>>>>>     pow = np.arange(n)
>>>>>>>     result = gufunc_matrix_power(mat, pow)  # hypothetical gufunc
>>>>>>>     assert result.shape == (n, m, m)
>>>>>>>     for i in xrange(n):
>>>>>>>         # matrix power, not elementwise (mat ** i would be wrong here)
>>>>>>>         assert np.all(result[i, :, :] == np.linalg.matrix_power(mat, i))
>>>>>>>
>>>>>>> In this case, a @@ b @@ c would at least be a meaningful expression to
>>>>>>> write. OTOH it would be incredibly bizarre and useless, so probably
>>>>>>> no-one would ever write it.
>>>>>>>
>>>>>>> As far as these technical issues go, my guess is that the correct rule
>>>>>>> is that @@ should just have the same precedence and the same (right)
>>>>>>> associativity as **, and in practice no-one will ever write stuff like
>>>>>>> a @@ b @@ c. But if we want to argue for @@ we need to come to some
>>>>>>> consensus or another here.
>>>>>>>
>>>>>>> It's also possible the answer is "ugh, these issues are too
>>>>>>> complicated, we should defer this until later when we have more
>>>>>>> experience with @ and gufuncs and stuff". After all, I doubt anyone
>>>>>>> else will swoop in and steal @@ to mean something else! OTOH, if e.g.
>>>>>>> there's a strong feeling that '@@ -1' will make a big difference in
>>>>>>> pedagogical contexts, then putting that off for years might be a
>>>>>>> mistake.
>>>>>>>
>>>>>>>
>>>>>> I don't have a strong feeling either way on '@@' . Matrix inverses are
>>>>>> pretty common in matrix expressions, but I don't know that the new operator
>>>>>> offers much advantage over a function call. The positive integer powers
>>>>>> might be useful in some domains, as others have pointed out, but
>>>>>> in computational practice one would tend to factor the evaluation.
>>>>>>
>>>>>> Chuck
>>>>>>
>>>>> Personally I think it should go in, because:
>>>>> - it's useful (although marginally), as in the examples previously
>>>>> mentioned
>>>>> - it's what people will expect
>>>>> - it's the only reasonable use of @@ once @ makes it in
>>>>>
>>>>> As far as the details about precedence rules and what not... Yes,
>>>>> someone should think about them and come up with rules that make sense, but
>>>>> since it will pretty much only be used in unambiguous situations, this
>>>>> shouldn't be a blocker.
>>>>>
>>>>> -=- Olivier
>>>>>
>
> ------------------------------
>
>
> End of NumPy-Discussion Digest, Vol 90, Issue 45
> ************************************************
>
