[Python-Dev] accumulator display syntax

Guido van Rossum guido at python.org
Fri Oct 17 17:45:43 EDT 2003


[Guido]
> > Let's look for an in-line generator notation instead.  I like
> >
> >   sum((yield x for x in S))

[Alex]
> So do I, _with_ the mandatory extra parentheses and all, and in
> fact I think it might be even clearer with the extra colon that Phil
> had mentioned, i.e.
> 
>     sum((yield: x for x in S))
> 
> > but perhaps we can make this work:
> >
> >   sum(x for x in S)
> 
> Perhaps the parser can be coerced to make this work, but the
> mandatory parentheses, the yield keyword, and possibly the colon, 
> too, may all help, it seems to me, in making this syntax stand
> out more.

Hm.  I'm not sure that it *should* stand out more.  The version with
the yield keyword and the colon draws undue attention to the
mechanism.  I bet that if you showed

  sum(x for x in range(10))

to a newbie they'd have no problem understanding it (their biggest
problem would be that range(10) is [0, 1, ..., 9] rather than [1, 2,
..., 10]) but if you showed them

  sum((yield: x for x in S))

they would probably scratch their heads.

I also note that if it weren't for list comprehensions, the form

  <expr> for <vars> in <expr>

would pose absolutely no problems to the parser, since it's just a ternary
operator (though the same is true for the infamous

  <expr> if <test> else <expr>

:-).

List comprehensions make this a bit difficult because they use the
same form in a specific context for something different; at the very
best this would mean that

  [x for x in S]

and

  [(x for x in S)]

are completely different beasts: the first would be equivalent to

  list(S)

while the second would be equivalent to

  [iter(S)]

i.e. a list whose only element is an iterator over S (not a very
useful thing to have, except perhaps if you had a function taking a
list of iterators as an argument).
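
A minimal sketch of that distinction, assuming the parenthesized form
behaves like the generator expressions Python later adopted.  S is
made-up sample data, and the names as_list, wrapped and drained are
illustrative only:

```python
S = [3, 1, 2]  # hypothetical sample data

# List comprehension: builds a new list, same contents as list(S).
as_list = [x for x in S]

# Parenthesized form inside a list display: a one-element list whose
# single element is an iterator over S.
wrapped = [(x for x in S)]

assert as_list == list(S)
assert len(wrapped) == 1

# Draining the single iterator yields the items of S in order.
drained = list(wrapped[0])
```

The inner parentheses are the only difference between the two
spellings, which is why the two are easy to confuse.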

> Yes, some uses may "read" more naturally with as
> little extras as feasible, notably [examples that might be better
> done with list comprehensions except for _looks_...]:
> 
> even_digits = Set(x for x in range(0, 10) if x%2==0)
> 
> versus
> 
> even_digits = Set((yield: x for x in range(0, 10) if x%2==0))
> 
> but that may be because the former notation leads back to
> the "set comprehensions" that list comprehensions were
> originally derived from.  I don't think it's that clear in other
> cases which have nothing to do with sets, such as, e.g.,
> Peter Norvig's original examples of "accumulator displays".

Let's go over the examples from http://www.norvig.com/pyacc.html :

    [Sum: x*x for x in numbers]
    sum(x*x for x in numbers)

    [Product: Prob_spam(word) for word in email_msg]
    product(Prob_spam(word) for word in email_msg)

    [Min: temp(hour) for hour in range(24)]
    min(temp(hour) for hour in range(24))

    [Mean: f(x) for x in data]
    mean(f(x) for x in data)

    [Median: f(x) for x in data]
    median(f(x) for x in data)

    [Mode: f(x) for x in data]
    mode(f(x) for x in data)

So far, these can all be written as simple functions that take an
iterable argument, and they look as good with an iterator
comprehension as with a list argument.
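
As a rough sketch of what "simple functions that take an iterable
argument" could look like: product() and mean() below are hypothetical
helpers written for illustration, not existing builtins, and numbers
is made-up data.

```python
from functools import reduce
import operator


def product(iterable):
    """Multiply all items together; returns 1 for an empty iterable."""
    return reduce(operator.mul, iterable, 1)


def mean(iterable):
    """Arithmetic mean, consuming the iterable in a single pass."""
    total = 0
    count = 0
    for value in iterable:
        total += value
        count += 1
    if count == 0:
        raise ValueError("mean() of an empty iterable")
    return total / count


numbers = [1, 2, 3, 4]  # hypothetical sample data

sum_of_squares = sum(x * x for x in numbers)
prod = product(x + 1 for x in numbers)
avg = mean(x * 2 for x in numbers)
```

Neither helper calls len() or indexes its argument, so each accepts a
list and an iterator equally well.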

    [SortBy: abs(x) for x in (-2, -4, 3, 1)]

This one is a little less obvious, because it requires the feature
from Norvig's PEP that if add() takes a second argument, the unadorned
loop control variable is passed in that position.  It could be done
with this:

    sortby((abs(x), x) for x in (-2, -4, 3, 1))

but I think that Raymond's code in CVS is just as good. :-)
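
One possible reading of such a sortby(), sketched under the assumption
that it receives (key, value) pairs and returns the values ordered by
key.  The function is hypothetical; the comparison with the built-in
sorted() and its key argument uses current Python and is not meant to
reproduce the code in CVS.

```python
def sortby(pairs):
    """Hypothetical helper: order the values of (key, value) pairs by key."""
    # Sort on the key only, so the values themselves are never compared.
    decorated = sorted(pairs, key=lambda pair: pair[0])
    return [value for _key, value in decorated]


data = (-2, -4, 3, 1)  # the tuple from Norvig's SortBy example

by_abs = sortby((abs(x), x) for x in data)

# The same ordering using a key function directly.
with_key = sorted(data, key=abs)
```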

Norvig's Top poses no problem:

    top(humor(joke) for joke in jokes)

In conclusion, I think this syntax is pretty cool.  (It will probably
die the same death as the ternary expression though.)

> And as soon as you consider the notation being used in
> any situation EXCEPT as the ONLY argument in a call...:

Who said that?  I fully intended it to be an expression, acceptable
everywhere, though possibly requiring parentheses to avoid ambiguities
(in list comprehensions) or excessive ugliness (e.g. to the right of
'in' or 'yield').

> foo(x, y for y in glab for x in blag)
> 
> yes, I know this passes ONE x and one iterator, because
> to pass one iterator of pairs one would have to write
> 
> foo((x, y) for y in glab for x in blag)
> 
> but the distinction between the two seems quite error
> prone to me.

It would require extra parentheses here:

  foo(x, (y for y in glab for x in blag))
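
A small sketch of how this plays out in current Python, where the
unparenthesized form is rejected when it is not the sole argument.
foo, glab and blag are placeholder definitions made up here so the
example runs.

```python
def foo(first, second):
    """Placeholder: return the first argument and the drained iterator."""
    return first, list(second)


glab = [1, 2]        # hypothetical data
blag = ["a", "b"]    # hypothetical data
x = 0

# Parenthesized: unambiguously two arguments, a value and an iterator.
result = foo(x, (y for y in glab for x in blag))

# Unparenthesized: current Python refuses to compile this call.
try:
    compile("foo(x, y for y in glab for x in blag)", "<sketch>", "eval")
    rejected = False
except SyntaxError:
    rejected = True
```

The x inside the parenthesized expression is the loop variable of the
second for clause, not the x passed as the first argument.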

> BTW, semantically, it WOULD be OK for
> these iterator comprehension to NOT "leak" their
> control variables to the surrounding scope, right...?

Yes.  I think list comprehensions shouldn't do this either; it's
just a pain to introduce a new scope.  Maybe such control variables
should simply be renamed to "impossible" names like the names used for
the anonymous first argument to f below:

  def f((a, b), c): ...
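
For comparison, a sketch of the non-leaking behaviour as it exists in
Python 3, where neither form binds its control variable in the
surrounding scope.  S and the variable names are made up for the
check.

```python
S = [1, 2, 3]  # hypothetical sample data

total = sum(gen_var for gen_var in S)
squares = [comp_var * comp_var for comp_var in S]

# Probe the surrounding scope for the two control variables.
leaked = []
for probe in ("gen_var", "comp_var"):
    try:
        eval(probe)
        leaked.append(probe)
    except NameError:
        pass  # the name is not visible here, so it did not leak
```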

> I
> do consider the fact that list comprehensions "leak" that
> way a misfeature, and keep waiting for some fanatic of
> assignment-as-expression to use it IN EARNEST, e.g.,
> to code his or her desired "while c=beep(): boop(c)", use
> 
> while [c for c in [beep()] if c]:
>     boop(c)
> 
> ...:-).

Yuck.  Fortunately that would be quite slow, and the same fanatics
usually don't like that. :-)

> Anyway, back to the subject, those calls to foo seem
> very error-prone, while:
> 
> foo(x, (yield: y for y in glab for x in blag))
> 
> (mandatory extra parentheses, 'yield', and colon) seems
> far less likely to cause any such error.

I could live with the extra parentheses.  Then we get:

  (x for x in S)             # iter(S)

  [x for x in S]             # list(S)

--Guido van Rossum (home page: http://www.python.org/~guido/)


