[Python-ideas] Fwd: Fwd: unpacking generalisations for list comprehension

Steven D'Aprano steve at pearwood.info
Fri Oct 14 22:38:08 EDT 2016


On Thu, Oct 13, 2016 at 05:30:49PM -0400, Random832 wrote:

> Frankly, I don't see why the pattern isn't obvious

*shrug*

Maybe your inability to look past your assumptions and see things from 
other people's perspective is just as much a blind spot as our inability 
to see why you think the pattern is obvious. We're *all* having 
difficulty in seeing things from the other side's perspective here.

Let me put it this way: as far as I am concerned, sequence unpacking is 
equivalent to manually replacing the sequence with its items:

    t = (1, 2, 3)
    [100, 200, *t, 300]

is equivalent to replacing "*t" with "1, 2, 3", which gives us:

    [100, 200, 1, 2, 3, 300]

That's nice, simple, it makes sense, and it works in Python 3.5 and later 
(PEP 448). It applies to function calls and assignments:

    func(100, 200, *t)  # like func(100, 200, 1, 2, 3)

    a, b, c, d, e = 100, 200, *t  # like a, b, c, d, e = 100, 200, 1, 2, 3

although it doesn't apply when the star is on the left-hand side:

    a, b, *x, e = 1, 2, 3, 4, 5, 6, 7

That requires a different model for starred names, but *that* model is 
similar to its role in function parameters: def f(*args). But I digress.
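
For anyone who wants to check the two models side by side, here is a minimal sketch. It assumes Python 3.5 or later, and the body of func() is a hypothetical stand-in that just returns whatever it receives:

```python
t = (1, 2, 3)

# Right-hand side: *t behaves as if its items were written out in place.
display = [100, 200, *t, 300]

def func(*args):
    # Hypothetical stand-in for func(); it only reports what it was given.
    return args

call_args = func(100, 200, *t)

a, b, c, d, e = 100, 200, *t
assigned = (a, b, c, d, e)

# Left-hand side: the starred name collects the leftover items into a list,
# much as *args collects extra positional arguments into a tuple.
a, b, *x, e = 1, 2, 3, 4, 5, 6, 7

print(display)    # [100, 200, 1, 2, 3, 300]
print(call_args)  # (100, 200, 1, 2, 3)
print(assigned)   # (100, 200, 1, 2, 3)
print(x, e)       # [3, 4, 5, 6] 7
```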

Now let's apply that same model of "starred expression == expand the 
sequence in place" to a list comp:

    iterable = [t]
    [*t for t in iterable]

If you do the same manual replacement, you get:

    [1, 2, 3 for t in iterable]

which isn't legal since it looks like a list display [1, 2, ...] 
containing invalid syntax. The only way to have this make sense is to 
use parentheses:

    [(1, 2, 3) for t in iterable]

which turns [*t for t in iterable] into a no-op.
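
A small sketch of that reading, using only syntax that is already legal in Python 3.5 and later. The name `rebuilt` and the `(*t,)` spelling are my own illustration of "unpack t, then wrap the items back up", not something the proposal specifies:

```python
t = (1, 2, 3)
iterable = [t]

# Substituting the items by hand only parses with parentheses around them...
substituted = [(1, 2, 3) for t in iterable]

# ...and that is the same list the plain, unstarred comprehension builds.
plain = [t for t in iterable]

# (*t,) is one legal way to write "unpack t into a new tuple".
rebuilt = [(*t,) for t in iterable]

print(substituted == plain == rebuilt)  # True
```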

Why should the OP's complicated, hard to understand (to many of us) 
interpretation take precedence over the simple, obvious, easy to 
understand model of sequence unpacking that I describe here?

That's not a rhetorical question. If you have a good answer, please 
share it. But I strongly believe that on the evidence of this thread, 

    [a, b, *t, d]

is easy to explain, teach and understand, while:

    [*t for t in iterable]

will be confusing, hard to teach and understand except as "magic syntax" 
-- it works because the interpreter says it works, not because it 
follows from the rules of sequence unpacking or comprehensions. It might 
as well be spelled:

    [ MAGIC!!!! HAPPENS!!!! HERE!!!! t for t in iterable]

except it is shorter.

Of course, ultimately all syntax is "magic"; it all needs to be learned. 
There's nothing about + that inherently means plus. But we should 
strongly prefer to avoid overloading the same symbol with distinct 
meanings, and * is one of the most heavily overloaded symbols in Python:

- multiplication and exponentiation
- wildcard imports
- globs, regexes
- collect arguments and kwargs
- sequence unpacking
- collect unused elements from a sequence

and maybe more. This will add yet another special meaning:

- expand the comprehension ("extend instead of append").
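
To make "extend instead of append" concrete, here is a rough sketch of the two behaviours written as explicit loops, plus the nested comprehension that already gives the flattened result today. The sample data and variable names are illustrative assumptions:

```python
iterable = [(1, 2, 3), (4, 5)]

# What an ordinary comprehension does: append each element as-is.
appended = []
for t in iterable:
    appended.append(t)

# What the proposed starred form is meant to do: extend with each element's items.
extended = []
for t in iterable:
    extended.extend(t)

# The same flattened result, spelled with syntax that already exists.
nested = [item for t in iterable for item in t]

print(appended)  # [(1, 2, 3), (4, 5)]
print(extended)  # [1, 2, 3, 4, 5]
print(nested)    # [1, 2, 3, 4, 5]
```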

If we're going to get this (possibly useful?) functionality, I'd rather 
see an explicit flatten() builtin, or see it spelled:

    [from t for t in sequence]

which at least is *obviously* something magical, than yet another magic 
meaning to the star operator. It's easy to look it up in the docs or 
google for it, and doesn't look like Perlish line noise.
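
For what it's worth, an explicit flatten() is only a couple of lines on top of itertools. This is a sketch of a hypothetical helper, not an existing builtin, and it assumes one level of flattening is what's wanted:

```python
from itertools import chain

def flatten(iterable_of_iterables):
    """Hypothetical helper: lazily yield the items of each inner iterable in turn.

    Flattens exactly one level; nested inner iterables are left untouched.
    """
    return chain.from_iterable(iterable_of_iterables)

sequence = [(1, 2, 3), (4, 5), ()]
flat = list(flatten(sequence))

print(flat)  # [1, 2, 3, 4, 5]
```

Because it returns an iterator, the caller decides whether to build a list, a set, or just loop over it.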



-- 
Steve

