Title: Extended Iterable Unpacking
Last-Modified: 2013-01-11 19:08:14 +0100 (Fri, 11 Jan 2013)
Author: Georg Brandl <georg at python.org>
This PEP proposes a change to iterable unpacking syntax that allows a "catch-all" name to be specified, which is assigned a list of all items not assigned to a "regular" name.
An example says more than a thousand words:
>>> a, *b, c = range(5)
>>> a
0
>>> c
4
>>> b
[1, 2, 3]
Many algorithms require splitting a sequence into a "first, rest" pair. With the new syntax,
first, rest = seq[0], seq[1:]
is replaced by the cleaner and probably more efficient:
first, *rest = seq
For more complex unpacking patterns, the new syntax looks even cleaner, and the clumsy index handling is not necessary anymore.
Also, if the right-hand value is not a list but some other iterable, it has to be converted to a list before it can be sliced; to avoid creating this temporary list, one has to resort to
it = iter(seq)
first = next(it)
rest = list(it)
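As a minimal sketch of the difference, the following compares the manual iterator approach with extended unpacking on a non-list iterable. The generator `countdown` is a hypothetical helper introduced only for this illustration.

```python
def countdown(n):
    """Hypothetical generator used only for illustration."""
    while n > 0:
        yield n
        n -= 1

# Manual approach: pull the first item, then collect the rest.
it = iter(countdown(4))
first_manual = next(it)
rest_manual = list(it)

# Extended unpacking: the same result in one statement.
first, *rest = countdown(4)

print(first, rest)  # 4 [3, 2, 1]
```

Both forms consume the generator exactly once and never build a temporary list just to slice it.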
A tuple (or list) on the left side of a simple assignment (unpacking is not defined for augmented assignment) may contain at most one expression prepended with a single asterisk (which is henceforth called a "starred" expression, while the other expressions in the list are called "mandatory"). This designates a subexpression that will be assigned a list of all items from the iterable being unpacked that are not assigned to any of the mandatory expressions, or an empty list if there are no such items.
For example, if seq is a sliceable sequence, all the following assignments are equivalent if seq has at least three elements:
a, b, c = seq[0], list(seq[1:-1]), seq[-1]
a, *b, c = seq
[a, *b, c] = seq
It is an error (as it is currently) if the iterable doesn't contain enough items to assign to all the mandatory expressions.
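The two boundary cases described above can be sketched as follows. The lists used here are arbitrary sample data, and the exact wording of the error message is left unspecified because it may differ between Python versions.

```python
# Exactly enough items for the mandatory targets: the starred target is empty.
a, *b, c = [1, 2]
print(a, b, c)  # 1 [] 2

# Too few items for the mandatory targets: unpacking fails.
try:
    a, *b, c = [1]
except ValueError as exc:
    error = exc  # keep a reference; 'exc' is unbound after the except block
    print("ValueError:", exc)
```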
It is also an error to use the starred expression as a lone assignment target, as in
*a = range(5)
This, however, is valid syntax:
*a, = range(5)
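The contrast between the two forms can be checked with a short sketch. It uses the built-in `compile()` so that the invalid statement is rejected as data rather than breaking the example itself; the variable name `lone_star_ok` is introduced here for illustration.

```python
# A lone starred target is rejected at compile time.
try:
    compile("*a = range(5)", "<example>", "exec")
    lone_star_ok = True
except SyntaxError:
    lone_star_ok = False

# With a trailing comma the target is a one-element tuple, which is allowed.
*a, = range(5)
print(lone_star_ok, a)  # False [0, 1, 2, 3, 4]
```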
Note that this proposal also applies to tuples in implicit assignment context, such as in a for statement:
for a, *b in [(1, 2, 3), (4, 5, 6, 7)]:
    print(b)
would print out
[2, 3]
[5, 6, 7]
Starred expressions are only allowed as assignment targets; using them anywhere else (except for star-args in function calls, of course) is an error.
This feature requires a new grammar rule:
star_expr: ['*'] expr
In these two rules, expr is changed to star_expr:
comparison: star_expr (comp_op star_expr)*
exprlist: star_expr (',' star_expr)* [',']
A new ASDL expression type Starred is added which represents a starred expression. Note that the starred expression element introduced here is universal and could later be used for other purposes in non-assignment context, such as the yield *iterable proposal.
The compiler is changed to recognize all cases where a starred expression is invalid and flag them with syntax errors.
A new bytecode instruction, UNPACK_EX, is added, whose argument has the number of mandatory targets before the starred target in the lower 8 bits and the number of mandatory targets after the starred target in the upper 8 bits. For unpacking sequences without starred expressions, the old UNPACK_SEQUENCE opcode is kept.
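The argument encoding can be inspected with the standard `dis` module. This is a sketch that assumes a CPython interpreter in which the instruction is still named `UNPACK_EX`; bytecode is an implementation detail and may differ in other implementations or versions.

```python
import dis

# Compile an assignment with one mandatory target before and after the star.
code = compile("a, *b, c = seq", "<example>", "exec")
unpack = next(i for i in dis.get_instructions(code) if i.opname == "UNPACK_EX")

before = unpack.arg & 0xFF  # lower 8 bits: targets before the starred one
after = unpack.arg >> 8     # upper bits: targets after the starred one
print(before, after)
```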
The function unpack_iterable() in ceval.c is changed to handle the extended unpacking, via an argcntafter parameter. In the UNPACK_EX case, the function will do the following:
- collect all items for mandatory targets before the starred one
- collect all remaining items from the iterable in a list
- pop items for mandatory targets after the starred one from the list
- push the single items and the resized list on the stack
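The steps above can be modelled in pure Python. This is only an illustrative sketch, not the C code in ceval.c: the function name `unpack_ex` and its return convention (a tuple instead of stack pushes) are assumptions made for the example.

```python
def unpack_ex(iterable, before, after):
    """Hypothetical pure-Python model of the UNPACK_EX steps."""
    it = iter(iterable)
    # Step 1: collect items for the mandatory targets before the star.
    head = []
    for _ in range(before):
        try:
            head.append(next(it))
        except StopIteration:
            raise ValueError("not enough values to unpack") from None
    # Step 2: collect all remaining items in a list.
    rest = list(it)
    # Step 3: take the trailing mandatory items off the end of the list.
    if len(rest) < after:
        raise ValueError("not enough values to unpack")
    split = len(rest) - after
    tail = rest[split:]
    del rest[split:]  # resize the list in place
    # Step 4: return the single items and the resized list.
    return (*head, rest, *tail)

print(unpack_ex(range(5), 1, 1))  # (0, [1, 2, 3], 4)
```

Because the trailing items are only known once the iterable is exhausted, the whole remainder is collected first and then trimmed from the end.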
Shortcuts for unpacking iterables of known types, such as lists or tuples, can be added.
The current implementation can be found at the SourceForge Patch tracker [SFPATCH]. It now includes a minimal test case.
After a short discussion on the python-3000 list, the PEP was accepted by Guido in its current form. Possible changes discussed were:
- Only allow a starred expression as the last item in the exprlist. This would simplify the unpacking code a bit and allow for the starred expression to be assigned an iterator. This behavior was rejected because it would be too surprising.
- Try to give the starred target the same type as the source iterable, for example, b in a, *b = 'hello' would be assigned the string 'ello'. This may seem nice, but is impossible to get right consistently with all iterables.
- Make the starred target a tuple instead of a list. This would be consistent with a function's *args, but make further processing of the result harder.
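The accepted behavior, in which the starred target is always a list regardless of the source type, can be seen in a short sketch. The variable names are arbitrary, and the explicit conversion shown is just one way a caller might recover the source type.

```python
a, *b = "hello"
print(a, b)  # h ['e', 'l', 'l', 'o']

# The caller can still convert to the desired type explicitly.
rest = "".join(b)
print(rest)  # ello

first, *others = (1, 2, 3)
print(type(others).__name__)  # list
```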
This document has been placed in the public domain.