[Python-3000] PEP 3132: Extended Iterable Unpacking

Mon May 7 21:24:28 CEST 2007

On 5/7/07, Daniel Stutzbach <daniel at stutzbachenterprises.com> wrote:
> On 5/7/07, Guido van Rossum <guido at python.org> wrote:
> > And what do you return when it doesn't support the container protocol?
>
> Assign the iterator object with the remaining items to d.
>
> > Think about the use cases. It seems that *your* use case is some kind
> > of (car, cdr) splitting known from Lisp and from functional languages
> > (Haskell is built out of this idiom it seems from the examples). But
> > in Python, if you want to loop over one of those things, you ought to
> > use a for-loop; and if you really want a car/cdr split, explicitly
> > using the syntax you show above (x[0], x[1:]) is fine.
>
> The use case I'm thinking of is this:
>
> A container type or an iterable where the first few entries contain
> one type of information, and the rest of the entries are something
> that will either be discarded or run through a for-loop.
>
> I encounter this frequently when reading text files where the first
> few lines are some kind of header with a known format and the rest of
> the file is data.

This sounds like a parsing problem. IMO it's better to treat it as such.
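
As a rough illustration of treating it as parsing (a sketch only, not
code from this thread): the two-line "key: value" header format and the
StringIO stand-in for a real file are assumptions made up for the example.

```python
import io
from itertools import islice

# Stand-in for a real file object; the header layout is hypothetical.
source = io.StringIO("name: demo\nunits: mm\n1.0\n2.5\n4.0\n")

# Parse the fixed-size header explicitly. islice() works on any iterable,
# including file objects that do not support slicing.
header = {}
for line in islice(source, 2):
    key, value = line.rstrip("\n").split(": ", 1)
    header[key] = value

# The same iterator now yields only the remaining lines,
# so an ordinary for-loop handles the data section.
data = []
for line in source:
    data.append(float(line))
```

The header is consumed by position and the data by iteration, so no
unpacking syntax is needed for this case.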

> > The important use case in Python for the proposed semantics is when
> > you have a variable-length record, the first few items of which are
> > interesting, and the rest of which is less so, but not unimportant.
>
> > (If you wanted to throw the rest away, you'd just write a, b, c =
> > x[:3] instead of a, b, c, *d = x.)
>
> That doesn't work if x is an iterable that doesn't support getslice
> (such as a file object).
>
> > It is much more convenient for this
> > use case if the type of d is fixed by the operation, so you can count
> > on its behavior.
>
> > There's a bug in the design of filter() in Python 2 (which will be
> > fixed in 3.0 by turning it into an iterator BTW): if the input is a
> > tuple, the output is a tuple too, but if the input is a list *or
> > anything else*, the output is a list.  That's a totally insane
> > signature, since it means that you can't count on the result being a
> > list, *nor* on it being a tuple -- if you need it to be one or the
> > other, you have to convert it to one, which is a waste of time and
> > space. Please let's not repeat this design bug.
>
> I agree that's broken, because it carves out a weird exception for
> tuples.  I disagree that it's analogous because I'm not suggesting
> carving out an exception.
>
> I'm suggesting that:
>
> - lists return lists
> - tuples return tuples
> - XYZ containers return XYZ containers
> - non-container iterables return iterators.
>
> It's a consistent rule, albeit a different consistent rule than always
> returning the same type.

But I expect less useful. It won't support "a, *b, c = <something>"
either. From an implementation POV, if you have an unknown object on
the RHS, you have to try slicing it before you try iterating over it;
this may cause problems e.g. if the object happens to be a defaultdict
-- since x[3:] is implemented as x[slice(3, None, None)], the
defaultdict will give you its default value. I'd much rather define
this in terms of iterating over the object until it is exhausted,
which can be optimized for certain known types like lists and tuples.
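
A small sketch of the two rules side by side may help. The helper name
split_preserving is hypothetical and only approximates the "slice first,
fall back to the iterator" rule quoted above. The starred assignments show
the iterate-until-exhausted rule as Python 3 eventually implemented it,
where the starred target is always a list. What the defaultdict probe does
depends on the Python version, so the code makes no assumption about it.

```python
from collections import defaultdict
from itertools import islice

def split_preserving(x, n):
    """Hypothetical helper for the type-preserving rule: try slicing first."""
    try:
        # Sliceable containers keep their own type for both parts.
        return x[:n], x[n:]
    except TypeError:
        # Non-sliceable iterables hand back the partially consumed iterator.
        it = iter(x)
        return list(islice(it, n)), it

# Type-preserving rule: the type of the rest depends on the input.
_, rest_list = split_preserving([1, 2, 3, 4], 2)         # a list
_, rest_tuple = split_preserving((1, 2, 3, 4), 2)        # a tuple
_, rest_iter = split_preserving(iter([1, 2, 3, 4]), 2)   # an iterator

# Iteration rule: the right-hand side is iterated until exhausted and the
# leftover items always land in a list, whatever the input type was.
first, *rest = (1, 2, 3, 4)
head, *middle, tail = iter("abcde")   # a starred name in the middle works too

# Probing an unknown object by slicing is risky for mappings: depending on
# the Python version this raises TypeError (unhashable slice) or quietly
# returns the default value. Neither outcome is "the remaining items".
counts = defaultdict(int)
try:
    probe = counts[3:]
except TypeError:
    probe = None
```

With the iteration rule the caller can rely on list methods for the
starred name without first checking what kind of object was unpacked.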

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)