[Python-ideas] iterable.__unpack__ method

Terry Reedy tjreedy at udel.edu
Wed Feb 27 01:44:26 CET 2013


On 2/25/2013 5:25 PM, Alex Stewart wrote:

Negative putdowns are off-topic and not persuasive. I started with 1.3 
in March 1997 and first posted a month later. While cautious about 
changes and additions, I have mostly been in favor of those that have 
been released. I started using 3.0 in beta and started using 3.3.0 with 
the first alpha to get the new stuff.

> On Sunday, February 24, 2013 7:43:58 PM UTC-8, Terry Reedy wrote:
>     The related but distinct concepts of sequential access and random
>     access
>     are basic to computation and were *always* distinct in Python.
>
> Oh really?

Yes really!

 > Then how, prior to the development of the iterator protocol,
> did one define an object which was accessible sequentially but not
> randomly in the Python language?

As I said, by using the original fake-getitem iterator protocol, which 
still works, instead of the newer iter-next iterator protocol.

Take any Python-coded iterator class, such as my lookahead class. Remove 
or comment out the 'def __iter__(self): return self' statement. Change 
the header line 'def __next__(self):' to 'def __getitem__(self, n):'. 
Instances of the revised class will *today* work with for statements. 
Doing this with lookahead (3.3):

 >>> for item in lookahead('abc'): print(item)

a
b
c
 >>> dir(lookahead)
['_NONE', '__bool__', '__class__', '__delattr__', '__dict__', '__dir__', 
'__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', 
'__getitem__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', 
'__module__', '__ne__', '__new__', '__qualname__', '__reduce__', 
'__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', 
'__subclasshook__', '__weakref__', '_set_peek']

See, no __iter__, no __next__, but a __getitem__ that ignores the index 
passed. Here is where it gets a bit crazy.

 >>> s = lookahead('abc')
 >>> s[0]
'a'
 >>> s[0], s[300]
('b', 'c')
 >>> s[0]
Traceback...
StopIteration

Still, I think enabling generators was more of a motivation for the new 
protocol than discouraging nonsense like the above (which, obviously, is 
still possible, but which people did not do intentionally except as a 
demonstration).

A practical difference between sequential and random access is that 
sequential access usually requires history ('state'), which random 
access does not. The __iter__ method and iter() function allow the 
separation of iterators, with their iteration state, from an underlying 
concrete collection (if there is one), which usually does not need the 
iteration state. (Files, which *are* iterators and which do not have a 
separate iterator class, are unusual among builtins.) This enables 
multiple independent stateful iterators and nested for loops like this:

 >>> s = 'abc'
 >>> for c in lookahead(s):
	for d in lookahead(s):
		print((c,d))
		
('a', 'a')
('a', 'b')
('a', 'c')
('b', 'a')
('b', 'b')
('b', 'c')
('c', 'a')
('c', 'b')
('c', 'c')


 > If you couldn't do that, you can't
> claim that the concepts were really distinct in Python, in my opinion.

But I and anyone could and still can, so I can and will make the claim 
and reject arguments based on the false counter-claim.

---
Back to unpack and informing the source as to the number n of items 
expected to be requested.

Case 1. The desired response of the source to n is generic.

Example a: we want the source to simply refuse to produce more than the 
number specified at the start. The current generic solutions are either 
to make exactly the number of explicit next() calls needed or to wrap 
the iterator in islice(iterator, n) (which in turn will make the needed 
number of next() calls).
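For example, islice can cap an infinite source without the source's 
cooperation (a minimal sketch using itertools.count as the source):

```python
import itertools

# islice advances the underlying iterator exactly n times, so the
# residual iterator picks up where the slice left off.
counter = itertools.count()                      # infinite source
first_three = list(itertools.islice(counter, 3))
print(first_three)                               # [0, 1, 2]
print(next(counter))                             # 3 -- source continues
```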

Example b: if the source does yield n items, we want it to then yield 
the residual iterator. Again, a simple generic wrapper does the job.

import itertools

def slice_it(iterable, n):
    "Yield up to n items from iterable and, if successful, the residual iterator."
    it = iter(iterable)
    for i in range(n):
        try:
            yield next(it)
        except StopIteration:  # source exhausted early; since PEP 479,
            return             # StopIteration may not escape a generator
    yield it

a, b, c, rest = slice_it(itertools.count(), 3)
print(a, b, c, rest)
d, e = itertools.islice(rest, 2)
print(d, e)
 >>>
0 1 2 count(3)
3 4

Conclusion: generic iterator behavior modification should be done by 
generic wrappers. This is the philosophy and practice of comprehensions, 
built-in wrappers (enumerate, filter, map, reversed, and zip), and 
itertools.

Case 2. The desired response is specific to a class or even each instance.

Example: object has attributes a, b, c, d (of lesser importance), and 
can calculate e. The pieces and structures it should yield depend on n 
as in the following table.

1  ((a,b),c)
2  (a,b), c
3  a, b, c
4  a, b, c, d
5  a, b, c, d, e

Solution: write a method, call it .unpack(n), that returns an iterator 
that will produce the objects specified in the table. This can be done 
today with no change to Python. It can be done whether or not there is 
an .__iter__ method to produce a generic default iterator for the 
object. And, of course, xxx.unpack can have whatever signature is 
appropriate to xxx. It seems to me that this procedure can handle any 
special collection or structure breakup need.
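A sketch of such an .unpack method, with an invented class and attribute 
values purely for illustration, following the table above:

```python
class Record:
    """Hypothetical object with attributes a, b, c, d and a calculated e."""
    def __init__(self, a, b, c, d):
        self.a, self.b, self.c, self.d = a, b, c, d

    def _e(self):
        return self.a + self.b  # stand-in for the calculated value

    def unpack(self, n):
        "Return an iterator of the pieces and structures appropriate for n."
        a, b, c, d = self.a, self.b, self.c, self.d
        table = {
            1: (((a, b), c),),
            2: ((a, b), c),
            3: (a, b, c),
            4: (a, b, c, d),
            5: (a, b, c, d, self._e()),
        }
        return iter(table[n])

r = Record(1, 2, 3, 4)
x, y, z = r.unpack(3)        # ordinary unpacking, no new syntax needed
pair, third = r.unpack(2)
print(x, y, z, pair, third)  # 1 2 3 (1, 2) 3
```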

Comment 1: if people can pass explicit counts to islice (or slice_it), 
they can pass explicit counts to .unpack.

Comment 2: we are not disagreeing that people might want to do custom 
count-dependent disassembly or that they should be able to do so. It can 
already be done.

Areas of disagreement:

1. consumer-source interdependency: you seem to think there is something 
special about the consumer assigning items to multiple targets in one 
statement, as opposed to doing anything else, including doing the 
multiple assignments in multiple statements. I do not. Moreover, I so 
far consider introducing such dependency in the core to be a regression.

2. general usefulness: you want .unpack to be standardized and made a 
special method. I think it is inherently variable enough, and the need 
rare enough, to not justify that.

I retrieved your original post and plan to look at it sometime to see if 
I missed anything not covered by the above. But enough for today.

-- 
Terry Jan Reedy



