[Python-ideas] iterable.__unpack__ method
Terry Reedy
tjreedy at udel.edu
Wed Feb 27 01:44:26 CET 2013
On 2/25/2013 5:25 PM, Alex Stewart wrote:
Negative putdowns are off-topic and not persuasive. I started with 1.3
in March 1997 and first posted a month later. While cautious about
changes and additions, I have mostly been in favor of those that have
been released. I started using 3.0 in beta and began using 3.3.0 with
the first alpha to get the new stuff.
> On Sunday, February 24, 2013 7:43:58 PM UTC-8, Terry Reedy wrote:
> The related but distinct concepts of sequential access and random
> access
> are basic to computation and were *always* distinct in Python.
>
> Oh really?
Yes really!
> Then how, prior to the development of the iterator protocol,
> did one define an object which was accessible sequentially but not
> randomly in the Python language?
As I said, by using the original fake-getitem iterator protocol, which
still works, instead of the newer iter-next iterator protocol.
Take any Python-coded iterator class, such as my lookahead class. Remove
or comment out the 'def __iter__(self): return self' statement. Change
the header line 'def __next__(self):' to 'def __getitem__(self, n):'.
Instances of the revised class will *today* work with for statements.
Doing this with lookahead (3.3):
>>> for item in lookahead('abc'): print(item)
a
b
c
>>> dir(lookahead)
['_NONE', '__bool__', '__class__', '__delattr__', '__dict__', '__dir__',
'__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
'__getitem__', '__gt__', '__hash__', '__init__', '__le__', '__lt__',
'__module__', '__ne__', '__new__', '__qualname__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__', '__weakref__', '_set_peek']
See, no __iter__, no __next__, but a __getitem__ that ignores the index
passed. Here is where it gets a bit crazy.
>>> s = lookahead('abc')
>>> s[0]
'a'
>>> s[0], s[300]
('b', 'c')
>>> s[0]
Traceback...
StopIteration
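For readers without the lookahead class at hand, here is a self-contained
sketch of the same trick; the class name and internal details are my
illustrative assumptions, not from lookahead itself:

```python
class SequentialOnly:
    """Old-protocol 'iterator': a __getitem__ that ignores its index.
    No __iter__, no __next__; a for loop still works, because the
    fallback protocol calls __getitem__(0), __getitem__(1), ... until
    IndexError (or StopIteration) is raised."""
    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def __getitem__(self, index):   # index is ignored, as in lookahead
        if self._pos >= len(self._items):
            raise IndexError        # the old protocol's end-of-iteration signal
        item = self._items[self._pos]
        self._pos += 1
        return item

print([item for item in SequentialOnly('abc')])   # ['a', 'b', 'c']
```

Note that the instance is exhausted after one pass, exactly the
sequential-but-not-random behavior under discussion.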
Still, I think enabling generators was more of a motivation than
discouraging nonsense like the above (which, obviously, is still
possible, but which people did not do intentionally except as a
demonstration).
A practical difference between sequential and random access is that
sequential access usually requires history ('state'), which random
access does not. The __iter__ method and iter() function allow the
separation of iterators, with their iteration state, from an underlying
concrete collection (if there is one), which usually does not need the
iteration state. (Files, which *are* iterators and which do not have a
separate iterator class, are unusual among builtins.) This enables
multiple stateful iterators and nested for loops like this:
>>> s = 'abc'
>>> for c in lookahead(s):
        for d in lookahead(s):
            print((c,d))
('a', 'a')
('a', 'b')
('a', 'c')
('b', 'a')
('b', 'b')
('b', 'c')
('c', 'a')
('c', 'b')
('c', 'c')
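The same separation of iteration state from collection can be seen
directly with iter() and a builtin, no class of mine required:

```python
s = 'abc'
it1, it2 = iter(s), iter(s)   # two independent iterators over one string
print(next(it1), next(it1))   # a b
print(next(it2))              # a   -- it2 keeps its own position
```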
> If you couldn't do that, you can't
> claim that the concepts were really distinct in Python, in my opinion.
But I and anyone could and still can, so I can and will make the claim
and reject arguments based on the false counter-claim.
---
Back to unpack and informing the source as to the number n of items
expected to be requested.
Case 1. The desired response of the source to n is generic.
Example a: we want the source to simply refuse to produce more than the
number specified at the start. The current generic solutions are either
to make exactly the number of explicit next() calls needed or to wrap
the iterator in islice(iterator, n) (which in turn makes the needed
next() calls).
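Example a in action, assuming nothing beyond itertools:

```python
import itertools

it = iter(range(10))
limited = itertools.islice(it, 3)   # refuses to produce more than 3 items
print(list(limited))                # [0, 1, 2]
print(next(it))                     # 3 -- the underlying iterator resumes here
```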
Example b: if the source does yield n items, we want it to then yield
the residual iterator. Again, a simple generic wrapper does the job.
import itertools

def slice_it(iterable, n):
    "Yield up to n items from iterable and, if successful, the residual iterator."
    it = iter(iterable)
    for i in range(n):
        try:
            yield next(it)
        except StopIteration:  # source exhausted early; no residual iterator
            return             # (under PEP 479, a bare StopIteration escaping
    yield it                   #  a generator would raise RuntimeError)

a, b, c, rest = slice_it(itertools.count(), 3)
print(a, b, c, rest)
d, e = itertools.islice(rest, 2)
print(d, e)

>>>
0 1 2 count(3)
3 4
Conclusion: generic iterator behavior modification should be done by
generic wrappers. This is the philosophy and practice of comprehensions,
built-in wrappers (enumerate, filter, map, reversed, and zip), and
itertools.
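The philosophy in action with the built-in wrappers named above: generic
behavior modification by composing wrappers, not by changing the
underlying iterator.

```python
# Compose two generic wrappers around an ordinary iterator:
# filter keeps the even numbers, enumerate numbers the survivors.
source = iter(range(10))
wrapped = enumerate(filter(lambda x: x % 2 == 0, source))
print(list(wrapped))   # [(0, 0), (1, 2), (2, 4), (3, 6), (4, 8)]
```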
Case 2. The desired response is specific to a class or even each instance.
Example: object has attributes a, b, c, d (of lesser importance), and
can calculate e. The pieces and structures it should yield depend on n
as in the following table.
n  yields
1  ((a,b),c)
2  (a,b), c
3  a, b, c
4  a, b, c, d
5  a, b, c, d, e
Solution: write a method, call it .unpack(n), that returns an iterator
that will produce the objects specified in the table. This can be done
today with no change to Python. It can be done whether or not there is a
.__iter__ method to produce a generic default iterator for the object.
And, of course, xxx.unpack can have whatever signature is appropriate to
xxx. It seems to me that this procedure can handle any special
collection or structure breakup need.
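A sketch of such an .unpack method for the table above; the class name,
the attribute values, and the way e is calculated are all my illustrative
assumptions:

```python
class Record:
    "Hypothetical object with attributes a, b, c, d and a computable e."
    def __init__(self, a, b, c, d):
        self.a, self.b, self.c, self.d = a, b, c, d

    def calc_e(self):
        # stand-in for whatever calculation produces e
        return self.a + self.b + self.c + self.d

    def unpack(self, n):
        "Return an iterator of the pieces appropriate for n targets."
        a, b, c, d = self.a, self.b, self.c, self.d
        if n == 1:
            return iter([((a, b), c)])
        if n == 2:
            return iter([(a, b), c])
        if n == 3:
            return iter([a, b, c])
        if n == 4:
            return iter([a, b, c, d])
        if n == 5:
            return iter([a, b, c, d, self.calc_e()])
        raise ValueError(f"cannot unpack into {n} targets")

r = Record(1, 2, 3, 4)
x, y = r.unpack(2)            # x == (1, 2), y == 3
p, q, s, t, e = r.unpack(5)   # e == 10, calculated on demand
```

The caller passes the target count explicitly, which is the whole point:
no language change and no special method are needed.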
Comment 1: if people can pass explicit counts to islice (or slice_it),
they can pass explicit counts to .unpack.
Comment 2: we are not disagreeing that people might want to do custom
count-dependent disassembly or that they should be able to do so. It can
already be done.
Areas of disagreement:
1. consumer-source interdependency: you seem to think there is something
special about the consumer assigning items to multiple targets in one
statement, as opposed to doing anything else, including doing the
multiple assignments in multiple statements. I do not. Moreover, I so
far consider introducing such dependency in the core to be a regression.
2. general usefulness: you want .unpack to be standardized and made a
special method. I think it is inherently variable enough, and the need
rare enough, to not justify that.
I retrieved your original post and plan to look at it sometime to see if
I missed anything not covered by the above. But enough for today.
--
Terry Jan Reedy