Comment on PEP-0322: Reverse Iteration Methods

Raymond Hettinger vze4rx4y at verizon.net
Thu Sep 25 14:38:20 EDT 2003


[Andrew Dalke]
> However, the example code will work on a string, because list(string)
> returns a list, which does have a reverse.

Right.  I'll remove the comment.


> ] Extending slicing minimizes the code overhead but does nothing
> ] for memory efficiency, beauty, or clarity.
> I don't agree with all three of those.

Beauty and clarity are a matter of taste and we differ here.
To some (including me), s[::-1]  looks more like line noise
than s.iter_backwards().  Also, I think the word form is
more clear.  The situation becomes more acute when the
other indices have values.  Compare range[n-1, 0, -1] to
range(1, n-1).iterbackwards().  Whenever negative indicies
are used, it takes more knowledge of the implementation
rules to be able to read,write, understand, and verify the code.


> ] Should file objects be included? Implementing reverse iteration
> ] may not be easy though it would be useful on occasion.
>
> I would suggest no - the file may not be seekable.

Agreed.



> One thing to bear in mind is that the semantics are slightly
> different.
. . .
> That is only a very, very minor nit.  Feel free to ignore it :)

Agreed.  If finalizer call order is important, then the
'while elist: elist.pop()' form is the way to go.  If not,
then the for-loop is the way to go.


> The former is more terse, but the latter, existing code is
> broken down into bite-size chunks and allows verbose
> documentation of each step.  That isn't possible when
> nearly everything is written in one line.

Even the former can be broken down into more lines if
that is what is needed:

result.sort()
result = result[-n]
return [x for score, x in result.iter_backwards()]


> Also, a variant implementation may be even more concise,

You must be pretty confident to take on the Timbot's code  ;-)


> Later in your post you mention a reason against a function
> which uses __riter__ magic is that it is "slower than direct
> access to the underlying object."
>
> If that's a concern then you should also point out that using
> iter_backwards for cases like this is slower than the existing
> solution.  (It has an extra attr lookup and function call.  OTOH,
> doesn't the -1, -1 require two calls of the negative unary op
> code, or has that been fixed?  Good, it has, says dis on
> some test code.)

Yes, it's true.  All of the speed comments focus on the inner loop.
I'll reword to make that clear.  It is definitely an issue.  In Py2.2,
there was a generic sequence iterator that used __getitem__ in
iterators for lists and tuples.  In Py2.3, we went to a custom
iterator for each object.  There was a significant improvement
in performance.



> ] platform._dist_try_harder()
 . . .
> I think a better reformulation is
>
>       verfiles = os.listdir('/usr/lib/setup')
>       verfiles = [name for name in verfiles
>                                if name.startswith("slack-version-")]

I like your version better and recommend you submit it as a patch.
I'll take out the ifilter() comment.



> ] random.shuffle() uses for i in xrange(len(x)-1, 0, -1)
> This isn't a use case.  The xrange never returns a 0 while
> the iter_backwards one will.

It is an important use case.  The replacement code is:

  for i in xrange(1,len(x)). iter_backwards()

Whenever the indices have any complexity, the forwards version,
followed by .iter_backwards() is, IMHO, much easier to mentally verify.


> There were two intermediate 'reverse' proposals.  Here's the four
> I know about:

I'll try to add these without making the PEP as long as this thread ;-)


> I agree with the conclusion that the last three have unexpected
> interactions with mappings.  However, iter *also* has unexpected
> interactions with mappings.

Yes, but that doesn't make it less of a disadvantage.
The proposed list.iter_backwards() method does not have that
disadvantage.



> In your post list a set of problems with the function approaches.
...
> I would like to see this list added to the PEP.

Okay.  They are added.


>  I do have'
> some qualifiers for the list.
>
>   - it doesn't need to be a global function; it could be in,
>       say, itertools.  (I know I originally suggested it as a builtin,
>       but I'm not saying I'm right ;)

If there is to be a magic method, __riter__, then the access
function needs to be a builtin.  They go hand in hand.

BTW, don't underestimate the intensity of resistance to adding
new builtins.


>   - (see comments above that xrange(5).iter_backwards()
>       is likely slower than xrange(5, -1, -1) because of the
>       extra lookup and call)

Reworded it to talk only to the inner loop, "approaches using
__getitem__() are slower using a custom method for
each object."


>   - the first variant, which only looks for __riter__ and raises
>      an exception if it doesn't exist, has exactly the same problems
>      with infinite iterators as calling the new method.
>
>   - similarly, it doesn't have additional problems with mappings.

Okay, I've moved the comments around for you.


> Also, none of the use cases dealt with infinite sequences so
> the lack of appropriate support for them shouldn't be considered
> an outright rejection.

The use cases were just the ones I found in the library.
With generators and itertools being new, I expect
infinite iterators to become more common.

Also, I have a preference for creating something that
is as robust as possible.  Making a builtin function that
doesn't apply to all objects; behaves badly with mappings;
and crashes with an infinite iterator is not my idea of robust.

I really don't understand the overpowering urge to make this a function
and add a new magic method protocol.  Why isn't there a similar rage
against dict.iteritems()?

Perhaps, you'll like it better if the method is named ireverse()
instead of iter_backwards().


> BTW, my current belief is that if this PEP leads to code
> then it should be:
>   - a new function, either builtin or in itertools
>
>   - which uses __riter__ if present (unless we decide to
>       use the adapt PEP)
>
>   - else uses "for i in itertools.count(): yield[obj[-i]]"

Okay, we simply disagree here.  That's fine.

I appreciate the thoughtful comments and careful analysis.
They have helped clarify the pep wording and helped to
define the issues.


Raymond








More information about the Python-list mailing list