Comment on PEP-0322: Reverse Iteration Methods

Andrew Dalke adalke at mindspring.com
Thu Sep 25 03:39:12 EDT 2003


Raymond Hettinger:
> Please comment on the new PEP for reverse iteration methods.

It says:
] Also, it only works with lists (strings, for example, do not define a
reverse method):
]
] rseqn = list(seqn)
] rseqn.reverse()
] for value in rseqn:
]     print value

However, the example code will work on a string, because list(string)
returns a list, which does have a reverse.

] Extending slicing minimizes the code overhead but does nothing
] for memory efficiency, beauty, or clarity.

I don't agree with all three of those.  Take for example your mhlib
use case

  testfolders.reverse();
  for t in testfolders:
      do('mh.deletefolder(%s)' % `t`)

The PEP suggests that

  for t in testfolders.iter_backwards():
      do('mh.deletefolder(%s)' % `t`)

is more beautiful and clearer than

  for t in testfolders[::-1]:
      do('mh.deletefolder(%s)' % `t`)

(It definitely is more memory efficient if there are enough
folders.)

I don't think I agree with that.  If I'm not concerned about memory
efficiency then extended slicing is nice because I don't have to
learn nor explain one more thing.  ("There should be one ...")

The beauty and clarity, at least to me, only come into play
when memory is a concern, which is only rarely true for most
of the times iter_backwards is expected to be used.  (Estimate
based on the use cases and my own judgement.)

] Should file objects be included? Implementing reverse iteration
] may not be easy though it would be useful on occasion.

I would suggest no - the file may not be seekable.

You suggest changing
]  while _exithandlers:
]     func, targs, kargs = _exithandlers.pop()
to
] for func, target, kargs in _exithandlers.iter_backwards():
]     . . .
] del _exithandlers

One thing to bear in mind is that the semantics are slightly
different.  In the first one, the finalizers for func, targs, kargs
*may* be called from last to first.  This is the case for
CPython if there are no other references to the _exithandlers
objects.  But since Python-the-language makes no guarantees
on that and Jython doesn't support it, it's a bad idea to write
code that way.

That is only a very, very minor nit.  Feel free to ignore it :)


In difflib, I disagree that
   result.sort()
   return [x for score, x in result[-n:].iter_backwards()]

is easier to understand than
  result.sort()           # Retain only the best n.
  result = result[-n:]    # Move best-scorer to head of list.
  result.reverse()        # Strip scores.
  return [x for score, x in result]

The former is more terse, but the latter, existing code is
broken down into bite-size chunks and allows verbose
documentation of each step.  That isn't possible when
nearly everything is written in one line.

Also, a variant implementation may be even more concise,
which is to replace
            result.append((s.ratio(), x))
with
            result.append((-s.ratio(), x))
then get rid of the reverse (since the minus sign reverses the
sort order) and simply do

   result.sort()
   return [x for score, x in result[:n]]


] heapq.heapify() uses for i in xrange(n//2 - 1, -1, -1)

The proposed form is

    for i in xrange(n//2-1).iter_backwards():

Later in your post you mention a reason against a function
which uses __riter__ magic is that it is "slower than direct
access to the underlying object."

If that's a concern then you should also point out that using
iter_backwards for cases like this is slower than the existing
solution.  (It has an extra attr lookup and function call.  OTOH,
doesn't the -1, -1 require two calls of the negative unary op
code, or has that been fixed?  Good, it has, says dis on
some test code.)

] platform._dist_try_harder() uses for n in range(len(verfiles)-1,-1,-1)
] because the loop deletes selected elements from verfiles but needs
] to leave the rest of the list intact for further iteration. This use case
] could be addressed with itertools.ifilter() but would require the
] selection predicate to be in a lambda expression. The net result is
] less clear and readable than the original. A better reformulation is
] to replace the first line with the proposed method.

Are you sure about that?  The code is

        verfiles = os.listdir('/usr/lib/setup')
        for n in range(len(verfiles)-1, -1, -1):
            if verfiles[n][:14] != 'slack-version-':
                del verfiles[n]

I think a better reformulation is

      verfiles = os.listdir('/usr/lib/setup')
      verfiles = [name for name in verfiles
                               if name.startswith("slack-version-")]

(you could put the os.listdir in the list comp. but I've
found that gets ungainly.)

] random.shuffle() uses for i in xrange(len(x)-1, 0, -1)

This isn't a use case.  The xrange never returns a 0 while
the iter_backwards one will.

>>> list(xrange(5, 0, -1))
[5, 4, 3, 2, 1]
>>>

] Rejected Alternative Ideas

There were two intermediate 'reverse' proposals.  Here's the four
I know about:

 - call the magic method __riter__

 - call the magic method __riter__.  If it doesn't exist, use
      for i in xrange(len(obj)): yield obj[i]

  - call the magic method __riter__.  If it doesn't exist, use
      for i in itertools.count(): yield[obj[-i]]

(this has the advantage of supporting the infinite sequence
  0, 1, 2, ... -3, -2, -1
)

  - same as the 2nd but if len doesn't exist use
       data = list(obj)
       data.reverse()
       for x in data: yield x

I agree with the conclusion that the last three have unexpected
interactions with mappings.  However, iter *also* has unexpected
interactions with mappings.

>>> class Dict:
...    def keys(self):
...       return [0, 1, 2, 4]
...    def __getitem__(self, key):
...       if key not in self.keys(): raise KeyError(key)
...         return key * 2
...    def values(self):
...       return [self[x] for x in self.keys()]
...    def items(self): return zip(self.keys(), self.values())
...    def len(self): return len(self.keys())
...
>>> for x in Dict():
...  print x
...
0
2
4
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "<interactive input>", line 5, in __getitem__
KeyError: 3
>>>

While I think I was the one who originally raised this
objection, given the way Python already behaves wrt
mappings, I retract my objection.

Another possibility, which I just thought of now, is
  - call the magic method __riter__.  If it doesn't exist, look
       for the reverse-iterator adapter, as per PEP-0246.

I don't have any experience with the PEP to judge its
appropriateness nor usefulness.  The motivation section
seems to suggest that it would be appropriate.


In your post list a set of problems with the function approaches.

>    It saves implementing some object methods at the
>       expense of adding a new builtin function and of
>       creating a new magic method name.
>    It is slower than direct access to the underlying object.
>    It crashes when applied to an infinite iterator.
>    It produces bizarre results when applied to mappings.

I would like to see this list added to the PEP.  I do have'
some qualifiers for the list.

  - it doesn't need to be a global function; it could be in,
      say, itertools.  (I know I originally suggested it as a builtin,
      but I'm not saying I'm right ;)

  - (see comments above that xrange(5).iter_backwards()
      is likely slower than xrange(5, -1, -1) because of the
      extra lookup and call)

  - the first variant, which only looks for __riter__ and raises
     an exception if it doesn't exist, has exactly the same problems
     with infinite iterators as calling the new method.

  - similarly, it doesn't have additional problems with mappings.


Also, none of the use cases dealt with infinite sequences so
the lack of appropriate support for them shouldn't be considered
an outright rejection.

BTW, my current belief is that if this PEP leads to code
then it should be:
  - a new function, either builtin or in itertools

  - which uses __riter__ if present (unless we decide to
      use the adapt PEP)

  - else uses "for i in itertools.count(): yield[obj[-i]]"


                    Andrew
                    dalke at dalkescientific.com






More information about the Python-list mailing list