[Python-3000] Iterators for dict keys, values, and items == annoying :)

Thu Mar 23 23:58:02 CET 2006

On 3/23/06, Ian Bicking <ianb at colorstudy.com> wrote:
> Guido van Rossum wrote:
> > On 3/23/06, Ian Bicking <ianb at colorstudy.com> wrote:
> >
> >>This has been my personal experience with the iterators in SQLObject as
> >>well.  The fact that an empty iterator is true tends to cause particular
> >>problems in that case, though I notice iterkeys() acts properly in this
> >>case; maybe part of the issue is that I'm actually using iterables
> >>instead of iterators, where I can't actually test the truthfulness.
> >
> > This sounds like some kind of fundamental confusion -- you should
> > never be tempted to test an iterator for its truth value.
>
> I'm testing if it is empty or not, which seems natural enough.  Or would
> be, if it worked.

Testing whether an iterator is empty or not is an oxymoron; the only
legit way is to call next() and see whether it raises StopIteration.
This is the fundamental confusion I am talking about. It is NOT
"natural enough". It reveals a fundamental misunderstanding of the
design of the iterator protocol.
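
A minimal sketch of that check, written in current Python 3 syntax (the built-in next() postdates this thread; in 2.x the call would be it.next()). The helper name peek_empty is hypothetical. Because finding out consumes the first item, the sketch hands back an equivalent iterator with that item put back in front:

```python
import itertools

def peek_empty(iterable):
    """Return (is_empty, iterator) without losing the first item."""
    it = iter(iterable)
    try:
        first = next(it)          # the only way to find out
    except StopIteration:
        return True, iter(())     # nothing was there
    # Put the consumed item back in front of the rest.
    return False, itertools.chain([first], it)

empty, results = peek_empty(iter([]))
nonempty, rest = peek_empty(iter([1, 2, 3]))
rest_items = list(rest)
```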

(There's also a design bug in 2.4 which perpetuates the confusion,
unfortunately; see below.)

> So I start out doing:
>
>    for item in select_results: ...
>
> Then I realize that the zero-item case is special (which is common), and do:
>
>    select_results = list(select_results)
>    if select_results:
>        ...
>    else:
>        for item in select_results:...

You should write that like this:

  empty = True
  for item in select_results:
    empty = False
    ...
  if empty:
    ...
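
The same idiom with the elided bodies filled in, as a runnable sketch. The function name report and the strings it produces are made up for illustration; only the empty-flag pattern comes from the snippet above:

```python
def report(select_results):
    """Collect one line per item, or a single notice if there were none."""
    lines = []
    empty = True
    for item in select_results:
        empty = False                      # the loop body ran at least once
        lines.append("row: %s" % item)
    if empty:
        lines.append("no rows")            # zero-item special case
    return lines

with_rows = report(iter(["a", "b"]))
without_rows = report(iter([]))
```

This works on a one-shot iterator and never needs to copy the results into a list.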

> That's not a very comfortable code transformation.  When I was just
> first learning Python I thought this would work:
>
>    for item in select_results:
>        ...
>    else:
>        ... stuff when there are no items ...
>
> But it doesn't work like that.

Another fundamental confusion (about the for loop's else clause). It
can't mean two different things. It means "if I didn't break out of
the loop with a break statement".
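
A short sketch of what the else clause does mean. The function name find and the trace strings are hypothetical, chosen only to make the control flow visible:

```python
def find(items, target):
    """Return a short trace showing when the else clause runs."""
    trace = []
    for item in items:
        if item == target:
            trace.append("found")
            break
    else:
        # Runs only when the loop finished without hitting break,
        # which includes the case where items was empty.
        trace.append("not found")
    return trace

hit = find([1, 2, 3], 2)
miss = find([1, 2, 3], 9)
no_items = find([], 9)
```

The else branch fires for both the non-empty miss and the empty input, so it cannot be used to detect "there were no items".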

> .iterkeys() does return an iterator with a useful __len__ method, so the
> principle that iterators shouldn't be tested for truth doesn't seem right.

Which iterkeys()? This is dependent on the object and on the Python
version; Python 2.4 accidentally implemented __len__ on certain
built-in iterators, which may explain why you are seeing this. It
doesn't work pre-2.4 nor from 2.5 on, at least not for dict.iterkeys().
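
For comparison, a sketch of how this looks in a current Python 3 interpreter. That environment is an assumption, not something discussed in this thread; dict.iterkeys() no longer exists there, so iter(d) stands in for it:

```python
d = {}
it = iter(d)               # iterator over the keys of an empty dict

truthy = bool(it)          # no __len__, so the default truth value applies

try:
    len(it)
    has_len = True
except TypeError:          # the key iterator does not support len()
    has_len = False
```

The iterator over an empty dict is still true, and asking for its length fails outright.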

> (Very small mostly unrelated problem that occurs to me just at this
> moment -- I can't override __len__ with any implementation that isn't
> really cheap, because lots of code calls __len__ under the covers, like
> list() -- originally SQLObject used len(query) to do a COUNT(*) query,
> but that didn't work)

Except in 2.4, you can avoid most implicit __len__ calls by
implementing __nonzero__ separately; bool(x) tries __nonzero__ before
__len__. Unfortunately, the iterator accelerator in 2.4 is called
__len__ so various code tries to call __len__ when converting an
iterator to a list/tuple. 2.3 didn't; 2.5 won't.
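
A minimal sketch of the lookup order for truth testing. The Query class is hypothetical, and its __len__ merely stands in for an expensive COUNT(*). Python 3 spells the hook __bool__, so the sketch defines both names:

```python
class Query(object):
    """Hypothetical query object; len() stands in for an expensive COUNT(*)."""

    def __init__(self, rows):
        self.rows = rows
        self.len_calls = 0

    def __len__(self):
        self.len_calls += 1       # pretend this hits the database
        return len(self.rows)

    def __bool__(self):
        return True               # cheap answer, never consults __len__

    __nonzero__ = __bool__        # the Python 2 spelling of the same hook

q = Query([])
truth = bool(q)
calls_after_bool = q.len_calls
```

This only covers truth testing; code that asks for a length directly will still reach __len__.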

--
--Guido van Rossum (home page: http://www.python.org/~guido/)