Interesting list() un-optimization

Ian Kelly ian.g.kelly at gmail.com
Thu Mar 7 11:00:32 EST 2013


On Thu, Mar 7, 2013 at 4:22 AM, Wolfgang Maier
<wolfgang.maier at biologie.uni-freiburg.de> wrote:
> Well, it skips the costly len() call because your iter(Foo()) returns
> iter(range()) under the hood and list() uses that object's __len__() method.

Iterators do not generally have __len__ methods.

>>> len(iter(range(10)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'range_iterator' has no len()

> In
> most cases, such a workaround will not be feasible. Why should iter(QuerySet())
> have a faster __len__() method defined than QuerySet() itself.

iter(QuerySet()) should not have any __len__ method defined at all,
which is why the optimization would be skipped.
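
To make that concrete, here is a rough sketch of the difference. The Foo
class and its call counter are made up for illustration, and the behaviour
shown is a CPython implementation detail (list() asks the object it is given
for a size hint), not something the language guarantees.

```python
class Foo:
    """Hypothetical container that counts how often __len__ is called."""

    len_calls = 0

    def __len__(self):
        Foo.len_calls += 1
        return 10

    def __iter__(self):
        # Hand back a plain range iterator, which has no __len__ of its own.
        return iter(range(10))


foo = Foo()

# list() sees an object with __len__ and uses it as a size hint.
list(foo)
calls_direct = Foo.len_calls

# list() now only sees the range iterator, so Foo.__len__ is never consulted.
list(iter(foo))
calls_via_iter = Foo.len_calls - calls_direct

has_len = hasattr(iter(foo), "__len__")
print(calls_direct, calls_via_iter, has_len)
```

Passing iter(foo) rather than foo is what hides the __len__ method from
list(), so an expensive __len__ is simply never reached.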

> Most likely,
> iter(QuerySet()) just returns self anyway?

But on this point, you are correct.  The mongoengine QuerySet.__iter__
method is defined as:

    def __iter__(self):
        self.rewind()
        return self

This is an unfortunate design.  Not only does it mean that the iterator's
__len__ method cannot be trusted (what should the __len__ of a
partially exhausted iterator return?), but it also means that requesting
a new iterator over the QuerySet silently invalidates any
existing iterators.
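
The invalidation problem can be reproduced with a small stand-in. The
RewindingCursor class below is hypothetical and is not mongoengine code; it
only copies the "rewind and return self" pattern from the __iter__ method
quoted above.

```python
class RewindingCursor:
    """Hypothetical stand-in for a QuerySet-like object that is its own iterator."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def rewind(self):
        self._pos = 0

    def __iter__(self):
        # Same pattern as the quoted method: reset the shared state, return self.
        self.rewind()
        return self

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item


cursor = RewindingCursor("abcd")

first = iter(cursor)
consumed = [next(first), next(first)]  # advance the first iterator two steps

second = iter(cursor)                  # rewinds the state shared with `first`
same_object = first is second
after_rewind = next(first)             # `first` has been reset to the start

print(consumed, same_object, after_rewind)
```

Because both names refer to the same object, the second iter() call resets
the position that the first iterator was relying on.  Returning a separate
iterator object from __iter__ would avoid this.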


