Consume an iterable

Peter Otten __peter__ at web.de
Sun Jan 24 18:21:36 EST 2010


Raymond Hettinger wrote:

> FWIW, the deque() approach becomes even faster in Py2.7 and Py3.1
> which has a high-speed path for the case where maxlen is zero.
> Here's a snippet from Modules/_collectionsmodule.c:
> 
> /* Run an iterator to exhaustion.  Shortcut for
>    the extend/extendleft methods when maxlen == 0. */
> static PyObject*
> consume_iterator(PyObject *it)
> {
>         PyObject *item;
> 
>         while ((item = PyIter_Next(it)) != NULL) {
>                 Py_DECREF(item);
>         }
>         Py_DECREF(it);
>         if (PyErr_Occurred())
>                 return NULL;
>         Py_RETURN_NONE;
> }
> 
> 
> This code consumes an iterator to exhaustion.
> It is short, sweet, and hard to beat.

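islice() is still a tad faster. A possible optimization is to look up the
type's tp_iternext slot once and call it directly, instead of going through
PyIter_Next() for every item: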

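static PyObject*
consume_iterator(PyObject *it)
{
	PyObject *item;
	PyObject *(*iternext)(PyObject *);

	/* Fetch the slot once rather than on every PyIter_Next() call. */
	iternext = *Py_TYPE(it)->tp_iternext;

	while ((item = iternext(it)) != NULL) {
		Py_DECREF(item);
	}
	Py_DECREF(it);
	/* Unlike PyIter_Next(), a raw tp_iternext call may leave
	   StopIteration set; that is normal exhaustion, not an error. */
	if (PyErr_Occurred()) {
		if (PyErr_ExceptionMatches(PyExc_StopIteration))
			PyErr_Clear();
		else
			return NULL;
	}
	Py_RETURN_NONE;
}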

Before:

$ ./python -m timeit -s"from itertools import repeat, islice; from collections import deque; from sys import maxint" "next(islice(repeat(None, 1000), maxint, maxint), None)"
100000 loops, best of 3: 6.49 usec per loop

$ ./python -m timeit -s"from itertools import repeat, islice; from collections import deque; from sys import maxint" "deque(repeat(None, 1000), maxlen=0)"
100000 loops, best of 3: 9.93 usec per loop

After:

$ ./python -m timeit -s"from itertools import repeat, islice; from collections import deque; from sys import maxint" "deque(repeat(None, 1000), maxlen=0)"
100000 loops, best of 3: 6.31 usec per loop
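
For anyone who wants to compare the two idioms from within Python rather
than from the shell, here is a minimal sketch. The helper names
consume_deque and consume_islice are made up for illustration, and
sys.maxsize stands in for the Python 2 sys.maxint used above, so this
assumes Python 3. The lambda adds call overhead, so the absolute numbers
will not match the ones quoted here.

```python
import sys
import timeit
from collections import deque
from itertools import islice, repeat


def consume_deque(iterable):
    """Exhaust iterable by feeding it to a zero-length deque."""
    deque(iterable, maxlen=0)


def consume_islice(iterable):
    """Exhaust iterable by asking islice() for an empty slice at the far end."""
    # sys.maxsize replaces Python 2's sys.maxint (assumption: Python 3).
    next(islice(iterable, sys.maxsize, sys.maxsize), None)


if __name__ == "__main__":
    for func in (consume_deque, consume_islice):
        # min() over several repeats mirrors timeit's "best of N" report.
        best = min(timeit.repeat(lambda: func(repeat(None, 1000)),
                                 number=10000, repeat=3))
        print("%s: %.2f usec per loop" % (func.__name__, best / 10000 * 1e6))
```

Both helpers return None and leave the iterator they were given empty.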

Peter

PS: Two more, for Paul Rubin:

$ ./python -m timeit -s"from itertools import repeat" "sum(0 for _ in repeat(None, 1000))"
10000 loops, best of 3: 125 usec per loop
$ ./python -m timeit -s"from itertools import repeat" "sum(0 for _ in repeat(None, 1000) if 0)"
10000 loops, best of 3: 68.3 usec per loop
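
The gap between those two is presumably because the unfiltered generator
yields 1000 zeros that sum() has to pull out one at a time, whereas with
"if 0" the loop runs entirely inside the generator and nothing is ever
yielded. A small sketch to check that both variants drain the underlying
iterator and return 0 (drain_sum and drain_sum_filtered are hypothetical
names, not anything from the stdlib):

```python
from itertools import repeat


def drain_sum(iterable):
    """Consume iterable; sum() adds up one 0 per item."""
    return sum(0 for _ in iterable)


def drain_sum_filtered(iterable):
    """Consume iterable; the always-false filter means nothing is yielded."""
    return sum(0 for _ in iterable if 0)


if __name__ == "__main__":
    for func in (drain_sum, drain_sum_filtered):
        it = repeat(None, 1000)
        result = func(it)
        # Show the return value and whether anything is left in the iterator.
        print(func.__name__, result, next(it, "exhausted"))
```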




