[Python-ideas] Consider making enumerate a sequence if its argument is a sequence

Wed Sep 30 21:19:05 CEST 2015

On Sep 30, 2015, at 11:43, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>> On 30.09.2015 20:26, Andrew Barnert via Python-ideas wrote:
>>> On Sep 30, 2015, at 11:11, M.-A. Lemburg <mal at egenix.com> wrote:
>>> 
>>>> On 30.09.2015 19:19, Neil Girdhar wrote:
>>>> I guess, I'm just asking for enumerate to go through the same change that
>>>> range went through.  Why wasn't it a problem for range?
>>> 
>>> range() returns a list in Python 2 and a generator in Python 3.
>> 
>> No it doesn't. It returns a (lazy) sequence. Not a generator, or any other kind of iterator.
> 
> You are right that it's not of a generator type
> and more like a lazy sequence. To be exact, it returns
> a range object and does implement the iter protocol via
> a range_iterator object.

To be exact, it returns an object for which isinstance(r, Sequence) is True, and which offers a correct implementation of the entire sequence protocol. In other words, it's not "more like a lazy sequence", it's _exactly_ a lazy sequence.
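
A minimal sketch of that claim, assuming Python 3.3 or later (for the collections.abc import path); the variable names are only illustrative:

```python
from collections.abc import Iterator, Sequence

r = range(0, 20, 2)  # 0, 2, 4, ..., 18

# A range is registered as a Sequence and is not an Iterator.
is_sequence = isinstance(r, Sequence)
is_iterator = isinstance(r, Iterator)

# The usual sequence operations work without materializing a list.
length = len(r)        # number of elements
fourth = r[3]          # indexing
sliced = r[2:5]        # slicing yields another range
position = r.index(8)  # index method
occurrences = r.count(4)
last_three = list(reversed(r))[:3]
```

Note that the slice is itself a range, so it stays lazy as well.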

In 2.3-2.5, xrange was a lazy "sequence-like object", and the docs explained how it didn't have all the methods of a sequence but otherwise was like one. When the collections ABCs were added, xrange (2.x)/range (3.x) started claiming to be a sequence, but the implementation was incomplete, so it was defective. This was fixed in 3.2 (which also made all of the sequence methods efficient—e.g., a range that fits into C longs can test an int for __contains__ in constant time).
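
As a rough illustration of the constant-time membership test, assuming CPython 3.2 or later, the following should return immediately even though the range describes far too many values to iterate over:

```python
# A range describing 2 * 10**17 values; nothing is materialized.
big = range(0, 10**18, 5)

# For ints, membership is computed arithmetically rather than by scanning.
hit = 10**17 in big         # a multiple of 5 inside the bounds
miss = (10**17 + 1) in big  # inside the bounds, but not on the step
```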

>> I don't know why so many people seem to believe it returns a generator. (And, when you point out what it returns, most of them say, "Why was that changed from 2.x xrange, which returned a generator?" but xrange never returned a generator either--it returned a lazy almost-a-sequence from the start.)
> 
> Perhaps because it behaves like one ? :-)
> 
> Unlike an iterator, it doesn't iterate over a sequence, but instead
> generates the values on the fly.

You're confusing things even worse here.

A generator is an iterator. It's a perfect subtype relationship.

A range does not behave like a generator, or like any other kind of iterator. It behaves like a sequence.
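
A short sketch of the difference, assuming Python 3.5 or later (for collections.abc.Generator); count_to_three is a made-up generator function used only for illustration:

```python
from collections.abc import Generator, Iterator

def count_to_three():
    # Hypothetical generator, used only for comparison with range(3).
    yield from (0, 1, 2)

g = count_to_three()
r = range(3)

# Every generator is an iterator: the ABCs are in a subtype relationship.
generator_is_iterator_subtype = issubclass(Generator, Iterator)

# An iterator is its own iterator; a sequence hands out a fresh one each time.
g_is_own_iterator = iter(g) is g
r_is_own_iterator = iter(r) is r

# An iterator is exhausted after one pass; a range can be reused.
g_first, g_second = list(g), list(g)
r_first, r_second = list(r), list(r)
```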

Laziness is orthogonal to the iterator-vs.-sequence distinction. Dictionary views are also lazy but not iterators, for example. And there's nothing stopping you from writing a generator with "yield from f.readlines()" (except that it would be stupid), which would be an iterator despite not being lazy in any useful sense.
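
Both halves of that can be sketched as follows; eager_lines is a hypothetical name, and io.StringIO stands in for a real file:

```python
import io
from collections.abc import Iterator

# Lazy but not an iterator: a dict view reflects later changes to the dict.
d = {"a": 1}
keys = d.keys()
d["b"] = 2
view_is_iterator = isinstance(keys, Iterator)
view_sees_update = "b" in keys

# An iterator but not usefully lazy: the whole file is read on the first next().
def eager_lines(f):
    yield from f.readlines()

f = io.StringIO("one\ntwo\n")
lines = eager_lines(f)
lines_is_iterator = isinstance(lines, Iterator)
first = next(lines)
remaining_in_file = f.read()  # nothing left; readlines() already consumed it
```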

Maybe the problem is that we don't have enough words. I've tried to use "view" to refer to a lazy non-iterator iterable (dict views, range, NumPy slices), which seems to help within the context of a single long explanation for a single user's problem, but I'm not sure that's something we'd want enshrined in the glossary, since it's a general English word that probably has wider usefulness.

> FWIW: I don't think many people use the lazy sequence features
> of range(), e.g. the slicing or index support. By far most
> uses are in for-loops.

I've used range as a sequence (or at least a reusable iterable, a sized object, and a container). I've answered questions from people on StackOverflow who are doing so, and seen the highest-rep Python answerer on SO suggest such uses to other people.

I don't think I'd ever use the index method (although I did see one SO user who was doing so, to wrap up some arithmetic in a way that avoids a possible off-by-one error, and wanted to know why it was so slow in 3.1 but worked fine in 3.2...), but there's no reason range should be a defective "not-quite-sequence" instead of a sequence. What would be the point of that?
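
For what it's worth, that kind of use might look something like the sketch below (assuming Python 3.2 or later; the numbers are invented for illustration):

```python
# Values 10, 15, 20, ..., 95
r = range(10, 100, 5)

# Let range do the (value - start) // step arithmetic, including bounds checks.
pos = r.index(35)
round_trip = r[pos]

# A value that is not on the step raises ValueError instead of silently rounding.
try:
    r.index(36)
except ValueError:
    off_step_rejected = True
else:
    off_step_rejected = False

# The same object also works as a sized container.
size = len(r)
contains_95 = 95 in r
contains_100 = 100 in r
```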