Iterators, iterables and special objects

Terry Reedy tjreedy at udel.edu
Tue Jul 21 15:54:10 EDT 2020


On 7/21/2020 5:32 AM, Peter Slížik wrote:
> Hi list, two related questions:
> 
> 1. Why do functions used to iterate over collections or dict members return
> specialized objects like
> 
> type(dict.keys()) -> class 'dict_keys'
> type(dict.values()) -> class 'dict_values'
> type(dict.items()) -> class 'dict_items'
> type(filter(..., ...)) -> class 'filter'
> type(map(..., ...)) -> class 'map'
> type(enumerate(...)) -> class 'enumerate'
> 
> instead of returning some more general 'iterable' and 'view' objects? Are
> those returned objects really that different from one another that it makes
> sense to have individual implementations?

Yes.  The dict views have to know how the dict is implemented to extract 
the keys, values, or pairs thereof.

The transformers each have different code. I suppose that each could 
instead pass a function to a generic 'transformer' class, but the 
indirection would just make execution slower and hide the specific info 
as to what the iterator is doing.

> 2. Why do these functions return iterators instead of iterables?

The views are iterables.  They can be iterated more than once and used in 
other operations.
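
A minimal sketch of that behaviour, using a made-up dict (the names d, 
keys and common are only for illustration).  It shows a keys view being 
iterated twice, tracking a later change to the dict, and taking part in 
a set operation:

```python
d = {"a": 1, "b": 2}
keys = d.keys()                    # a dict_keys view, not a copy

first_pass = list(keys)
second_pass = list(keys)           # a view can be iterated again

d["c"] = 3                         # the view reflects later changes
third_pass = list(keys)

common = keys & {"a", "c", "z"}    # keys views support set operations
```

Here first_pass and second_pass are equal, third_pass also contains 
"c", and common is an ordinary set.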

The transformers should be once-through iterators because they can be 
passed once-through iterators.  I suppose one could make them iterables 
and add an attribute 'pristine' set to True in __init__ and False in 
__iter__, but why have 2 objects instead of 1 when there is no gain in 
function?
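
As a small illustration (the data and names below are invented for the 
example), a map object is used up by its first pass, and it can wrap an 
input that is itself once-through, such as a generator:

```python
squares = map(lambda x: x * x, [1, 2, 3])

first_pass = list(squares)     # consumes the map object
second_pass = list(squares)    # nothing left on a second pass

# The input may itself be a once-through iterator, here a generator.
gen = (n for n in range(3))
doubled = list(map(lambda x: 2 * x, gen))
leftover = list(gen)           # the generator has been consumed too
```

Since the input may already be exhausted after one pass, a re-iterable 
wrapper around it could not promise a second pass anyway.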

> First, I
> find it confusing - to me, it is the loop's job to create an iterator from
> the supplied iterable, and not the object that is being iterated over.

Python's design that iter(iterator) is iterator is extremely handy.
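
A short sketch of what that identity buys, with an invented list named 
items: a for loop calls iter() on whatever it is given, so handing it a 
partly consumed iterator simply resumes where that iterator left off.

```python
items = [10, 20, 30]
it = iter(items)

same_object = iter(it) is it               # iter() of an iterator returns it
fresh_each_time = iter(items) is not iter(items)   # a list gives a new one

first = next(it)                           # consume one item by hand
rest = [x for x in it]                     # the loop continues from there
```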

Note that iterators can be driven directly with next(), and not only 
indirectly with for...

Suppose line iterator 'file' has a header line with field names and 
multiple data lines.  One can do

it = iter(file)
fields = next(it)
<process fields>
for line in it:
     <process data line in light of fields line>

Yes, one can add a flag variable 'first = True' and inside the loop
     if first:
         first = False
         fields = line
         <process fields>
but the 3 extra boilerplate lines add nothing.
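
A runnable version of the header-line pattern, as a sketch only: 
io.StringIO stands in for a real file, and the comma-separated sample 
data and the names fields and records are assumptions made for the 
example.

```python
import io

# Stand-in for an open text file: one header line, then data lines.
file = io.StringIO("name,age\nAda,36\nGrace,45\n")

it = iter(file)
fields = next(it).rstrip("\n").split(",")    # take the header line by hand

records = []
for line in it:                              # the loop sees only data lines
    values = line.rstrip("\n").split(",")
    records.append(dict(zip(fields, values)))
```

No flag variable is needed, because the for loop picks up the same 
iterator after the header line has been taken off.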

> And
> second, with this design, iterators are required to be iterables too, which
> is confusing as hell (at least for people coming from other languages in
> which the distinction is strict).

I guess I was fortunate to not have to unlearn anything ;-).


-- 
Terry Jan Reedy



