[Python-ideas] len() for map()

Wed Nov 28 14:53:50 EST 2018

On 11/28/2018 9:27 AM, E. Madison Bray wrote:
> On Mon, Nov 26, 2018 at 10:35 PM Kale Kundert <kale at thekunderts.net> wrote:
>>
>> I just ran into the following behavior, and found it surprising:
>>
>> >>> len(map(float, [1,2,3]))
>> TypeError: object of type 'map' has no len()
>>
>> I understand that map() could be given an infinite sequence and therefore might not always have a length.  But in this case, it seems like map() should've known that its length was 3.  I also understand that I can just call list() on the whole thing and get a list, but the nice thing about map() is that it doesn't copy data, so it's unfortunate to lose that advantage for no particular reason.
>>
>> My proposal is to delegate map.__len__() to the underlying iterable.  

One of the guidelines in the Zen of Python is
"Special cases aren't special enough to break the rules."

This proposal claims that the Python 3 built-in iterator class 'map' is 
so special that it should break the rule that iterators in general 
cannot and therefore do not have .__len__ methods because their size may 
be infinite, unknowable until exhaustion, or declining with each 
.__next__ call.

For iterators, 3.4 added an optional __length_hint__ method.  This makes 
sense for iterators, like tuple_iterator, list_iterator, range_iterator, 
and dict_keyiterator, based on a known finite collection.  At the time, 
map.__length_hint__ was proposed and rejected as problematic, for 
obvious reasons, and insufficiently useful.
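
A minimal sketch of how the hint behaves, using operator.length_hint, 
the public function that consults __length_hint__.  It assumes CPython 
3.4 or later; the -1 default passed for map is only an arbitrary marker 
chosen for this example, to show that map supplies no hint of its own.

```python
from operator import length_hint

it = iter([1, 2, 3])               # a list_iterator over a known finite list
hint_before = length_hint(it)      # the iterator knows how much remains
next(it)
hint_after = length_hint(it)       # the hint declines as items are consumed

has_len = hasattr(it, '__len__')   # a hint is available, but len() is not

# map defines neither __len__ nor __length_hint__, so the supplied
# default (-1, an arbitrary marker for this example) comes back.
map_hint = length_hint(map(float, [1, 2, 3]), -1)
```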

The proposal above amounts to adding an unspecified __length_hint__ 
misnamed as __len__.  Won't happen.  Instead, proponents should define 
and test one or more specific implementations of __length_hint__ in map 
subclass(es).
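
One such experiment might look like the sketch below.  The name 
HintedMap and the _iterators attribute are hypothetical, not anything 
in the stdlib, and the sketch assumes that CPython allows map to be 
subclassed and that taking the minimum of the inputs' hints is an 
acceptable policy.  An input with no hint of its own, such as a 
generator, drags the result down to 0, which is part of why a general 
answer is problematic.

```python
from operator import length_hint

class HintedMap(map):
    """Hypothetical map subclass that reports a best-effort length hint."""

    def __new__(cls, func, *iterables):
        # Create the iterators up front so the same objects that map
        # consumes are the ones we later ask for a hint.
        iterators = tuple(iter(it) for it in iterables)
        self = super().__new__(cls, func, *iterators)
        self._iterators = iterators
        return self

    def __length_hint__(self):
        # map stops at the shortest input, so use the smallest hint.
        # Inputs without a hint (e.g. generators) contribute 0.
        return min(length_hint(it) for it in self._iterators)

m = HintedMap(float, [1, 2, 3])
first_hint = length_hint(m)    # all three items remain
next(m)
second_hint = length_hint(m)   # one item has been consumed
```

This is only a hint: len(m) still raises TypeError, and callers must 
be prepared for the hint to be wrong or 0.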

> I mostly agree with the existing objections, though I have often found
> myself wanting this too, especially now that `map` does not simply
> return a list.

What makes the map class special among all built-in iterator classes? 
It appears not to be a property of the class itself, as an iterator 
class, but of its name.  In Python 2, 'map' was bound to a different 
implementation of the map idea, a function that produced a list, which 
has a length.  I suspect that if Python 3 were the original Python, we 
would not have this discussion.

> As a simple counter-proposal which I believe has fewer issues, I would
> really like it if the built-in `map()` and `filter()` at least
> provided a Python-level attribute to access the underlying iterables.

This proposes to make map (and filter) special in a different way, by 
adding other special (dunder) attributes.  In general, built-in 
callables do not attach their args to their output, for obvious reasons.  
If they do, they do not expose them.  If input data must be saved, the 
details are implementation dependent.  A C-coded callable would not 
necessarily save information in the form of Python objects.

Again, it seems to me that the only thing special about these two, 
versus the other iterators left in itertools, is the history of the names.

> This is necessary because if I have a function that used to take, say,
> a list as an argument, and it receives a `map` object, I now have to
> be able to deal with map()s,

If a function is documented as requiring a list, or a sequence, or an 
object with a length, it is a user bug to pass an iterator.  The only 
thing special about map and filter as sources of such errors is the 
rebinding of the names between Py2 and Py3, so that the same code may 
be good in 2.x and bad in 3.x.

Perhaps 2.7, in addition to future imports of text as unicode and print 
as a function, should have had one to make map and filter be the 3.x 
iterators.

Perhaps Sage needs something like

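def size_map(func, *iterables):
    "Return map(func, *iterables) after checking that every input is sized."
    for it in iterables:
        # Reject iterators, generators, and anything else without len().
        if not hasattr(it, '__len__'):
            raise TypeError(f'iterable {it!r} has no size')
    return map(func, *iterables)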

https://docs.python.org/3/library/functions.html#map says
"map(function, iterable, ...)
     Return an iterator [...]"

The wording is intentional.  The fact that map is a class and the 
iterator an instance of the class is a CPython implementation detail. 
Another implementation could use the generator function equivalent given 
in the Python 2 itertools doc, or a translation thereof.  I don't know 
what pypy and other implementations do.  The fact that CPython itertools 
callables are (now) C-coded classes instead of Python-coded generator 
functions, or C translations thereof (which is tricky), is for 
performance and ease of maintenance.
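
A rough sketch of such a generator-function version follows.  It is a 
Python 3 translation, not the text of the Python 2 itertools doc, and 
the name gen_map is hypothetical.  It is built on zip so that it stops 
at the shortest input.

```python
import types

def gen_map(func, *iterables):
    """Generator-function equivalent of map(func, *iterables)."""
    # zip stops as soon as the shortest iterable is exhausted,
    # which matches the behavior of the built-in map.
    for args in zip(*iterables):
        yield func(*args)

result = gen_map(float, [1, 2, 3])
is_generator = isinstance(result, types.GeneratorType)
is_map_instance = isinstance(result, map)
values = list(result)
```

Here the returned iterator is a generator rather than an instance of a 
map class, so code that tests isinstance(x, map) would not recognize 
it.  Unlike the built-in, this sketch does not raise TypeError when 
called with no iterables.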

-- 
Terry Jan Reedy
