[Python-ideas] __len__() for map()

Terry Reedy tjreedy at udel.edu
Thu Nov 29 04:25:38 EST 2018


On 11/28/2018 5:27 PM, Steven D'Aprano wrote:
> On Wed, Nov 28, 2018 at 02:53:50PM -0500, Terry Reedy wrote:
> 
>> One of the guidelines in the Zen of Python is
>> "Special cases aren't special enough to break the rules."
>>
>> This proposal claims that the Python 3 built-in iterator class 'map' is
>> so special that it should break the rule that iterators in general
>> cannot and therefore do not have .__len__ methods because their size may
>> be infinite, unknowable until exhaustion, or declining with each
>> .__next__ call.
>>
>> For iterators, 3.4 added an optional __length_hint__ method.  This makes
>> sense for iterators, like tuple_iterator, list_iterator, range_iterator,
>> and dict_keyiterator, based on a known finite collection.  At the time,
>> map.__length_hint__ was proposed and rejected as problematic, for
>> obvious reasons, and insufficiently useful.
> 
> Thanks for the background Terry, but doesn't that suggest that sometimes
> special cases ARE special enough to break the rules? *wink*

Yes, but this case is not special enough to break the rules for len 
and __len__, especially when an alternative already exists.

> Unfortunately, I don't think it is obvious why map.__length_hint__ is
> problematic. 

It is less obvious (there are more details to fill in) than the (exact) 
length_hints for the list, tuple, range, and dict iterators.  These are 
*always* based on a sized collection.  Map is *sometimes* based on 
sized collection(s).  It is the other cases that are problematic, as 
illustrated by your next sentence.
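
As a minimal sketch of the contrast described above (the variable names 
are illustrative only, and the behaviour shown is that of CPython 3.4 or 
later), operator.length_hint can be used to compare an iterator backed by 
a sized collection with one that is not:

```python
from operator import length_hint

items = [10, 20, 30]

# An iterator over a sized collection reports an exact hint that
# declines with each next() call.
list_it = iter(items)
hint_before = length_hint(list_it)
next(list_it)
hint_after = length_hint(list_it)

# A generator has neither __len__ nor __length_hint__, so the
# caller-supplied default comes back instead.
gen = (x * 2 for x in items)
gen_hint = length_hint(gen, -1)

# len() itself is refused for iterators such as map objects.
try:
    len(map(str, items))
except TypeError:
    map_has_len = False
else:
    map_has_len = True

print(hint_before, hint_after, gen_hint, map_has_len)
```

The list iterator is the easy case; the generator stands in for the 
"other cases" where no finite estimate is available.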

> It only needs to return the *maximum* length, or a
> sentinel (zero?) to say "I don't know".  It doesn't
> need to be accurate, unlike __len__ itself.

> Perhaps we should rethink the decision not to give map() and filter() a
> length hint?

I should have said this more explicitly.  This is why I suggested that 
someone define and test one or more specific map.__length_hint__ 
implementations. Someone doing so should look into the C code for list 
to see how list handles iterators with a length hint.  I suspect that 
low estimates are better than high estimates.  Does list recognize any 
value as "I don't know"?
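
One possible starting point for such an experiment is sketched below. 
HintedMap is a hypothetical pure-Python stand-in, not the built-in map 
and not a proposed implementation. It assumes the hint should be the 
smallest hint among the inputs, and it relies on the PEP 424 rule that 
__length_hint__ may return NotImplemented to mean "no estimate", in which 
case operator.length_hint falls back to its default:

```python
from operator import length_hint


class HintedMap:
    """Hypothetical map-like iterator that forwards length hints."""

    def __init__(self, func, *iterables):
        if not iterables:
            raise TypeError("HintedMap() needs at least one iterable")
        self._func = func
        self._iters = [iter(it) for it in iterables]

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the shortest input ends the whole map.
        args = [next(it) for it in self._iters]
        return self._func(*args)

    def __length_hint__(self):
        # -1 is used here only as a local "unknown" marker.
        hints = [length_hint(it, -1) for it in self._iters]
        if any(h < 0 for h in hints):
            return NotImplemented  # PEP 424: no estimate available
        return min(hints)


sized = HintedMap(str, [1, 2, 3])
hint_full = length_hint(sized)
next(sized)
hint_after_one = length_hint(sized)

# The shortest input bounds the result.
hint_pair = length_hint(HintedMap(lambda a, b: (a, b), [1, 2, 3], "ab"))

# One input without a hint makes the whole hint unknown.
hint_unknown = length_hint(HintedMap(str, (x for x in range(5))), 0)

print(hint_full, hint_after_one, hint_pair, hint_unknown)
```

Inputs that supply their own inaccurate hint are not handled here; 
deciding what to do with them is the convention question raised above.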

>> What makes the map class special among all built-in iterator classes?
>> It appears not to be a property of the class itself, as an iterator
>> class, but of its name.  In Python 2, 'map' was bound to a different
>> implementation of the map idea, a function that produced a list, which
>> has a length.  I suspect that if Python 3 were the original Python, we
>> would not have this discussion.
> 
> No, in fairness, I too have often wanted to know the length of an
> arbitrary iterator, including map(), without consuming it. In general
> this is an unsolvable problem, but sometimes it is (or at least, at first
> glance *seems*) solvable. map() is one of those cases.
> 
> If we could solve it, that would be great -- but I'm not convinced that
> it is solvable, since the solution seems worse than the problem it aims
> to solve. But I live in hope that somebody cleverer than me can point
> out the flaws in my argument.

The current situation with length_hint reminds me a bit of the situation 
with annotations before the addition of typing.  Perhaps it is time to 
think about conventions for the non-obvious 'other cases'.

>> Perhaps 2.7, in addition to future imports of text as unicode and print
>> as a function, should have had one to make map and filter be the 3.x
>> iterators.
> 
> I think that's future_builtins:
> 
> [steve at ando ~]$ python2.7 -c "from future_builtins import *; print map(len, [])"
> <itertools.imap object at 0xb7ed39ec>

Thanks for the info.

> But that wouldn't have helped E. Madison Bray or SageMath, since their
> difficulty is not their own internal use of map(), but their users' use
> of map().

In particular, by people who are not vividly aware that we broke the 
back-compatibility rule by rebinding 'map' and 'filter' in 3.0.

Breaking back-compatibility *again* by redefining len (to mean something 
like operator.length_hint) is not the right solution to problems caused 
by the 3.0 break.

> Unless they simply ban any use of iterators at all, which I imagine will
> be a backwards-incompatible change (and for that matter an excessive
> overreaction for many uses), SageMath can't prevent users from providing
> map() objects or other iterator arguments.

I think their special case problem requires some special case solutions. 
At this point, I am refraining from making suggestions.

-- 
Terry Jan Reedy


