[Python-ideas] __len__() for map()
Terry Reedy
tjreedy at udel.edu
Thu Nov 29 04:25:38 EST 2018
On 11/28/2018 5:27 PM, Steven D'Aprano wrote:
> On Wed, Nov 28, 2018 at 02:53:50PM -0500, Terry Reedy wrote:
>
>> One of the guidelines in the Zen of Python is
>> "Special cases aren't special enough to break the rules."
>>
>> This proposal claims that the Python 3 built-in iterator class 'map' is
>> so special that it should break the rule that iterators in general
>> cannot and therefore do not have .__len__ methods because their size may
>> be infinite, unknowable until exhaustion, or declining with each
>> .__next__ call.
>>
>> For iterators, 3.4 added an optional __length_hint__ method. This makes
>> sense for iterators, like tuple_iterator, list_iterator, range_iterator,
>> and dict_keyiterator, based on a known finite collection. At the time,
>> map.__length_hint__ was proposed and rejected as problematic, for
>> obvious reasons, and insufficiently useful.
>
> Thanks for the background Terry, but doesn't that suggest that sometimes
> special cases ARE special enough to break the rules? *wink*
Yes, but this case is not special enough to break the rules for len
and __len__, especially when an alternative already exists.
> Unfortunately, I don't think it is obvious why map.__length_hint__ is
> problematic.
It is less obvious (there are more details to fill in) than the (exact)
length_hints for the list, tuple, range, and dict iterators. These are
*always* based on a sized collection. Map is *sometimes* based on sized
collection(s). It is the other cases that are problematic, as
illustrated by your next sentence.
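A small illustration of that distinction, using operator.length_hint from the standard library. The variable names are made up for the example, and the map result reflects current CPython, where map defines neither __len__ nor __length_hint__:

```python
from operator import length_hint

# Iterators over sized collections report an exact hint.
sources = [
    iter([1, 2, 3]),                    # list_iterator
    iter((1, 2, 3)),                    # tuple_iterator
    iter(range(3)),                     # range_iterator
    iter({"a": 1, "b": 2, "c": 3}),     # dict_keyiterator
]
exact = [length_hint(it) for it in sources]

# The hint declines with each __next__ call.
it = iter([10, 20, 30])
hints = [length_hint(it)]
next(it)
hints.append(length_hint(it))

# map offers no hint, so length_hint falls back to its default of 0.
m = map(abs, [-1, -2, -3])
map_hint = length_hint(m)
has_len = hasattr(m, "__len__")

print(exact, hints, map_hint, has_len)
```

Note that the 0 returned for the map object cannot be told apart from a genuinely empty iterator, which is part of why a zero sentinel is awkward.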
> It only needs to return the *maximum* length, or
> sentinel (zero?) to say "I don't know". It doesn't
> need to be accurate, unlike __len__ itself.
> Perhaps we should rethink the decision not to give map() and filter() a
> length hint?
I should have said this more explicitly. This is why I suggested that
someone define and test one or more specific map.__length_hint__
implementations. Someone doing so should look into the C code for list
to see how list handles iterators with a length hint. I suspect that
low estimates are better than high estimates. Does list recognize any
value as "I don't know"?
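One possible starting point for such an experiment, as a rough sketch only. HintedMap is a hypothetical wrapper, not anything in the standard library, and the convention it uses (minimum of the inputs' hints, NotImplemented when any input has no hint) is just one of several that could be tested. PEP 424 does specify that __length_hint__ may return NotImplemented to signal that no estimate is available, in which case operator.length_hint returns its default:

```python
from operator import length_hint


class HintedMap:
    """Hypothetical map wrapper that forwards length hints from its inputs."""

    def __init__(self, func, *iterables):
        # Keep references to the underlying iterators so their hints
        # can be queried while the map is being consumed.
        self._iterators = [iter(x) for x in iterables]
        self._mapped = map(func, *self._iterators)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._mapped)

    def __length_hint__(self):
        hints = []
        for it in self._iterators:
            if not hasattr(type(it), "__length_hint__"):
                # Assumed convention: give up if any input is unsized.
                return NotImplemented
            hints.append(length_hint(it))
        # map stops at the shortest input, so take the minimum.
        return min(hints)


hm = HintedMap(pow, [2, 3, 4], range(5))
before = length_hint(hm)
first = next(hm)
after = length_hint(hm)

# A generator has no hint, so the caller's default (7 here) comes back.
unknown = length_hint(HintedMap(abs, (n for n in [-1, -2])), 7)

# list() still works when the hint is NotImplemented.
as_list = list(HintedMap(abs, (n for n in [-1, -2])))

print(before, first, after, unknown, as_list)
```

This does not answer how list behaves with a hint that is too high or too low; that still requires reading the C code or measuring.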
>> What makes the map class special among all built-in iterator classes?
>> It appears not to be a property of the class itself, as an iterator
>> class, but of its name. In Python 2, 'map' was bound to a different
>> implementation of the map idea, a function that produced a list, which
>> has a length. I suspect that if Python 3 were the original Python, we
>> would not have this discussion.
>
> No, in fairness, I too have often wanted to know the length of an
> arbitrary iterator, including map(), without consuming it. In general
> this is an unsolvable problem, but sometimes it is (or at least, at first
> glance *seems*) solvable. map() is one of those cases.
>
> If we could solve it, that would be great -- but I'm not convinced that
> it is solvable, since the solution seems worse than the problem it aims
> to solve. But I live in hope that somebody cleverer than me can point
> out the flaws in my argument.
The current situation with length_hint reminds me a bit of the situation
with annotations before the addition of typing. Perhaps it is time to
think about conventions for the non-obvious 'other cases'.
>> Perhaps 2.7, in addition to future imports of text as unicode and print
>> as a function, should have had one to make map and filter be the 3.x
>> iterators.
>
> I think that's future_builtins:
>
> [steve at ando ~]$ python2.7 -c "from future_builtins import *; print map(len, [])"
> <itertools.imap object at 0xb7ed39ec>
Thanks for the info.
> But that wouldn't have helped E. Madison Bray or SageMath, since their
> difficulty is not their own internal use of map(), but their users' use
> of map().
In particular, by people who are not vividly aware that we broke the
back-compatibility rule by rebinding 'map' and 'filter' in 3.0.
Breaking back-compatibility *again* by redefining len (to mean something
like operator.length_hint) is not the right solution to problems caused by
the 3.0 break.
> Unless they simply ban any use of iterators at all, which I imagine will
> be a backwards-incompatible change (and for that matter an excessive
> overreaction for many uses), SageMath can't prevent users from providing
> map() objects or other iterator arguments.
I think their special case problem requires some special case solutions.
At this point, I am refraining from making suggestions.
--
Terry Jan Reedy