[Python-ideas] __len__() for map()
Terry Reedy
tjreedy at udel.edu
Wed Nov 28 14:53:50 EST 2018
On 11/28/2018 9:27 AM, E. Madison Bray wrote:
> On Mon, Nov 26, 2018 at 10:35 PM Kale Kundert <kale at thekunderts.net> wrote:
>>
>> I just ran into the following behavior, and found it surprising:
>>
>> >>> len(map(float, [1,2,3]))
>> TypeError: object of type 'map' has no len()
>>
>> I understand that map() could be given an infinite sequence and therefore might not always have a length. But in this case, it seems like map() should've known that its length was 3. I also understand that I can just call list() on the whole thing and get a list, but the nice thing about map() is that it doesn't copy data, so it's unfortunate to lose that advantage for no particular reason.
>>
>> My proposal is to delegate map.__len__() to the underlying iterable.
One of the guidelines in the Zen of Python is
"Special cases aren't special enough to break the rules."
This proposal claims that the Python 3 built-in iterator class 'map' is
so special that it should break the rule that iterators in general
cannot and therefore do not have .__len__ methods because their size may
be infinite, unknowable until exhaustion, or declining with each
.__next__ call.
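To make the three cases concrete, here is a small sketch. The helper name has_len is made up for illustration; everything else is standard library.

```python
import itertools

def has_len(obj):
    """Return True if len(obj) works, False if it raises TypeError."""
    try:
        len(obj)
    except TypeError:
        return False
    return True

infinite = itertools.count()                  # never ends, so no finite size
unknowable = (x for x in [1, 2, 3] if x % 2)  # size known only after exhaustion
shrinking = iter([1, 2, 3])                   # remaining items drop with each next()

results = [has_len(obj) for obj in (infinite, unknowable, shrinking)]
```

None of the three supports len(), while the list they are built from does.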
For iterators, 3.4 added an optional __length_hint__ method. This makes
sense for iterators, like tuple_iterator, list_iterator, range_iterator,
and dict_keyiterator, based on a known finite collection. At the time,
map.__length_hint__ was proposed and rejected as problematic, for
obvious reasons, and insufficiently useful.
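As a rough illustration of the difference, operator.length_hint() can be used to query a hint. The sketch below assumes CPython, where list_iterator implements __length_hint__ and map does not; the variable names are only for the example.

```python
import operator

numbers = [1, 2, 3]

# list_iterator is backed by a known finite collection and reports a hint.
list_it = iter(numbers)
hint_before = operator.length_hint(list_it)
next(list_it)
# The hint shrinks as items are consumed.
hint_after = operator.length_hint(list_it)

# map defines neither __len__ nor __length_hint__ in CPython, so the
# default passed as the second argument is returned instead.
map_hint = operator.length_hint(map(float, numbers), -1)
```

Here hint_before is 3, hint_after is 2, and map_hint is the fallback -1.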
The proposal above amounts to adding an unspecified __length_hint__
misnamed as __len__. Won't happen. Instead, proponents should define
and test one or more specific implementations of __length_hint__ in map
subclass(es).
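One possible starting point for such an experiment is sketched below. The class name HintedMap and the choice of taking the minimum of the input hints are assumptions for illustration, not a proposed or existing API. Inputs that offer no hint contribute 0, which is one of the ways such a hint can mislead.

```python
import operator

class HintedMap(map):
    """Hypothetical map subclass; not part of the standard library."""

    def __new__(cls, func, *iterables):
        # Convert to iterators up front so the hints track consumption;
        # map calls iter() on these again, which returns the same objects.
        iterators = tuple(iter(it) for it in iterables)
        self = super().__new__(cls, func, *iterators)
        self._iterators = iterators
        return self

    def __length_hint__(self):
        # map stops at the shortest input, so report the smallest hint.
        # operator.length_hint() returns 0 for inputs with no hint.
        return min(operator.length_hint(it) for it in self._iterators)


m = HintedMap(float, [1, 2, 3])
first_hint = operator.length_hint(m)
next(m)
second_hint = operator.length_hint(m)
```

Testing would then mean checking how well the hint holds up for generators, infinite iterators, and functions with side effects on the inputs.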
> I mostly agree with the existing objections, though I have often found
> myself wanting this too, especially now that `map` does not simply
> return a list.
What makes the map class special among all built-in iterator classes?
It appears not to be a property of the class itself, as an iterator
class, but of its name. In Python 2, 'map' was bound to a different
implementation of the map idea, a function that produced a list, which
has a length. I suspect that if Python 3 were the original Python, we
would not have this discussion.
> As a simple counter-proposal which I believe has fewer issues, I would
> really like it if the built-in `map()` and `filter()` at least
> provided a Python-level attribute to access the underlying iterables.
This proposes to make map (and filter) special in a different way, by
adding other special (dunder) attributes. In general, built-in
callables do not attach their args to their output, for obvious reasons.
If they do, they do not expose them. If input data must be saved, the
details are implementation dependent. A C-coded callable would not
necessarily save information in the form of Python objects.
Again, it seems to me that the only thing special about these two,
versus the other iterators left in itertools, is the history of the names.
> This is necessary because if I have a function that used to take, say,
> a list as an argument, and it receives a `map` object, I now have to
> be able to deal with map()s,
If a function is documented as requiring a list, or a sequence, or a
length object, it is a user bug to pass an iterator. The only thing
special about map and filter as errors is the rebinding of the names
between Py2 and Py3, so that the same code may be good in 2.x and bad in
3.x.
Perhaps 2.7, in addition to future imports of text as unicode and print
as a function, should have had one to make map and filter be the 3.x
iterators.
Perhaps Sage needs something like
def size_map(func, *iterables):
    for it in iterables:
        if not hasattr(it, '__len__'):
            raise TypeError(f'iterable {repr(it)} has no size')
    return map(func, *iterables)
https://docs.python.org/3/library/functions.html#map says
"map(function, iterable, ...)
Return an iterator [...]"
The wording is intentional. The fact that map is a class and the
iterator an instance of the class is a CPython implementation detail.
Another implementation could use the generator function equivalent given
in the Python 2 itertools doc, or a translation thereof. I don't know
what pypy and other implementations do. The fact that CPython itertools
callables are (now) C-coded classes instead of Python-coded generator
functions, or C translations thereof (which is tricky) is for
performance and ease of maintenance.
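For illustration, a generator-function version might look like the sketch below. It is a Python 3 adaptation written for this example, not the text of the itertools doc, and the name gen_map is made up. Unlike the built-in, it does not reject a call with no iterables.

```python
def gen_map(function, *iterables):
    # zip stops at the shortest input, which matches map's behavior.
    for args in zip(*iterables):
        yield function(*args)


floats = list(gen_map(float, [1, 2, 3]))
powers = list(gen_map(pow, [2, 3], [2, 2, 9]))
```

The result is a plain generator object, so there is no class to hang extra attributes or a length on.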
--
Terry Jan Reedy