[Python-ideas] Suggested MapView object (Re: len() for map())

Tue Dec 11 06:48:10 EST 2018

On Tue, Dec 11, 2018 at 12:13 PM Paul Moore <p.f.moore at gmail.com> wrote:
>
> On Tue, 11 Dec 2018 at 10:38, E. Madison Bray <erik.m.bray at gmail.com> wrote:
> > I don't understand why this is confusing.
> [...]
> > For something like a fixed sequence a "map" could just as easily be
> > defined as a pair (<function>, <sequence>) that applies <function>,
> > which I'm claiming is a pure function, to every element returned by
> > the <sequence>.  This transformation can be applied lazily on a
> > per-element basis whether I'm iterating over it, or performing random
> > access (since <sequence> is known for all N).
>
> What's confusing to *me*, at least, is what's actually being suggested
> here. There's a lot of theoretical discussion, but I've lost track of
> how it's grounded in reality:

It's true, this has been a wide-ranging discussion and it's confusing.
Right now I'm specifically responding to the sub-thread that Greg
started "Suggested MapView object", so I'm considering this a mostly
clean slate from the previous thread "__len__() for map()".  Different
ideas have been tossed around and the discussion has me thinking about
broader possibilities.  I responded to this thread because I liked
Greg's proposal and the direction he's suggesting.

I think that the motivation underlying much of this discussion, forth
both the OP who started the original thread, as well as myself, and
others is that before Python 3 changed the implementation of map()
there were certain assumptions one could make about map() called on a
list* which, under normal circumstances were quite reasonable and sane
(e.g. len(map(func, lst)) == len(lst), or map(func, lst)[N] ==
func(lst[N])).

Python 3 broke all of these assumptions, for reasons that I personally
have no disagreement with, in terms of motivation.

However, in retrospect, it might have been nice if more consideration
were given to backwards compatibility for some "obvious" simple cases.
This isn't a Python 2 vs Python 3 whine though: I'm just trying to
think about how I might expect map() to work on different types of
arguments, and I see no problem--so long as it's properly
documented--with making its behavior somewhat polymorphic on the types
of arguments.

The idea would be to now enhance the existing built-ins to restore at
least some previously lost assumptions, at least in the relevant
cases.  To give an analogy, Python 3.0 replaced range() with
(effectively) xrange().  This broken a lot of assumptions that the
object returned by range(N) would work much like a list, and Python
3.2 restored some of that list-like functionality by adding support
for slicing and negative indexing on range(N).  I believe it's worth
considering such enhancements for filter() and map() as well, though
these are obviously a bit trickier.

* or other fixed-length sequence, but let's just use list as a
shorthand, and assume for the sake of simplicity a single list as
well.

> 1. If we're saying that "it would be nice if there were a function
> that acted like map but kept references to its arguments", that's easy
> to do as a module on PyPI. Go for it - no-one will have any problem
> with that.

Sure, though since this is about the behavior of global built-ins that
are commonly used by users at all experience levels the problem is a
bit hairier.  Anybody can implement anything they want and put it in a
third-party module. That doesn't mean anyone will use it.  I still
have to write code that handles map objects.

In retrospect I think Guido might have had the right idea of wanting
to move map() and filter() into functools along with reduce().
There's a surprisingly lot more at stake in terms of backwards
compatibility and least-astonishment when it comes to built-ins.  I
think that's in part why the new Python 3 definitions of map() and
filter() were kept so simple: although they were not backwards
compatible I do think they were well designed to minimize
astonishment.  That's why I don't necessarily disagree with the
choices made (but still would like to think about how we can make
enhancements going forward).

> 2. If we're saying "the builtin map needs to behave like that", then
>   2a. *Why*? What is so special about this situation that the builtin
> has to be changed?

Same question could apply to last time it was changed.  I think now
we're trying to find some middle-ground.

>   2b. Compatibility questions need to be addressed. Is this important
> enough to code that "needs" it that such code is OK with being Python
> 3.8+ only? If not, why aren't the workarounds needed for Python 3.7
> good enough? (Long term improvement and simplification of the code
> *is* a sufficient reason here, it's just something that should be
> explicit, as it means that the benefits are long-term rather than
> immediate).

That's a good point: I think the same arguments as for enhancing
range() apply here, but this is worth further consideration (though
having a more concrete proposal in the first place should come first).

>   2c. Weird corner case questions, while still being rare, *do* need
> to be addressed - once a certain behaviour is in the stdlib, changing
> it is a major pain, so we have a responsibility to get even the corner
> cases right.

It depends on what you mean by getting them "right".  It's definitely
worth going over as one can think of.  Not all corner cases have a
satisfying resolution (and may be highly context-dependent).  In those
cases getting it "right" is probably no more than documenting that
corner case and perhaps warning against it.

>   2d. It's not actually clear to me how critical that need actually
> is. Nice to have, sure (you only need a couple of people who would use
> a feature for it to be "nice to have") but beyond that I haven't seen
> a huge number of people offering examples of code that would benefit
> (you mentioned Sage, but that example rapidly degenerated into debates
> about Sage's design, and while that's a very good reason for not
> wanting to continue using that as a use case, it does leave us with
> few actual use cases, and none that I'm aware of that are in
> production code...)

That's a fair point worthy of further consideration.  To me, at least,
map on a list working as an augmented list is obvious, clear, useful,
at solves most of the use-cases where having map.__len__ might be
desirable, among others.

> 3. If we're saying something else (your comment "map could just as
> easily be defined as..." suggests that you might be) then I'm not
> clear what it is. Can you describe your proposal as pseudo-code, or a
> Python implementation of the "map" replacement you're proposing?

Again, I'm mostly responding to Greg's proposal which I like.  To
extend it, I'm suggesting that a call to map() where all the arguments
are sequences** might return something like his MapView.  If even that
idea is crazy or impractical though, I can accept that.  But I think
it's quite analogous to how map on arbitrary iterables went from
immediate evaluation to lazy evaluation while iterating: in the same
way map on some sequence(s) can be evaluated lazily on random access.

** I have a separate complaint that there's no great way, at the
Python level, to define a class that is explicitly a "sequence" as
opposed to a more general "mapping", but that's a topic for another
thread...

[Python-ideas] Suggested MapView object (Re: __len__() for map())

[Python-ideas] Suggested MapView object (Re: len() for map())