[Python-ideas] sequence.apply(function)

Yuval Greenfield ubershmekel at gmail.com
Sun Sep 2 10:50:07 CEST 2012


On Sun, Sep 2, 2012 at 5:14 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On Sun, Sep 2, 2012 at 8:02 AM, Antoine Pitrou <solipsis at pitrou.net>
> wrote:
> > On Sun, 2 Sep 2012 00:55:39 +0300
> > Yuval Greenfield <ubershmekel at gmail.com>
> > wrote:
> >> On Sat, Sep 1, 2012 at 8:06 PM, Guido van Rossum <guido at python.org>
> wrote:
> >>
> >> > It's less Pythonic, because every sequence-like type (not just list)
> >> > would have to reimplement it.
> >> >
> >> > Similar things get proposed for iterators (e.g. it1 + it2, it[:n],
> >> > it[n:]) regularly and they are (and should be) rejected for the same
> >> > reason.
> >> >
> >> >
> >> Python causes some confusion because some things are methods and others
> >> builtins. Is there a PEP or rationale that defines what goes where?
> >
> > When something only applies to a single type or a couple of types, it is
> > a method. When it is generic enough, it is a builtin.
> > Of course there are grey areas but that's the basic idea.
>
> Yes, it comes down to the fact that we are *very* reluctant to impose
> required base classes (I believe the only ones currently enforced
> anywhere are object, BaseException and str - everything else should
> fall back to a protocol method, ABC or interface specific registration
> mechanism. Most interfaces that used to require actual integer objects
> are now using operator.index, or one of its C API equivalents).
>
> In Python, we also actively discourage "reopening" classes to add new
> methods (this is mostly a cultural thing, though - the language
> doesn't actually contain any mechanism to stop you by default,
> although it's possible to add such enforcement via metaclasses)
>
> Thus, protocols are born which define "has this behaviour", rather
> than "is one of these". That's why we have the len() builtin and
> associated __len__() protocol to say "taking the length of this object
> is a meaningful operation" rather than mandatory inheritance from a
> Container class that has a ".len()" method.
>
> They're most obviously beneficial when there are *multiple* protocols
> that can be used to implement a particular behaviour. For example,
> with iter(), the __iter__ protocol is only the first option tried. If
> that fails, then it will instead check for __getitem__ and if that
> exists, return a standard sequence iterator instead. Similarly,
> reversed() checks for __reversed__ first, and then checks for __len__
> and __getitem__, producing a reverse sequence iterator in the latter
> case.
>
> Similarly, next() was moved from a standard method to a builtin
> function in 3.x? Why? Mainly to add the "if not found, return this
> default value" behaviour. That kind of thing is much easier to add
> when the object is only handling a piece of the behaviour, with
> additional standard mechanisms around it (in this case, optionally
> returning a default value when StopIteration is thrown by the
> iterator).
>
> Generators are another good illustration of the principle: For iter()
> and next(), they follow the standard protocol and rely on the
> corresponding builtins. However, g.send() and g.throw() require deep
> integration with the interpreter's eval loop. There's currently no way
> to implement either of those behaviours as an ordinary type, thus
> they're exposed as ordinary methods, since they're genuinely generator
> specific.
>
> As to *why* this is a good thing: procedural APIs encourage low
> coupling. Yes, object oriented programming is a good way to scale an
> application architecture up to more complicated problems. The issue is
> with fetishising OOP to the point where you disallow the creation of
> procedural APIs that hide the OOP details. That approach sets a
> minimum floor to the complexity of your implementations, as even if
> you don't *need* the power of OOP, you're forced to deal with it
> because the language doesn't offer anything else, and that way lies
> Java. There's a reason Java is significantly more popular on large
> enterprise projects than it is in small teams - it takes a certain,
> rather high, level of complexity for the reasons behind any of that
> boilerplate to start to become clear :)
>
> Cheers,
> Nick.
>
>
Thanks, that's some interesting reasoning.

Maybe I'm old fashioned but I like running dir(x) to find out what an
object can do, and the wall of double underscores is hard to read.

Perhaps we could add to the inspect module a "dirprotocols" function which
returns a list of builtins that can be used on an object. I see that the
builtins are listed in e.g. help([]) but on user defined classes it might
be less obvious. Maybe we could just add a dictionary:

inspect.special_methods = {'__len__': len,
 '__getitem__': 'x.__getitem__(y) <==> x[y]',
 '__iter__': iter,
 ... }

and then dirprotocols would be easy to implement.


Yuval
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120902/fcc40544/attachment.html>


More information about the Python-ideas mailing list