[Python-ideas] Method signature syntactic sugar (especially for dunder methods)

Steven D'Aprano steve at pearwood.info
Sun Nov 6 20:27:25 EST 2016


On Sun, Nov 06, 2016 at 01:28:34AM -0500, Nathan Dunn wrote:

> Python has very intuitive and clear syntax, except when it comes to method
> definitions, particularly dunder methods.

I disagree with your premise here. Python's method definitions are just 
as intuitive and clear as the rest of Python's syntax: methods are just 
functions, indented in the body of the class where they belong, with an 
explicit "self" parameter.

And dunder methods are just a naming convention. They're not the most 
visually attractive methods, due to the underscores, but it's just a 
naming convention. Otherwise they are declared in exactly the same way 
as any other method: using normal function syntax, indented inside the 
body of the class, with an explicit "self" the same as other methods.

So there's no magic to learn. Once you know how to declare a function, 
it is a tiny step to learn to declare a method: put it inside a class, 
indent it, and add "self", and now you have a method. And once you know 
how to declare a method, there's nothing more to learn to handle dunder 
methods. All you need to know is the name of the method or methods you 
need, including the underscores.
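To make that concrete, here is a minimal sketch. The class Vec and the 
method name "scaled" are made up for illustration; the point is only 
that the dunder methods and the regular method are declared with the 
same syntax.

```python
# Hypothetical example class; Vec and scaled are illustrative names only.
class Vec:
    # A dunder method: ordinary function syntax, explicit self.
    def __init__(self, x, y):
        self.x, self.y = x, y

    # A regular method: declared in exactly the same way.
    def scaled(self, factor):
        return Vec(self.x * factor, self.y * factor)

    # Another dunder method; only the name is special.
    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)


v = Vec(1, 2) + Vec(3, 4).scaled(2)
result = (v.x, v.y)
```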


[...]
> Having to declare a self parameter is confusing since you don't pass
> anything in when you call the method on an instance (I am aware of bound
> vs. unbound methods, etc. but a beginner would not be).

You are mistaking "mysterious" for "confusing".

"Why do I have to explicitly declare a self parameter?" is a mystery, 
and the answer can be given as:

- you just do
- because internally methods are just functions
- because it is actually useful (e.g. for unbound methods)

depending on the experience of the person asking. But it's not 
*confusing*. "Sometimes I have to explicitly declare self, and sometimes 
I don't, and there doesn't seem to be any pattern to which it is" would 
be confusing. "Always explicitly declare self" is not.
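A small sketch of why the explicit parameter is useful, assuming a 
made-up Greeter class: the same function can be called through an 
instance (Python fills in self) or through the class (you pass the 
instance yourself).

```python
# Hypothetical class used only to illustrate the two calling styles.
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return f"{greeting}, {self.name}!"


g = Greeter("world")

# Calling through the instance: Python supplies self for us.
bound_result = g.greet("Hello")

# Calling through the class: the same function, with self passed explicitly.
plain_result = Greeter.greet(g, "Hello")

# Because Greeter.greet is just a function, it can be mapped over
# several instances at once.
greetings = list(map(Greeter.greet, [Greeter("a"), Greeter("b")], ["Hi", "Yo"]))
```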


> The double underscores are also confusing.

I've certainly seen a few cases of people who misread __init__ as _init_ 
and were surprised by their code not working, in over a decade of 
dealing with beginners' questions on comp.lang.python and the tutor 
mailing list. So it is an easy mistake to make, but apparently a *rare* 
mistake to make, and very easy to correct.

So I disagree that double underscores are "confusing". What is confusing 
about the instructions "press underscore twice at the beginning and end 
of the method name"?
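For reference, this is roughly what the single-underscore mistake looks 
like in practice. The class names Right and Wrong are invented for the 
sketch, and it assumes Python 3 behaviour.

```python
class Right:
    def __init__(self, x):
        self.x = x


class Wrong:
    # Single underscores: an ordinary method that Python never calls
    # automatically when an instance is created.
    def _init_(self, x):
        self.x = x


right = Right(1)

try:
    Wrong(1)  # falls through to the default initialiser, which takes no arguments
except TypeError as err:
    error = err
else:
    error = None
```

The failure is immediate and the fix is to add the missing underscores.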


> I propose syntactic sugar to make these method signatures more intuitive
> and clean.
> 
> class Vec(object):
>    def class(x, y):
>        self.x, self.y = x, y

I don't think that there is anything intuitive about changing the name 
of the method from __init__ to "class". What makes you think that people 
will intuit the word "class" to create an instance? That seems like a 
dubious idea to me.

And it certainly isn't *clean*. At the moment, Python's rules are nicely 
clean: keywords can never be used as identifiers. You would either break 
that rule, or have some sort of magic where *some* keywords can 
*sometimes* be used as identifiers, but not always. That's the very 
opposite of clean -- it is a nasty, yucky design, and it doesn't scale 
to other protocols:

    def with:  # is this __enter__ or __exit__?
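As a reminder of why a single "def with" could not cover the protocol, 
here is a minimal context manager sketch. Tracker is a made-up class; 
it only records the order in which the two dunder methods run.

```python
# Hypothetical class that records when each half of the protocol runs.
class Tracker:
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not suppress exceptions


with Tracker() as t:
    t.events.append("body")

events = t.events
```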

It doesn't even work for instance construction! Is class(...) the 
__new__ or __init__ method?
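A short sketch of the two separate steps involved, using an invented 
Point class that logs each call:

```python
# Hypothetical class that logs which construction hooks run, and in what order.
class Point:
    calls = []

    def __new__(cls, x, y):
        # Step 1: create the (still uninitialised) instance.
        cls.calls.append("__new__")
        return super().__new__(cls)

    def __init__(self, x, y):
        # Step 2: initialise the instance that __new__ returned.
        self.calls.append("__init__")
        self.x, self.y = x, y


p = Point(1, 2)
```

Both methods take part in Point(1, 2), so a single "class" method would 
have to pick one of them or somehow merge the two.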

Not all beginners to Python are beginners to programming. Other 
languages typically use one of three naming conventions for the 
constructor:

- a method with the same name as the class itself

  e.g. Java, C#, PHP 4, C++, ActionScript.

- special predefined method names

  e.g. "New" in Visual Basic, "alloc" and "init" in Objective-C, 
  "initialize" in Ruby, "__construct" in PHP 5.

- a keyword used before an otherwise normal method definition

  e.g. "constructor" in Object Pascal, "initializer" in OCaml, 
  "create" in Eiffel, "new" in F#.


So there's lots of variation in how constructors are written, and what 
seems "intuitive" will probably depend on the reader's background. Total 
beginners to OOP don't have any pre-conceived expectations, because the 
very concept of initialising an instance is new to them. Whether it is 
spelled "New" or "__init__" or "mzygplwts" is just a matter of how hard 
it is to spell correctly and memorise.


>    def self + other:
>        return Vec(self.x + other.x, self.y + other.y)

My guess is that this is impossible in an LL(1) parser, but even if 
possible, how do you write the reversed __radd__ method? My guess is 
that you would need:

    def other + self:

but for that to work, "self" now needs to be a keyword rather than just 
a regular identifier which is special only because it is the first in 
the parameter list. And that's a problem because there are cases 
(rare, but they do happen) where we don't want to use "self" for the 
instance parameter.
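One such case, sketched with invented names (Outer, Helper, "this"): a 
class defined inside a method, where the inner method needs to refer to 
the outer instance as well as its own.

```python
# Hypothetical classes; the inner method deliberately avoids the name "self".
class Outer:
    def __init__(self, tag):
        self.tag = tag

    def make_helper(self):
        # Here "self" is the Outer instance, so the inner class uses a
        # different name ("this") for its own instance parameter.
        class Helper:
            def describe(this):
                return f"helper of {self.tag}"

        return Helper()


description = Outer("A").make_helper().describe()
```

This only works because the first parameter is an ordinary name rather 
than a keyword.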

A very common pattern in writing classes is:

   def __add__(self, other):
       # implementation goes here

   __radd__ = __add__


since addition is usually commutative. How would your new syntax handle 
that?
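A runnable version of that pattern, as a sketch only. The Metres class 
is made up, and it assumes the other operand is either a Metres or a 
plain number.

```python
# Hypothetical numeric wrapper used to illustrate __radd__ = __add__.
class Metres:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Accept either another Metres or a plain number.
        other_value = other.value if isinstance(other, Metres) else other
        return Metres(self.value + other_value)

    # Addition is commutative here, so the reflected method is the
    # very same function under a second name.
    __radd__ = __add__


forward = (Metres(3) + 2).value      # uses __add__
reflected = (2 + Metres(3)).value    # int gives up, so __radd__ is used
total = sum([Metres(1), Metres(2)]).value  # sum() starts from 0 + Metres(1)
```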



-- 
Steve

