[Python-ideas] Mitigating 'self.' Method Pollution

Sat Jul 11 18:18:22 CEST 2015

On Sun, Jul 12, 2015 at 01:07:56AM +1000, Steven D'Aprano wrote:

> So what makes attributes so special that we have to break the rules that 
> apply to all other variables?
> 
>     No tags needed: builtins, globals, nonlocals, locals.
> 
>     Tags needed: attributes.
> 
> 
> This isn't a rhetorical question

The short answer is, sometimes they are special, and sometimes they 
aren't.

Implicit self is a much-requested feature, and not just in Python. 
Eiffel, for example, uses the "Current" keyword, and it is sometimes 
optional. Swift has implicit member access as well:

    var x = MyEnumeration.SomeValue
    x = .AnotherValue

https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/Expressions.html#//apple_ref/swift/grammar/implicit-member-expression

And yet in C++, which has an implicit "this", people regularly end up
prefixing their attributes with a pseudo-namespace, "m_..." (for member).

I have come to the conclusion that this is not just personal preference. 
I now believe that where you stand on this question will, to some 
degree, depend on the type of code you are writing.

If you are writing "enterprisey" heavily object oriented code, chances 
are you are dealing with lots of state, lots of methods, lots of 
classes, and you'll need all the help you can get to keep track of where 
that state lives. You will probably want explicit self for every 
attribute access, or a pseudo-self naming convention like m_ in C++, to 
help keep it straight in your head. *Understanding the code* is harder 
than writing it in the first place, so having to write a few extra 
selfs is a negligible cost with a big benefit. Attributes are special 
because they are state, and you have lots of state.

But look at Michael's example code. That's real code too, it's just not 
enterprisey, and there are many Python programmers who don't write 
enterprisey code. They're beginners, or sys admins hacking together a 
short script, or editing one that already exists. They, or at least
some of them, have only a few classes with a little bit of state.
Their methods tend to be short. Although they don't have much
state, they refer to it over and over again.

For these people, I believe, all those explicit selfs do is add visual 
noise to the code, especially if the code is the least bit mathematical 
or if there are well-known conventions that are being followed:

class Vector:
    def abs(self):
        return sqrt(x**2 + y**2 + z**2)
        # versus
        return sqrt(self.x**2 + self.y**2 + self.z**2)

We're told that programs are written firstly for the human readers, and 
only incidentally for the computer. I expect that most people with any
knowledge of vectors would have understood that the x, y, z in the
first return must refer to coordinates of the vector. What else could 
they be? The selfs are just noise. If we are writing for human readers, 
sometimes we can be too explicit. 

Or to put it another way: If we are writing text for human readers who 
will read that text written by us, sometimes we can be too explicit in 
our writing for those same readers, causing a decrease in readability, 
which is an undesirable outcome.

I don't intend to defend "Do What I Mean" attribute inference. I don't 
believe that is practical or desirable in Python. But there's a middle 
ground: Michael's suggestion, where we can explicitly declare that 
certain names are attributes of self, and thereby decrease the visual 
noise in that method.
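
Michael's proposal needs new syntax, so it can't be shown running. The
nearest thing available today, as far as I can tell, is to bind the
attributes to local names at the top of the method. What follows is only
a sketch of that workaround, not the proposal itself; the Vector
constructor and the scale() method are invented here for illustration.

```python
from math import sqrt


class Vector:
    # Hypothetical minimal class, assumed for this example only.
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def abs(self):
        # Bind the attributes to locals once, so the formula itself
        # reads without any "self." prefixes.
        x, y, z = self.x, self.y, self.z
        return sqrt(x**2 + y**2 + z**2)

    def scale(self, factor):
        # The locals are only convenient for reading. Writes still
        # have to go through self explicitly.
        x, y, z = self.x, self.y, self.z
        self.x, self.y, self.z = x * factor, y * factor, z * factor


v = Vector(3, 4, 12)
print(v.abs())  # 13.0
```

Note the limitation: assigning to x inside the method would rebind the
local name and leave self.x untouched, so this idiom only helps with
reads. An explicit declaration could, in principle, cover both.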

If you're thinking about enterprisey 50- or 100-line methods, this
solution probably sounds awful. Every time you see a variable name in 
the method, you have to scroll back to the top of the method to see 
whether it is declared as a self attribute or not. And there are too 
many potential attributes to keep track of them all in your head.

And I agree. But that sort of code is not the code that would benefit 
from this. Instead, think of small methods, say, ten or fifteen lines at 
most, few enough that you can keep the whole method in view at once. 
Think of classes with only a few attributes, but where you refer to them 
repeatedly. Maybe *you* don't feel the need to remove those in-your-face 
selfs, but can you understand why some people do?

In recent years, Python has gained some nice new features and batteries 
aimed at the advanced programmer, such as async etc. This, I believe, is 
one which may assist programmers at the less advanced end. They have no 
need for awaitables and static type checking, but they sure would like 
less visual noise in the methods.

-- 
Steve