Getting rid of "self."

Alex Martelli aleaxit at yahoo.com
Mon Jan 10 04:51:01 EST 2005


BJörn Lindqvist <bjourne at gmail.com> wrote:
   ...
> > http://starship.python.net/crew/mwh/hacks/selfless.py
> 
> That's excellent! There is one small problem with the code though:

It shows the fundamentals of how to rewrite the bytecode, yes.

> .class Hi(Selfless):
> .    __attrs__ = ["x"]
> .    def __init__(x):
> .        self.x = x
> 
> In this case, I think the Python interpreter should realise that the
> parameter x shadows the attribute x. But the selfless code has
> problems with that. I want it to work exactly like how the situation
> is handled in Java and C++.

I believe you're referring to the test in rewrite_method:

if op.arg < code.co_argcount:
    raise ValueError, "parameter also instance member!"

If you think that parameters that are also instance members should
"shadow" instance members, just skip the op.arg cases which are less
than code.co_argcount -- those are the parameters.
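
As a rough illustration of that distinction (this is not the selfless.py code itself), the parameters of a function are the first co_argcount entries of its code object's co_varnames. The helper name split_attrs is hypothetical, and the sketch assumes Python 3, where the code object is reached through __code__:

```python
def split_attrs(func, attrs):
    """Split attrs into names shadowed by a parameter and names left to rewrite."""
    code = func.__code__
    # The first co_argcount entries of co_varnames are the positional parameters.
    params = set(code.co_varnames[:code.co_argcount])
    shadowed = sorted(name for name in attrs if name in params)
    rewritable = sorted(name for name in attrs if name not in params)
    return shadowed, rewritable


def method(x, y):
    # Never called here; z is not a parameter, so a rewrite would turn it into self.z.
    return x + y + z


print(split_attrs(method, ["x", "z"]))
```

Under the "shadowing" reading, only the second list would be rewritten; the first would simply be skipped instead of raising ValueError.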

> > Alex Martelli:
> > A decorator can entirely rewrite the bytecode (and more) of the method
> > it's munging, so it can do essentially anything that is doable on the
> > basis of information available at the time the decorator executes.
> 
> Which I believe means that the instance variables have to be declared
> in the class? I am content with declaring them like the selfless
> approach does:

It means the information about which names are names of instance
attributes must be available somewhere, be that "declared", "inferred",
or whatever.  For example, many C++ shops have an ironclad rule that
instance attributes, and ONLY instance attributes, are always and
invariably named m_<something>.  If that's the rule you want to enforce,
then you don't necessarily need other declarations or inferences, but
rather can choose to infer the status of a name from looking at the name
itself, if you wish.  "Declarations", or yet other complications such as:

> alternative would be not to declare the variables in an __attrs__ list,
> and instead let them be "declared" by having them initialised in the
> __init__. I.e:
> 
> .def __init__(hi, foo):
> .    self.hi = hi
> .    self.foo = foo
> 
> When the metaclass then does its magic, it would go through the code of
> the __init__ method, see the assignments to "self.hi" and "self.foo",
> decide that "hi" and "foo" are attributes of the object and replace
> "hi" and "foo" in all other methods with "self.hi" and "self.foo". The

OK, but this approach is not compatible with your stated desire, which I
re-quote...:

> I want it to work exactly like how the situation
> is handled in Java and C++.

...because for example it does not deal with any attributes which may be
initialized in a *superclass*'s __init__.  However, I guess it could be
extended at the cost of some further lack of transparency, to obtain
just as horrid a mess as you require, where it's impossible for any
human reader to guess whether, say,
    hi = 33
is setting a local variable, or an instance variable, without chasing
down and studying the sources of an unbounded number of superclasses.

I do not think there is any _good_ solution (which is why many C++ shops
have that rule about spelling this m_hi if it's an instance variable,
keeping the spelling 'hi' for non-instance variables -- an attempt to
get SOME human readability back; a smaller but non-null number of such
shops even achieve the same purpose by mandating the use of 'this->hi'
-- just the Python rule you want to work around, essentially).  The
least bad might be to rely on __attrs__, enriching whatever is in the
current class's __attrs__ with any __attrs__ that may be found in base
classes PLUS any member variables specifically set in __init__ -- if you
focus on convenience in writing the code, to the detriment of ability to
read and understand it; or else, for more readability, demand that
__attrs__ list everything (explicitly including attributes coming from
superclasses and ones set in any method by explicit "self.whatever = ...")
and diagnose the problem, with at least a warning, if it doesn't.

Yes, there's redundancy in the second choice, but that's what
declarations are all about: if you want to introduce the equivalent of
declarations, don't be surprised if redundancy comes with them.
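
Both choices can be sketched in a few lines, assuming Python 3 and assuming each class keeps its declarations in a class-level __attrs__ list as selfless does. The helper names collect_attrs and check_declared are made up for illustration:

```python
import warnings


def collect_attrs(cls):
    """First choice: union of every __attrs__ declared by cls and its bases."""
    names = set()
    for klass in cls.__mro__:
        # vars(klass) looks only at what klass itself defines, not what it inherits.
        names.update(vars(klass).get("__attrs__", ()))
    return names


def check_declared(cls, used):
    """Second choice: warn about used names missing from cls's own __attrs__."""
    declared = set(vars(cls).get("__attrs__", ()))
    missing = sorted(set(used) - declared)
    if missing:
        warnings.warn("%s.__attrs__ does not list: %s"
                      % (cls.__name__, ", ".join(missing)))
    return missing


class Base:
    __attrs__ = ["x"]


class Derived(Base):
    __attrs__ = ["y"]


print(sorted(collect_attrs(Derived)))
print(check_declared(Derived, ["x", "y"]))
```

The first helper favours the writer (Derived silently picks up x); the second favours the reader (Derived is told to list x itself).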

> downside is that it probably could never be foolproof against code
> like this:
> 
> .def __init__(hi, foo):
> .    if hi:
> .        self.hi = hi
> .    else:
> .        self.foo = foo
> 
> But AFAIK, that example is a corner case and you shouldn't write such
> code anyway. :)

I don't see any problem with this code.  A static analysis will show
that both hi and foo are local variables.  Either may be left
uninitialized, of course, but having to deal with variables which are not
initialized IS a common problem of C++: you said you want to do things
like in C++, so you should be happy to have this problem, too.
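
A static scan never evaluates the condition, so it sees the assignments in both branches. Here is a minimal sketch that works on source text with Python 3's ast module (selfless works on bytecode instead, and attrs_set_in is a hypothetical helper):

```python
import ast


def attrs_set_in(source, owner="self"):
    """Names X for which the source contains an assignment to owner.X."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Attribute)
                and isinstance(node.ctx, ast.Store)
                and isinstance(node.value, ast.Name)
                and node.value.id == owner):
            found.add(node.attr)
    return found


source = """
def __init__(hi, foo):
    if hi:
        self.hi = hi
    else:
        self.foo = foo
"""

print(sorted(attrs_set_in(source)))
```

The scan reports both names; whether each one is actually bound at run time is a separate question that no static pass answers.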


> > Alex Martelli:
> > You do, however, need to nail down the specs.  What your 'magic' does
> > is roughly the equivalent of a "from ... import *" (except it
> > ...
> > Then, you must decide whether this applies to all names the method
> > accesses (which aren't already local).  For example, if the method
> > has a statement such as: 
> >   x = len(y)
> 
> All names should be checked like this:
> 1. Is the name in the parameter list? If so, do not rebind it.
> 2. Is the name in the object's attribute list? If so, prepend "self."
> 3. Do stuff like normal.

That's basically what the already-mentioned "selfless" does, then, with
the small change to consider name conflicts (parameter vs instance
attribute) to be OK rather than errors, as mentioned above; and possibly
larger changes to determine the attribute names, depending on what
strategy you want to pursue for that.
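
To make the three-step rule concrete, here is a source-level stand-in, assuming Python 3's ast module. It is not what selfless does (that rewrites bytecode), the names AddSelf and compile_selfless are invented for the sketch, and it handles only plain positional parameters of top-level functions:

```python
import ast


class AddSelf(ast.NodeTransformer):
    """Prepend "self." to declared attribute names that no parameter shadows."""

    def __init__(self, attrs):
        self.attrs = set(attrs)
        self.params = set()

    def visit_FunctionDef(self, node):
        # Step 1: remember the parameters so they shadow same-named attributes.
        self.params = {a.arg for a in node.args.args}
        node.args.args.insert(0, ast.arg(arg="self"))
        self.generic_visit(node)
        return node

    def visit_Name(self, node):
        # Step 2: an attribute name that is not a parameter becomes self.<name>.
        if node.id in self.attrs and node.id not in self.params:
            target = ast.Attribute(value=ast.Name(id="self", ctx=ast.Load()),
                                   attr=node.id, ctx=node.ctx)
            return ast.copy_location(target, node)
        # Step 3: everything else is left alone.
        return node


def compile_selfless(source, attrs):
    """Rewrite the functions in source and return the namespace defining them."""
    tree = AddSelf(attrs).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {}
    exec(compile(tree, "<selfless>", "exec"), namespace)
    return namespace


methods = compile_selfless(
    "def set_x(x):\n    self.x = x\n"
    "def scaled(factor):\n    return x * factor\n",
    ["x"],
)


class Box:
    pass


box = Box()
methods["set_x"](box, 4)          # parameter x shadows the attribute x
print(methods["scaled"](box, 10))  # bare x is read as self.x
```

In set_x the parameter wins, so only the explicit self.x is an attribute access; in scaled there is no parameter named x, so the bare name is rewritten.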


> > Alex Martelli:
> > If you can give totally complete specifications, I can tell you
> > whether your specs are doable (by a decorator, or other means), how,
> > and at what cost.  Without knowing your specs, I can't tell; I can
> > _guess_ that the answer is "probably doable" (as long as you're not
> > demanding the code in the decorator to be an oracle for the future,
> 
> This is promising, I'm content with whatever slowdowns necessary as
> long as I can prove those who say "you can't do it" wrong. :) It seems

Nobody (that I saw) said you can't do it, as long as you're willing to
pay the price in terms of some mixture of extra stuff to write
(__attrs__ or whatever), more difficulty in human reading (not being
able to tell locally what's a local variable and what isn't), new and
interesting kinds of errors such as "uninitialized member variables",
and time to execute the class statement or its methods, depending on the
exact mix and strategies you choose.

> to me that it should be doable by having the metaclass that modifies
> the class go through the class and bytecode-rewrite all its methods.
> So there has to be a big slowdown when the class is created, but after
> that, it should execute at pure Python speed? That doesn't seem too
> hard, and pretty robust too since bytecode doesn't change so often.

Starting with "selfless" and the few extra pointers I've given, I agree
it should not be too hard.  Not sure what you mean by bytecode not
changing often: the bytecode rewrite will happen every time you execute
the 'class' statement, and only then.
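
A small sketch of that timing, assuming Python 3 metaclass syntax (selfless itself predates it). The metaclass name Rewriting and the rewritten list are hypothetical; the append stands in for the actual per-method rewrite:

```python
import types

rewritten = []  # records each (class name, method name) pair that gets processed


class Rewriting(type):
    """Hypothetical metaclass: a real one would rewrite each method here."""

    def __new__(mcls, name, bases, namespace):
        for key, value in namespace.items():
            if isinstance(value, types.FunctionType):
                rewritten.append((name, key))  # placeholder for the rewrite
        return super().__new__(mcls, name, bases, namespace)


class Hi(metaclass=Rewriting):
    def greet(self):
        return "hi"


obj = Hi()
obj.greet()
obj.greet()
print(rewritten)
```

The list gains one entry when the class statement runs and nothing more on instantiation or method calls, which is where the "pay once per class statement" cost comes from.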

> And THEN I'll rewrite python-mode so that it syntax highlights member
> attributes! It will be cool.

This part might well be pretty hard, given the possibility of
inheritance from classes that might be anywhere at all -- sys.path can
change dynamically, so to know exactly what classes from other modules
your class is inheriting from (and that is crucial to determine which
names are those of instance attributes) basically requires executing all
the program up to the 'class' statement.  If you're keen on this part I
suggest you use one of the approaches that also facilitate human
reading: for exactly the same reasons they'll facilitate the
highlighting.  Either use something like m_<blah> as the name for
instance attributes, or have all instance attributes listed in
__attrs__, considering it an error (worth at least a warning... and a
lack of highlighting!) to use other instance attributes (from
superclasses or whatever) that _aren't_ listed in __attrs__.
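
With a naming rule the highlighter needs nothing but the token stream. A minimal sketch, assuming the m_ prefix convention and Python 3's tokenize module; attribute_spans is a hypothetical helper, not part of python-mode:

```python
import io
import tokenize


def attribute_spans(source, prefix="m_"):
    """(row, start_col, end_col) for every name token that carries the prefix."""
    spans = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string.startswith(prefix):
            spans.append((tok.start[0], tok.start[1], tok.end[1]))
    return spans


source = "def area(scale):\n    return m_width * m_height * scale\n"
print(attribute_spans(source))
```

No import has to run and no superclass has to be located, which is exactly why the same convention helps a human reader.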


Alex


