"Private" attributes - a possible idea (maybe PEP-worthy).

Wed Jan 30 18:35:12 EST 2002

I'm just throwing this out for now to get people's ideas. The idea would
cause code breakage (RED LIGHT!!!), but probably for a very limited number
of people (those who circumvent the privacy of double-underscore "private"
attributes). If there is enough support I will write up a PEP. I know it
would break some of my own code, but that is all code where I was
specifically trying out how to circumvent the system - no "real" code I've
written would be broken.

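One of the problems with "private" attributes in Python is that, whilst a
subclass shouldn't be able to reach them without some real work, in
particular cases it can: the name is mangled using only the class name, so a
subclass with the same name as its base class gets the same mangled names.
This can lead to the base class attribute being accidentally used or rebound
when it should be hidden.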

# a.py

class Klass:

    __attr1 = 'A1'

    def __init__(self):
        self.__attr2 = 'A2'

    def __repr__(self):
        """"""
        # valid, and in my code, most common
        return self.__attr1

    def __str__(self):
        """"""
        return self.__attr2

# b.py

import a

class Klass(a.Klass):
    """"""

    __attr1 = 'B1'

    def __init__(self):
        """"""
        self.__attr2 = 'B2'

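if __name__ == '__main__':

    aClass = a.Klass()
    bClass = Klass()

    print(repr(aClass))  # A1
    print(repr(bClass))  # B1 - a.Klass.__repr__ finds b.Klass's attribute

    print(str(aClass))   # A2
    print(str(bClass))   # B2 - a.Klass.__str__ finds b.Klass's attribute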

The result is that b.Klass uses the same private names as a.Klass
(_Klass__attr1 and _Klass__attr2), so the methods inherited from a.Klass
return 'B1' and 'B2' for a b.Klass instance.
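
The collision can also be seen in a single file by inspecting the class
namespaces. This is only a minimal sketch, not the two-module layout above;
the alias Base and the method name get are illustrative and stand in for
a.Klass and its methods.

```python
# Two classes that share the name "Klass", as a.Klass and b.Klass do.
class Klass:
    __attr1 = 'A1'

    def get(self):
        # Compiled as self._Klass__attr1
        return self.__attr1


Base = Klass  # illustrative alias standing in for a.Klass


class Klass(Base):
    __attr1 = 'B1'  # also stored as _Klass__attr1


# Both class namespaces hold the same mangled name.
print('_Klass__attr1' in vars(Base))   # True
print('_Klass__attr1' in vars(Klass))  # True

# The inherited method therefore sees the subclass value.
print(Base().get())   # A1
print(Klass().get())  # B1
```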

Whilst in this particular example the problem is obvious, what if you have
an intermediate class with a different name?

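# a.py

class Klass:
    pass  # body as in the first example

# b.py

import a

class BKlass(a.Klass):
    pass  # intermediate class with a different name

# c.py

import b

class Klass(b.BKlass):
    pass  # defines its own __attr1 / __attr2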

Then the problem becomes much harder to track down - "private" attributes
are being rebound for no apparent reason.
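
A single-file sketch of that three-level layout, assuming the outer classes
each define __attr1 as in the first example. The alias AKlass and the method
show are hypothetical names used only here.

```python
class Klass:                 # stands in for a.Klass
    __attr1 = 'A1'

    def show(self):
        return self.__attr1  # compiled as self._Klass__attr1


AKlass = Klass               # hypothetical alias so the name can be reused


class BKlass(AKlass):        # stands in for b.BKlass
    pass


class Klass(BKlass):         # stands in for c.Klass
    __attr1 = 'C1'           # mangled to _Klass__attr1 again


print(BKlass().show())  # A1
print(Klass().show())   # C1 - the grandparent's "private" name was rebound
```

Nothing in the last class mentions the first one directly, which is why the
rebinding is easy to miss.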

My proposal would be that rather than forming private variable names like
_<class name>__attr (e.g. _Klass__attr) they should instead be defined as
_<class id>__attr. It is guaranteed that there is only one object with a
particular ID at any one time.

So, using the code above, if id(a.Klass) == 1 and id(b.Klass) == 2 we would
end up with 4 attribute names rather than the 2 we have currently:

a.Klass -> _1__attr1 and _1__attr2
b.Klass -> _2__attr1 and _2__attr2
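
To make the difference concrete, here is a rough simulation of the two
naming schemes. The compiler does not work this way; mangle_by_name,
mangle_by_id and make_klass are hypothetical helpers written just for this
sketch.

```python
def mangle_by_name(cls, name):
    """Approximate the current scheme: _<class name>__attr."""
    return '_%s%s' % (cls.__name__.lstrip('_'), name)


def mangle_by_id(cls, name):
    """Hypothetical proposed scheme: _<class id>__attr."""
    return '_%d%s' % (id(cls), name)


def make_klass():
    # Each call creates a distinct class object, always named "Klass".
    class Klass:
        pass
    return Klass


A = make_klass()  # stands in for a.Klass
B = make_klass()  # stands in for b.Klass

# Current scheme: both classes map __attr1 to the same name.
print(mangle_by_name(A, '__attr1') == mangle_by_name(B, '__attr1'))  # True

# Proposed scheme: the names differ while both classes are alive.
print(mangle_by_id(A, '__attr1') == mangle_by_id(B, '__attr1'))      # False
```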

Note that the ID of the *class* is always used rather than the ID of the
instance - this preserves all the current semantics (except for the "broken"
behaviour). Using the instance ID would have two drawbacks. First, you could
no longer manipulate the private attributes of one instance from another
instance of the same class (which you can currently do). Second, if a lookup
failed in the instance namespace, the private name would have to be mangled
again to look it up in the class namespace. Both are bad things.
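
As a small illustration of the cross-instance access that works today (the
class Account and its method same_balance are hypothetical, chosen only for
this sketch):

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance  # stored as _Account__balance

    def same_balance(self, other):
        # Reads the private attribute of a *different* instance; this works
        # because the mangled name depends on the class, not the instance.
        return self.__balance == other.__balance


x = Account(10)
y = Account(10)
print(x.same_balance(y))  # True

# Names built from the instance ID would differ between x and y, so
# other.__balance could not be resolved this way.
print('_%d__balance' % id(x) == '_%d__balance' % id(y))  # False
```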

Tim Delaney
Cross Avaya R&D
+61 2 9352 9079