Why is there no natural syntax for accessing attributes with names not being valid identifiers?

Steven D'Aprano steve at pearwood.info
Thu Dec 5 02:56:18 EST 2013


On Wed, 04 Dec 2013 12:35:14 -0800, Piotr Dobrogost wrote:


> Right. If there's already a way to have attributes with these
> "non-standard" names (which is a good thing)

No, it is not a good thing. It is a bad thing, and it is purely an
accident of implementation that it works at all.

Python does not support names (variable names, method names, attribute 
names, module names etc.) which are not valid identifiers except by 
accident. The right way to handle non-identifier names is to use keys in 
a dictionary, which works for any legal string.
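
To make the difference concrete, here is a minimal sketch. The class name
Record and the variable names are made up for illustration; the behaviour
of setattr and getattr shown is what CPython happens to do, not something
the language promises.

```python
class Record(object):   # hypothetical class, for illustration only
    pass

obj = Record()

# Works in CPython, but only as an accident of implementation.
setattr(obj, 'foo.bar', 1)
value = getattr(obj, 'foo.bar')   # no dotted syntax can reach this name

# The supported way to handle arbitrary string names: keys in a dict.
data = {}
data['foo.bar'] = 1
data['#$^%\n-\'."'] = 2           # any legal string works as a key
```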

As you correctly say in another post:

"attribute is quite a different beast then key in a dictionary"

Attributes are intended to be variables, not arbitrary keys. In some 
languages, they are even called "instance variables". As they are 
variables, they should be legal identifiers:

spam = 42  # legal identifier name
spam\n-ham\n = 42  # illegal identifier name


Sticking a dot in front of the name doesn't make it any different. 
Variables, and attributes, should be legal identifiers. If I remember 
correctly (and I may not), this issue has been raised with the Python-Dev 
core developers, including Guido, and their decision was:

- allowing non-identifier attribute names is an accident of 
implementation; 

- Python implementations are allowed to optimize __dict__ to prohibit
names that are not valid identifiers;

- but it's probably not worth doing in CPython.

getattr already enforces that the attribute name is a string rather than 
any arbitrary object.
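
A quick sketch of that check, again using a made-up Record class. I only
rely on the exception type here, since the exact wording of the message
differs between Python versions:

```python
class Record(object):   # hypothetical class, for illustration only
    pass

obj = Record()

try:
    getattr(obj, 42)             # attribute name is not a string
except TypeError as err:
    message = str(err)           # wording varies between versions
else:
    message = None               # not expected to happen
```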

You've also raised the issue of linking attribute names to descriptors. 
Descriptors are certainly a good reason to use attributes, but they are 
not a good reason for allowing non-identifier names. Instead of writing:

obj.'#$^%\n-\'."'

just use a legal identifier name! The above is an extreme example, but 
the principle applies to less extreme examples. It might be slightly 
annoying to write obj.foo_bar when you actually want obj.'foo.bar' or 
obj.'foo\nbar' or some other variation, but frankly, that's just too bad 
for you.

As far as descriptors go, you can implement descriptor-like functionality 
by overriding __getitem__. Here's a basic example:

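class MyDict(dict):
    def __getitem__(self, key):
        obj = super(MyDict, self).__getitem__(key)
        # If the stored value is a descriptor, call its __get__ with the
        # dict standing in for the instance and its class for the owner.
        if hasattr(obj, '__get__'):
            obj = obj.__get__(self, type(self))
        return obj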


which ought to be close to (but not identical to) the semantics of 
attribute descriptors.
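
Here is a rough usage sketch of the idea. DescriptorDict is a made-up
name and the keys are invented for illustration; it is a self-contained
variation on the class above that returns the looked-up value and passes
the owner type to __get__.

```python
class DescriptorDict(dict):
    """Hypothetical dict that invokes descriptors stored as values."""
    def __getitem__(self, key):
        obj = super(DescriptorDict, self).__getitem__(key)
        if hasattr(obj, '__get__'):
            # Treat the dict as the "instance" and its class as the owner.
            obj = obj.__get__(self, type(self))
        return obj

d = DescriptorDict()
d['width'] = 3
d['height'] = 4
# A property stored under a key that is not a legal identifier.
d['area (w*h)'] = property(lambda self: self['width'] * self['height'])

result = d['area (w*h)']       # computed through the property
raw = d.get('area (w*h)')      # dict.get bypasses __getitem__
```

Two of the differences from real attribute descriptors: dict methods such
as get() do not go through __getitem__, so they hand back the raw property
object, and ordinary attribute lookup only invokes descriptors found on
the class, not ones stored on the instance.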

While I can see that there is some benefit to allowing non-identifier 
attributes, I believe such benefit is very small, and not enough to 
justify the cost of allowing non-identifier attributes. If I wanted to 
program in a language where #$^%\n-\'." was a legal name for a variable, 
I'd program in Forth.


-- 
Steven


