[Tutor] question about descriptors

Steven D'Aprano steve at pearwood.info
Sat Nov 7 09:24:58 EST 2015


On Sat, Nov 07, 2015 at 12:53:11PM +0000, Albert-Jan Roskam wrote:

[...]
> Ok, now to my question. I want to create a class with read-only 
> attribute access to the columns of a .csv file. E.g. when a file has a 
> column named 'a', that column should be returned as list by using 
> instance.a. At first I thought I could do this with the builtin 
> 'property' class, but I am not sure how. 

90% of problems involving computed attributes (including "read-only" 
attributes) are most conveniently solved with `property`, but I think 
this may be an exception. Nevertheless, I'll give you a solution in 
terms of `property` first.

I'm too busy/lazy to handle reading from a CSV file, so I'll fake it 
with a dict of columns.


class ColumnView(object):
    _data = {'a': [1, 2, 3, 4, 5, 6],
             'b': [1, 2, 4, 8, 16, 32],
             'c': [1, 10, 100, 1000, 10000, 100000],
             }
    @property
    def a(self):
        return self._data['a'][:]
    @property
    def b(self):
        return self._data['b'][:]
    @property
    def c(self):
        return self._data['c'][:]



And in use:

py> cols = ColumnView()
py> cols.a
[1, 2, 3, 4, 5, 6]
py> cols.a = []
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: can't set attribute



Now, some comments:

(1) You must inherit from `object` for this to work. (Or use Python 3.) 
It won't work if you just say "class ColumnView:", which would make it a 
so-called "classic" or "old-style" class. You don't want that.


(2) Inside the property getter functions, I make a copy of the lists 
before returning them. That is, I do:

    return self._data['c'][:]

rather than:

    return self._data['c']


The empty slice [:] makes a copy. If I did not do this, you could mutate 
the list (say, by appending a value to it, or deleting items from it) 
and that mutation would show up the next time you looked at the column.
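
To make the difference concrete, here is a minimal sketch comparing the two getters. The class names `SharedColumn` and `CopiedColumn` are made up for this illustration and are not part of the solution above.

```python
class SharedColumn(object):
    """Getter hands out the stored list itself (no copy)."""
    _data = {'c': [1, 10, 100]}

    @property
    def c(self):
        return self._data['c']


class CopiedColumn(object):
    """Getter hands out a shallow copy made with the empty slice."""
    _data = {'c': [1, 10, 100]}

    @property
    def c(self):
        return self._data['c'][:]


shared = SharedColumn()
shared.c.append(-1)   # mutates the list stored in _data
print(shared.c)       # [1, 10, 100, -1]

copied = CopiedColumn()
copied.c.append(-1)   # mutates only a throwaway copy
print(copied.c)       # [1, 10, 100]
```

Note that the slice is a shallow copy: that is enough for a column of numbers or strings, but nested mutable items would still be shared.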


(3) It's very tedious having to create a property for each column ahead 
of time. But we can do this instead:


def make_getter(key):
    def inner(self):
        return self._data[key][:]
    inner.__name__ = key
    return property(inner)


class ColumnView(object):
    _data = {'a': [1, 2, 3, 4, 5, 6],
             'b': [1, 2, 4, 8, 16, 32],
             'c': [1, 10, 100, 1000, 10000, 100000],
             }
    for key in _data:
        locals()[key] = make_getter(key)
    del key


and it works as above, but without all the tedious manual creation of 
property getters.

Do you understand how this operates? If not, ask, and someone will 
explain. (And yes, this is one of the few times that writing to locals() 
actually works!)
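
As a hint: inside a class body, locals() is the namespace that becomes the class's attributes, so the loop simply stores one property per column under the column's name. A roughly equivalent sketch, which avoids locals() by attaching the properties after the class has been created, might look like this (the shortened `_data` is only for illustration):

```python
def make_getter(key):
    # Each call creates a new inner function that remembers its own key.
    def inner(self):
        return self._data[key][:]
    inner.__name__ = key
    return property(inner)


class ColumnView(object):
    _data = {'a': [1, 2, 3],
             'b': [1, 2, 4],
             }


# Attach one property per column to the class itself, after the fact.
for key in ColumnView._data:
    setattr(ColumnView, key, make_getter(key))

cols = ColumnView()
print(cols.a)                                         # [1, 2, 3]
print(isinstance(ColumnView.__dict__['b'], property)) # True
```

Either way, the properties end up on the class, which is what matters.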


(4) But what if you don't know what the columns are called ahead of 
time? You can't use property, or descriptors, because you don't know 
what to call the damn things until you know what the column headers are, 
and by the time you know that, the class is already well and truly 
created. You might think you can do this:

class ColumnView(object):
    def __init__(self):
        # read the columns from the CSV file
        self._data = ...
        # now create properties to suit
        for key in self._data:
            setattr(self, key, property( ... ))


but that doesn't work. Properties only perform their "magic" when they 
are attached to the class itself. By setting them as attributes on the 
instance (self), they lose their power and just get treated as ordinary 
attributes. To be technical, we say that the descriptor protocol is only 
enacted when the attribute is found in the class, not in the instance.
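
A small sketch shows the difference. The class `Demo` and the attribute names `x` and `y` are invented for this illustration only.

```python
class Demo(object):
    pass


d = Demo()

# Stored on the instance: the descriptor protocol is not invoked,
# so we just get the property object back.
d.x = property(lambda self: 42)
print(isinstance(d.x, property))   # True

# Stored on the class: attribute access calls the getter.
Demo.y = property(lambda self: 42)
print(d.y)                         # 42
```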

You might be tempted to write this instead:

            setattr(self.__class__, key, property( ... ))

but that's even worse. Now, every time you create a new ColumnView 
instance, *all the other instances will change*. They will grow new 
properties, or have existing properties overwritten. You don't want that.
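
Here is a sketch of that failure mode. Passing the data in as a `data` argument and building the getter with a lambda are assumptions made to keep the example short; they are not part of the code above.

```python
class ColumnView(object):
    def __init__(self, data):
        self._data = data
        for key in self._data:
            # key=key binds the current key to this particular getter.
            getter = lambda self, key=key: self._data[key][:]
            setattr(self.__class__, key, property(getter))


first = ColumnView({'a': [1, 2]})
second = ColumnView({'b': [3, 4]})   # also changes the class used by first

print(hasattr(ColumnView, 'a'), hasattr(ColumnView, 'b'))   # True True
print(second.b)                                             # [3, 4]

try:
    first.b   # the property now exists, but first._data has no 'b'
except KeyError as err:
    print('KeyError:', err)
```

The first instance has grown a `b` property it has no data for, purely because a second instance was created.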

Fortunately, Python has a mechanism for solving this problem: 
the `__getattr__` method and friends.


class ColumnView(object):
    _data = {'a': [1, 2, 3, 4, 5, 6],
             'b': [1, 2, 4, 8, 16, 32],
             'c': [1, 10, 100, 1000, 10000, 100000],
             }
    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name in self._data:
            return self._data[name][:]
        else:
            raise AttributeError(name)
    def __setattr__(self, name, value):
        if name in self._data:
            raise AttributeError('read-only attribute')
        super(ColumnView, self).__setattr__(name, value)
    def __delattr__(self, name):
        if name in self._data:
            raise AttributeError('read-only attribute')
        super(ColumnView, self).__delattr__(name)
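
If the columns come from a real file, `_data` has to be set per instance, and then there is a trap: a plain `self._data = ...` goes through `__setattr__`, which itself looks up `self._data` before it exists. The following is only a sketch of one way around that, assuming Python 3; the class name `CSVColumnView`, the `fileobj` parameter and the sample data are all made up for illustration, and `__delattr__` is left out for brevity.

```python
import csv
import io


class CSVColumnView(object):
    def __init__(self, fileobj):
        rows = list(csv.reader(fileobj))
        header, body = rows[0], rows[1:]
        columns = dict((name, [row[i] for row in body])
                       for i, name in enumerate(header))
        # Bypass our own __setattr__, which needs _data to exist already.
        object.__setattr__(self, '_data', columns)

    def __getattr__(self, name):
        # Look in the instance dict directly to avoid infinite recursion
        # if _data has not been set yet.
        data = self.__dict__.get('_data', {})
        if name in data:
            return data[name][:]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        if name in self._data:
            raise AttributeError('read-only attribute')
        super(CSVColumnView, self).__setattr__(name, value)


sample = io.StringIO("a,b\n1,10\n2,20\n")
cols = CSVColumnView(sample)
print(cols.a)        # ['1', '2']
try:
    cols.a = []
except AttributeError as err:
    print(err)       # read-only attribute
```

The csv module returns every value as a string, so any conversion to numbers is up to you.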



-- 
Steve

