Copy with __slots__

Alex Martelli aleax at aleax.it
Wed Sep 18 12:21:19 EDT 2002


Griebel, Peer wrote:
        ...
> I'm using Python 2.2 (Cygwin). There I don't get the exception you
> mentioned.
>  
>> This is with Python 2.2.1 -- as you see, it doesn't get anywhere

I suggest you upgrade to Python 2.2.1 -- it has enhancements such
as this one, which help a lot.

>> class C2(C1):
>>     __slots__ = "s2"
>>     def __copy__(self):
>>         x = self.__class__()
>>         x.s1 = self.s1
>>         x.s2 = self.s2
>>         return x
> 
> Yes I know this works. But this has two drawbacks (at least in my eyes):
> 
> 1. Formerly I used the approach of copy: I simply updated the dict of a
> newly created empty object. This is fast and works well - but not with the
> new classes. I think your approach runs much more slowly.

It's of course not a matter of new vs old classes -- new classes
would let you use the dictionary-update method just as well... as
long as you didn't use __slots__.
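
To make that concrete, here is a minimal sketch of the dictionary-update
style of copying on a new-style class, next to a slotted class whose
instances have no __dict__ to update.  The class names Plain and Slotted
are made up for illustration, and the code is written for a current
Python 3 rather than the 2.2 of this thread.

```python
import copy


class Plain(object):
    """New-style class WITHOUT __slots__: instances carry a __dict__."""

    def __init__(self, s1, s2):
        self.s1 = s1
        self.s2 = s2

    def __copy__(self):
        # Create an empty instance without running __init__ ...
        result = self.__class__.__new__(self.__class__)
        # ... then copy every instance attribute in one step.
        result.__dict__.update(self.__dict__)
        return result


class Slotted(object):
    """With __slots__ and no dict-bearing base, there is no __dict__."""
    __slots__ = ('s1', 's2')


a = Plain(23, 'bebop')
b = copy.copy(a)

# The dict-update trick has nothing to work on for a slotted instance.
slotted_has_dict = hasattr(Slotted(), '__dict__')
```

The update picks up any attribute added later without further edits to
__copy__, which is exactly the convenience that __slots__ takes away.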

Regarding performance -- *measure* things, man!  Intuition about
what runs slowly, what rapidly, is notoriously misleading.  E.g.:

import time, copy

class noslots(object):
    pass

class withslots(object):
    __slots__ = 's1', 's2'
    def __copy__(self):
        result = self.__class__.__new__(self.__class__)
        result.s1 = self.s1
        result.s2 = self.s2
        return result

x1 = noslots()
x1.s1 = 23
x1.s2 = 'bebop'

x2 = withslots()
x2.s1 = 23
x2.s2 = 'bebop'

repetitions = [None] * 10000
def copyalot(x):
    start = time.clock()
    for i in repetitions: copy.copy(x)
    stend = time.clock()
    return '%.2f %s' % (stend-start, x.__class__.__name__)

for x in x1, x2:
    print copyalot(x)

[alex at lancelot ~]$ python -O slorno.py
1.04 noslots
0.29 withslots

"much more slowly", hmm...?  Looks over 3 times faster to *ME*...
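
The script above is Python 2.2 code (print statement, time.clock).  For
anyone repeating the measurement on a current Python 3, where time.clock
no longer exists, a rough equivalent might look like the sketch below.
The helper make() and the repetitions parameter are additions for
illustration, and the timings it prints will not match the 2002 figures;
no expected numbers are claimed here.

```python
import copy
import time


class NoSlots(object):
    pass


class WithSlots(object):
    __slots__ = ('s1', 's2')

    def __copy__(self):
        result = self.__class__.__new__(self.__class__)
        result.s1 = self.s1
        result.s2 = self.s2
        return result


def make(cls):
    """Hypothetical helper: build an instance with the two test attributes."""
    x = cls()
    x.s1 = 23
    x.s2 = 'bebop'
    return x


def copyalot(x, repetitions=10000):
    """Return (elapsed seconds, class name) for many copy.copy calls."""
    start = time.perf_counter()
    for _ in range(repetitions):
        copy.copy(x)
    elapsed = time.perf_counter() - start
    return elapsed, type(x).__name__


if __name__ == '__main__':
    for cls in (NoSlots, WithSlots):
        elapsed, name = copyalot(make(cls))
        print('%.4f %s' % (elapsed, name))
```

The point stands regardless of the numbers you get: run it on your own
interpreter before deciding which approach is "much slower".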


> 2. The approach is error prone. When simply updating a dict I'm not going
> to miss a newly added attribute. But using your method I have to remember
> the method whenever I add an attribute.

Yes, that IS part of the price you pay for getting the optimization
that __slots__ affords: you do have to know exactly what you're doing.
Thus, __slots__ should be used only once the program is working, in
the optimization phase, and only for classes where saving a __dict__
per instance IS an important optimization (basically, classes of which
MANY instances exist).

At the price of a small slow-down, if the __slots__ are all defined
in the current class, you can safeguard against this, e.g.:

class withslots(object):
    __slots__ = 's1', 's2'
    def __copy__(self):
        result = self.__class__.__new__(self.__class__)
        for s in self.__slots__:
            setattr(result, s, getattr(self, s))
        return result

This slows my timing test down from 0.29 to about 0.36.  If you
must also consider _inherited_ slots, e.g.:

class withslots_base(object):
    __slots__ = ('s1',)

class yetanother(withslots_base):
    pass

class withslots(yetanother):
    __slots__ = ('s2',)
    def __copy__(self):
        result = self.__class__.__new__(self.__class__)
        for base in self.__class__.__mro__:
            for s in getattr(base, '__slots__', []):
                setattr(result, s, getattr(self, s))
        return result

the slowdown is even more severe -- to 0.62.  Still nothing
like the 1.04 you pay for NOT having the slots at all, of
course, but it's starting to undermine the performance improvement
that is the only real rationale for having __slots__ at all.  Still,
you pays your money, and you makes your choices.
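
A few corner cases can bite the loops above: __slots__ given as a bare
string (iterating it yields single characters), slots that were never
assigned (getattr raises AttributeError), and an intermediate class
without __slots__, which quietly gives instances a __dict__ again.  One
possible way to cover them is sketched below for Python 3; iter_slots
and SlotCopyMixin are hypothetical names, not anything from the standard
library, and the extra checks will cost still more time.

```python
import copy


def iter_slots(cls):
    """Yield each slot name declared by cls or any of its bases."""
    for base in cls.__mro__:
        # Look only at what this class itself declares, so inherited
        # __slots__ are not visited more than once.
        slots = base.__dict__.get('__slots__', ())
        if isinstance(slots, str):
            slots = (slots,)          # a bare string means ONE slot
        for name in slots:
            if name not in ('__dict__', '__weakref__'):
                yield name


class SlotCopyMixin(object):
    __slots__ = ()

    def __copy__(self):
        cls = self.__class__
        result = cls.__new__(cls)     # does not call __init__
        for name in iter_slots(cls):
            if hasattr(self, name):   # skip slots never assigned
                setattr(result, name, getattr(self, name))
        if hasattr(self, '__dict__'): # present if some base lacks __slots__
            result.__dict__.update(self.__dict__)
        return result


class Base(SlotCopyMixin):
    __slots__ = 's1'                  # bare string on purpose


class Middle(Base):
    __slots__ = ()


class Leaf(Middle):
    __slots__ = ('s2', 's3')


x = Leaf()
x.s1 = 23
x.s2 = 'bebop'                        # s3 deliberately left unset
y = copy.copy(x)
```

Whether this generality is worth its overhead is, again, something to
measure rather than guess.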


>> The way to cover copy.copy, copy.deepcopy AND pickle is to have the
>> class supply a special method __reduce__.  __reduce__ can return a
>> pair where the first item is callable, the second item is the tuple
>> to pass as arguments to the first.  Thus, for example:
> 
> Here once again I have the two disadvantages mentioned above.

You miscount, I think.  You have only one disadvantage: you
have to write more code and decide how fail-safe you want it
to be wrt changes to __slots__ in the inheritance hierarchy,
depending on whether you need just a LITTLE optimization wrt
not using __slots__, or really crave all you can get.

The advantage is the optimization.  The disadvantage is that
you have to write more code, with some care, to get the
optimization, particularly if you also want flexibility (more
optimization, of course, is gained by forgoing flexibility
and nailing things down).  That's how MOST optimization
activities work, in my experience.  Is yours so different?

> Yes this would work. But I'm looking for speed (as you already observed).
> And my constructor is a little more heavyweight. So it is much faster to
> use the old mechanism (update dict) than always calling the constructor.

So, write that little extra factory function -- that's hardly
the problem!  The ability to reuse the constructor is just a
handy work-saver when feasible, as it often is.
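
As a rough illustration of such a factory function, the sketch below
(Python 3) has __reduce__ return a module-level function that builds the
instance with __new__ and fills in the slots directly, so the
constructor is not run again on copying.  Point, _rebuild_point and the
init_calls counter are all invented for this example.

```python
import copy


def _rebuild_point(cls, x, y):
    """Hypothetical factory: make an instance without calling __init__."""
    obj = cls.__new__(cls)
    obj.x = x
    obj.y = y
    return obj


class Point(object):
    __slots__ = ('x', 'y')
    init_calls = 0                    # class attribute, counts __init__ runs

    def __init__(self, x, y):
        Point.init_calls += 1         # stands in for heavyweight setup
        self.x = x
        self.y = y

    def __reduce__(self):
        # (callable, tuple of arguments to pass to it)
        return (_rebuild_point, (self.__class__, self.x, self.y))


p = Point(1, [2, 3])
shallow = copy.copy(p)                # shares the list with p
deep = copy.deepcopy(p)               # gets its own copy of the list
```

pickle consults the same __reduce__ hook, but it stores the factory by
name, so the function has to live at the top level of an importable
module for unpickling to find it.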

> Thanks Alex!

You're welcome.

> I'm still hoping to find a fast and secure solution...

I think you need to choose your preferred tradeoff point between
'fast' and 'handy'.  It's not a matter of security -- none of
these solutions run higher security risks than others -- it's
that handiness, flexibility and other admirable qualities 
(including ease of changing the software later) ARE more often
than not things you can sacrifice if and when you really need
to increase performance.


Alex



