Optimizing Inner Loop Copy

Mark E. Fenner Hobbes2176 at yahoo.com
Thu Aug 17 18:35:21 EDT 2006


Hello all,

I have code whose inner loop looks like this:

allNew = []
for params in cases:
    newObj = copy(initialObject)
    newObj.modify(params)
    allNew.append(newObj)
return allNew

and the copy is taking a large share (42%) of my execution time.
So, I'd like to speed up my copy.  I had an explicit copy method that did
what was needed and returned a new object, but it was quite a bit slower
than using the standard lib copy.copy().

Here's the class of the objects being copied:

class Rule(list):
    def __init__(self, lhs=None, rhs=None, nClasses=0, nCases=0):
        self.nClasses = nClasses
        self.nCases = nCases

        if lhs is not None:
            self.extend(lhs)

        if rhs is None:
            self.rhs=tuple()
        else:
            self.rhs=rhs

    # as I mentioned, Rule.copy is slower than copy.copy
    def copy(self):
        r = Rule(self,
                 self.rhs,
                 self.nClasses,
                 self.nCases)
        return r

Basically, the left hand side of a rule (e.g., a rule is a & b & c -> d) is
self and three other pieces of information are kept around, two ints and a
right hand side (e.g., (d, 5) meaning that d takes the value five ... all
the LHS elements are tuples as well).
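
A minimal sketch of how the two copy strategies could be timed side by side,
assuming a current Python 3 interpreter.  The helper name time_copies and the
sample rule are made up for illustration; the Rule class is the one above,
condensed.  The numbers will vary by machine and Python version, so none are
quoted here.

```python
import copy
import timeit


class Rule(list):
    def __init__(self, lhs=None, rhs=None, nClasses=0, nCases=0):
        self.nClasses = nClasses
        self.nCases = nCases
        if lhs is not None:
            self.extend(lhs)
        self.rhs = tuple() if rhs is None else rhs

    def copy(self):
        # Explicit copy: goes through Rule.__init__ again.
        return Rule(self, self.rhs, self.nClasses, self.nCases)


def time_copies(rule, number=100_000):
    """Return the seconds spent making `number` copies with each strategy.

    Hypothetical helper, not part of the original code.
    """
    return {
        "copy.copy": timeit.timeit(lambda: copy.copy(rule), number=number),
        "Rule.copy": timeit.timeit(rule.copy, number=number),
    }


if __name__ == "__main__":
    # Sample rule: a & b & c -> d, with made-up counts.
    sample = Rule([("a", 1), ("b", 2), ("c", 3)], ("d", 5), 4, 100)
    for name, seconds in time_copies(sample).items():
        print(f"{name:10s} {seconds:.3f} s")
```

Timing both calls on the same object keeps the comparison fair; profiling the
real inner loop is still the final word.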

As far as optimization goes, it seems I could write a custom __copy__
method, but a c.l.python search (sorry, forgot to bookmark the reference)
seemed to indicate that a special-purpose __copy__ that copies all of the
object's attributes would lose out to the generic copy.copy().  That
doesn't seem quite right ... ideas?
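
For concreteness, here is a sketch of what such a custom __copy__ might look
like, assuming a current Python 3 interpreter.  It allocates the new object
with __new__ so that Rule.__init__ and its branching are skipped, then copies
the list items and the instance attributes in bulk.  Whether this actually
beats the generic copy.copy() is an open question that only a measurement can
settle; it is not claimed to be faster.

```python
import copy


class Rule(list):
    def __init__(self, lhs=None, rhs=None, nClasses=0, nCases=0):
        self.nClasses = nClasses
        self.nCases = nCases
        if lhs is not None:
            self.extend(lhs)
        self.rhs = tuple() if rhs is None else rhs

    def __copy__(self):
        # Allocate an empty Rule without running __init__.
        new = Rule.__new__(Rule)
        # Shallow-copy the left-hand side (the list items themselves).
        new.extend(self)
        # Copy nClasses, nCases and rhs in one step.
        new.__dict__.update(self.__dict__)
        return new


rule = Rule([("a", 1), ("b", 2)], ("d", 5), 3, 10)
clone = copy.copy(rule)  # copy.copy() dispatches to Rule.__copy__
```

Because copy.copy() looks for __copy__ on the class, the inner loop itself
would not need to change.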

Next option would be to rewrite the whole class in C.  I haven't done C
extensions before and since most of the class is already a builtin list,
that seems like a lot of work for little gain.  Would Pyrex be a help here? 
I don't think the relevant typing information (on the ints alone, I guess)
would get me very far ... and again, recoding the list stuff (by hand or by
Pyrex) in C is not going to get any gains (it might slow things down?).

It seems to me that it should be possible to speed up shallow copies (of
objects built from immutable types) by memory mapping (somehow!).  The
top-level list/rule should be the only new object that needs to be created.
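
A small check of that last point, assuming a current Python 3 interpreter and
a stripped-down stand-in for the Rule class (the two int attributes are left
out for brevity).  It shows that copy.copy() already shares the element tuples
and the rhs tuple; only the outer list object is new.

```python
import copy


class Rule(list):
    """Stripped-down stand-in for the real Rule class (illustration only)."""

    def __init__(self, lhs=(), rhs=()):
        self.extend(lhs)
        self.rhs = rhs


original = Rule([("a", 1), ("b", 2)], ("d", 5))
clone = copy.copy(original)

# Every element of the clone is the very same tuple object as in the original.
elements_shared = all(x is y for x, y in zip(original, clone))

# The right-hand side is shared too, not duplicated.
rhs_shared = clone.rhs is original.rhs

# The outer containers are distinct: growing the clone leaves the original alone.
clone.append(("c", 3))
containers_independent = len(original) == 2 and len(clone) == 3
```

So the remaining cost is in creating and filling the new top-level object,
not in duplicating its contents.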

Any quick answers, or pointers to the most promising directions to explore,
would be greatly appreciated.

Regards,
Mark


