Optimizing Inner Loop Copy

Paul McGuire ptmcg at austin.rr._bogus_.com
Fri Aug 18 08:35:59 EDT 2006


"Mark E. Fenner" <Hobbes2176 at yahoo.com> wrote in message
news:Ja6Fg.17560$Ji1.17100 at trnddc05...
> Hello all,
>
<snip>
>
> Here's my class of the objects being copied:
>
> class Rule(list):
>     def __init__(self, lhs=None, rhs=None, nClasses=0, nCases=0):
>         self.nClasses = nClasses
>         self.nCases = nCases
>
Ok, so here are some "bigger picture" kind of questions.

1. Do you need to construct all these Rule objects in the first place?  One
of the optimizations I did in pyparsing was to pre-construct exception
objects, and to reuse them over and over instead of creating a fresh one
for each raise and then discarding it.  (There is a trade-off with
thread-safety when you do this, but we deal with that separately.)  This
gave me about a 30% reduction in processing time, since pyparsing does a lot
of exception raising/catching internally.  So you might look at your larger
app and see if you could memoize your Rule objects or something similar, and
avoid the whole object create/init/delete overhead in the first place.

2. More of an OO question than a performance question, but why does Rule
inherit from list in the first place?  Is Rule *really* a list, or is it
just implemented using a list?  If the latter, then you might look at moving
Rule's list contents into an instance variable, maybe something called
self.contents.  Then you can be more explicit about appending to
self.contents when you want to add lhs to the contents of the Rule.  For
example, why are you calling extend, instead of just using slice notation to
copy lhs?  Ah, because then you would have to write something like "self =
lhs[:]", which only rebinds the local name self and leaves the list object
itself unchanged.  On the other hand,
if you use containment/delegation instead of inheritance, you can use the
more explicit "self.contents = lhs[:]".  In fact now you have much more
control over the assemblage of rules from other rules.
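
Something along these lines, as a sketch only.  The body of the original
__init__ was snipped, so storing rhs as an attribute is my assumption, and
I am also assuming lhs arrives as a plain list:

```python
class Rule(object):
    """Rule implemented *using* a list rather than inheriting from list."""
    def __init__(self, lhs=None, rhs=None, nClasses=0, nCases=0):
        # Slice-copy lhs so later changes to the caller's list
        # do not show up inside this Rule.
        if lhs is not None:
            self.contents = lhs[:]
        else:
            self.contents = []
        self.rhs = rhs            # assumed: not shown in the quoted code
        self.nClasses = nClasses
        self.nCases = nCases
```

With this layout, r = Rule(['a', 'b', 'c'], 'd') keeps its own copy of the
left-hand side in r.contents.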

In the original post, you state: "the left hand side of a rule (e.g., a rule
is a & b & c -> d) is self and three other pieces of information are kept
around, two ints and a right hand side"

What other aspects of list are you using in Rule?  Are you iterating over
its contents somewhere?  Then implement __iter__ and return
iter(self.contents).  Are you using "if rule1:" and implicitly testing if
its length is nonzero?  Then implement __nonzero__ and return
operator.truth(self.contents).  Do you want to build up rules incrementally
using the += operator?  Then implement __iadd__, do
self.contents.extend(other.contents) or self.contents += other.contents[:],
and return self (no need to test for None-ness of other.contents, since we
ensure in our constructor that self.contents is always a list, even if it's
an empty one).
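
Put together, the delegating methods might look like the sketch below.  The
constructor is cut down to the one attribute these methods need, and the
__bool__ alias is my addition so the truth test also works on Python 3,
where __nonzero__ is not consulted:

```python
import operator

class Rule(object):
    def __init__(self, lhs=None):
        # Always a list, so the methods below never have to check for None.
        self.contents = list(lhs) if lhs is not None else []

    def __iter__(self):
        # "for item in rule:" walks the underlying list.
        return iter(self.contents)

    def __nonzero__(self):
        # "if rule:" is true only when the rule has contents (Python 2 hook).
        return operator.truth(self.contents)

    __bool__ = __nonzero__  # same test under its Python 3 name

    def __iadd__(self, other):
        # "rule1 += rule2" appends rule2's contents in place.
        self.contents.extend(other.contents)
        return self  # += rebinds the name to whatever is returned
```

Forgetting the "return self" in __iadd__ is the usual trap: the name on the
left of += would end up bound to None.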

Save inheritance for the true "is-a" relationships among your problem domain
classes.  For instance, define a base Rule class, and then you can extend it
with things like DeterministicRule, ProbabilisticRule, ArbitraryRule, etc.
But don't confuse "is-implemented-using-a" with "is-a".
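
For contrast, a sketch of inheritance used for a real "is-a" relationship.
The subclass names come from the paragraph above, but what they contain is
purely illustrative; in particular the probability attribute is something I
invented for the example:

```python
class Rule(object):
    """Base class; holds its left-hand side in a list instead of being one."""
    def __init__(self, lhs=None, rhs=None):
        self.contents = list(lhs) if lhs is not None else []
        self.rhs = rhs

class DeterministicRule(Rule):
    """A DeterministicRule is-a Rule; no extra state in this sketch."""

class ProbabilisticRule(Rule):
    """A ProbabilisticRule is-a Rule with one extra, hypothetical field."""
    def __init__(self, lhs=None, rhs=None, probability=1.0):
        Rule.__init__(self, lhs, rhs)
        self.probability = probability  # hypothetical attribute
```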

-- Paul




