Optimizing Inner Loop Copy

Mark E. Fenner mfenner at gmail.com
Fri Aug 18 10:04:46 EDT 2006


Paul McGuire wrote:

> "Mark E. Fenner" <Hobbes2176 at yahoo.com> wrote in message
> news:Ja6Fg.17560$Ji1.17100 at trnddc05...
>> Hello all,
>>
> <snip>
>>
>> Here's my class of the objects being copied:
>>
>> class Rule(list):
>>     def __init__(self, lhs=None, rhs=None, nClasses=0, nCases=0):
>>         self.nClasses = nClasses
>>         self.nCases = nCases
>>
> Ok, so here are some "bigger picture" kind of questions.
> 
> 1. Do you need to construct all these Rule objects in the first place? 

Sorry, I had a nice response written out ... and I lost it.  Here's the
summary.  Unfortunately, yes, the algorithm this code implements goes to
great lengths to ensure that each possible Rule is generated only once.

> 
> 2. More of an OO question than a performance question, but why does Rule
> inherit from list in the first place?  Is Rule *really* a list, or is it
> just implemented using a list?  

It is the later.  And it is a moderate abuse to say that the Rule _isa_
list.  Aside from the two integer statistics (nClasses, nCounts), the RHS
and LHS can be merged and we can say that the rule is [(tuple1), ...,
(tupleN), (tupleRHS)] with the RHS being Rule[-1].  However, in that case,
I went for OO principles and kept the two logical parts separate.  

B/c of the frequency of list operations, it's better to be able to use
thisRule.append() or thisRule[i] then thisRule.lhs.append() or
thisRule.lhs[i].  Where speed doesn't matter, I have the more appropriate
thisRule.addLHS().

Yes, I could do "dot removal" (i.e., localize the attribute lookups), but
when we iterate over a whole set of Rules, this still leave the
localization assignment inside an inner loop.

> If the latter, then you might look at 
> moving Rule's list contents into an instance variable, maybe something
> called
> self.contents.  Then you can be more explicit about appending to
> self.contents when you want to add lhs to the contents of the Rule.  For
> example, why are you calling extend, instead of just using slice notation
> to
> copy lhs?  Ah, because then you would have to write something like "self =
> lhs[:]", which doesn't look like it will work very well.  On the other
> hand, if you use containment/delegation instead of inheritance, you can
> use the
> more explicit "self.contents = lhs[:]".  In fact now you have much more
> control over the assemblage of rules from other rules.
> 

Fortunately, the Rules are only assembled in one way ... appending to that
copy that started the thread.  Incidentally, by speeding up Rule.__init__
and inlining it (instead of calling Rule.copy which called Rule.__init__),
that is no longer my bottleneck *phew*.


> In the original post, you state: "the left hand side of a rule (e.g., a
> rule is a & b & c -> d) is self and three other pieces of information are
> kept around, two ints and a right hand side"
> 
> What other aspects of list are you using in Rule?  Are you iterating over
> its contents somewhere?  Then implement __iter__ and return
> iter(self.contents).  Are you using "if rule1:" and implicitly testing if
> its length is nonzero?  Then implement __nonzero__ and return
> operator.truth(self.contents).  Do you want to build up rules
> incrementally
> using += operator?  Then implement __iadd__ and do
> self.contents.extend(other.contents), or self.contents +=
> other.contents[:] (no need to test for None-ness of other.contents, we
> ensure in our constructor that self.contents is always a list, even if its
> an empty one).
> 

Well, I do several of these things ... many of them inside inner loops
(there's a sequence of inner loops ... the algorithms choice, not mine). 
Each of these that are implemented (and not left to Python's builtin list
methods), is an extra level of function call overhead.  This has been a
secondary project for me over a number of years ... so I have to look at my
early revisions, but I think my original architecture was an explicit
self.lhs = [] attribute.  And it was REALLY slow.  When Python allowed
inheritence from builtin objects, I tried it and got a nice win (I think!).

> Save inheritance for the true "is-a" relationships among your problem
> domain
> classes.  For instance, define a base Rule class, and then you can extend
> it with things like DeterministicRule, ProbabilisticRule, ArbitraryRule,
> etc. But don't confuse "is-implemented-using-a" with "is-a".
> 

Well, I know the difference and choose to break the rule intentionally
*chuckle*.  After the initial prototype, speed trumped OO.

> -- Paul

Thanks for your comments, Paul.

Regards,
Mark



More information about the Python-list mailing list