Optimizing Inner Loop Copy

danielx danielwong at berkeley.edu
Sat Aug 19 13:26:46 EDT 2006


Mark E. Fenner wrote:
> Paul McGuire wrote:
>
> > "Mark E. Fenner" <Hobbes2176 at yahoo.com> wrote in message
> > news:Ja6Fg.17560$Ji1.17100 at trnddc05...
> >> Hello all,
> >>
> > <snip>
> >>
> >> Here's my class of the objects being copied:
> >>
> >> class Rule(list):
> >>     def __init__(self, lhs=None, rhs=None, nClasses=0, nCases=0):
> >>         self.nClasses = nClasses
> >>         self.nCases = nCases
> >>
> > Ok, so here are some "bigger picture" kind of questions.
> >
> > 1. Do you need to construct all these Rule objects in the first place?
>
> Sorry, I had a nice response written out ... and I lost it.  Here's the
> summary.  Unfortunately, yes, the algorithm this code implements goes to
> great lengths to ensure that each possible Rule is generated only once.
>
> >
> > 2. More of an OO question than a performance question, but why does Rule
> > inherit from list in the first place?  Is Rule *really* a list, or is it
> > just implemented using a list?
>
> It is the later.  And it is a moderate abuse to say that the Rule _isa_
> list.  Aside from the two integer statistics (nClasses, nCounts), the RHS
> and LHS can be merged and we can say that the rule is [(tuple1), ...,
> (tupleN), (tupleRHS)] with the RHS being Rule[-1].  However, in that case,
> I went for OO principles and kept the two logical parts separate.
>
> B/c of the frequency of list operations, it's better to be able to use
> thisRule.append() or thisRule[i] then thisRule.lhs.append() or
> thisRule.lhs[i].  Where speed doesn't matter, I have the more appropriate
> thisRule.addLHS().
>
> Yes, I could do "dot removal" (i.e., localize the attribute lookups), but
> when we iterate over a whole set of Rules, this still leave the
> localization assignment inside an inner loop.

Then maybe you can assign an unbound method (eg, "append =
list.append"), then pass in your instance as the first argument (eg
"append( myList, newElement )"). Even if you localize a bound method,
it should save some time, no?

>
> > If the latter, then you might look at
> > moving Rule's list contents into an instance variable, maybe something
> > called
> > self.contents.  Then you can be more explicit about appending to
> > self.contents when you want to add lhs to the contents of the Rule.  For
> > example, why are you calling extend, instead of just using slice notation
> > to
> > copy lhs?  Ah, because then you would have to write something like "self =
> > lhs[:]", which doesn't look like it will work very well.  On the other
> > hand, if you use containment/delegation instead of inheritance, you can
> > use the
> > more explicit "self.contents = lhs[:]".  In fact now you have much more
> > control over the assemblage of rules from other rules.
> >
>
> Fortunately, the Rules are only assembled in one way ... appending to that
> copy that started the thread.  Incidentally, by speeding up Rule.__init__
> and inlining it (instead of calling Rule.copy which called Rule.__init__),
> that is no longer my bottleneck *phew*.
>
>
> > In the original post, you state: "the left hand side of a rule (e.g., a
> > rule is a & b & c -> d) is self and three other pieces of information are
> > kept around, two ints and a right hand side"
> >
> > What other aspects of list are you using in Rule?  Are you iterating over
> > its contents somewhere?  Then implement __iter__ and return
> > iter(self.contents).  Are you using "if rule1:" and implicitly testing if
> > its length is nonzero?  Then implement __nonzero__ and return
> > operator.truth(self.contents).  Do you want to build up rules
> > incrementally
> > using += operator?  Then implement __iadd__ and do
> > self.contents.extend(other.contents), or self.contents +=
> > other.contents[:] (no need to test for None-ness of other.contents, we
> > ensure in our constructor that self.contents is always a list, even if its
> > an empty one).
> >
>
> Well, I do several of these things ... many of them inside inner loops
> (there's a sequence of inner loops ... the algorithms choice, not mine).
> Each of these that are implemented (and not left to Python's builtin list
> methods), is an extra level of function call overhead.  This has been a
> secondary project for me over a number of years ... so I have to look at my
> early revisions, but I think my original architecture was an explicit
> self.lhs = [] attribute.  And it was REALLY slow.  When Python allowed
> inheritence from builtin objects, I tried it and got a nice win (I think!).
>
> > Save inheritance for the true "is-a" relationships among your problem
> > domain
> > classes.  For instance, define a base Rule class, and then you can extend
> > it with things like DeterministicRule, ProbabilisticRule, ArbitraryRule,
> > etc. But don't confuse "is-implemented-using-a" with "is-a".
> >
>
> Well, I know the difference and choose to break the rule intentionally
> *chuckle*.  After the initial prototype, speed trumped OO.
> 
> > -- Paul
> 
> Thanks for your comments, Paul.
> 
> Regards,
> Mark




More information about the Python-list mailing list