Custom objects & performance
Neel Krishnaswami
neelk at brick.cswv.com
Tue Oct 12 22:24:08 EDT 1999
On Tue, 12 Oct 1999 20:53:53 -0400, Eric Jacobs <x at x.x> wrote:
>Recently I've been working on some Python classes which I need to be
>fairly quick. Here's the conundrum I've been running into: in order
>to make the objects compute faster, I often use auxiliary lists and
>dictionaries as caches, so if a particular result is needed
>frequently, I don't have to recompute it every time. So far, so
>good; the trouble is invalidating these caches when the input info
>changes. Here's the three options I've been exploring:
[snip the bad ideas ;) ]
>2. UserList / UserDict type things. So in the example above,
> foo.list would be an object which emulates a list and
> provides special processing whenever a value is written
> to it. That works great, except of course that it takes
> a large performance hit especially when there are a lot
> more reads than writes--and speed is the reason why I'm
> trying to do this at all.
This is the Right Thing in terms of not fighting the language. So the
first question is: have you written a prototype of this approach and
profiled it to make sure that __getitem__ calls are a dominant cost?
(Python makes profiling dead easy; anytime you have even the slightest
curiosity about performance, profile your program before doing anything
else.)
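For instance, a minimal profiling sketch with the standard-library
profiler might look like this (the `Cache` class and the workload are
invented purely for illustration; in modern Python the module is
`cProfile`, but the idea is the same as with the old `profile` module):

```python
import cProfile
import pstats
import io

# A toy dict-like wrapper, standing in for the kind of caching
# object under discussion.
class Cache:
    def __init__(self):
        self.data = {}
    def __getitem__(self, key):
        return self.data[key]
    def __setitem__(self, key, value):
        self.data[key] = value

def workload():
    c = Cache()
    for i in range(1000):
        c[i] = i * i
    total = 0
    for i in range(1000):
        total += c[i]
    return total

profiler = cProfile.Profile()
result = profiler.runcall(workload)

# Sort by cumulative time and see whether __getitem__ calls
# actually dominate before optimizing anything.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats()
report = stream.getvalue()
```

If `__getitem__` barely registers in the report, there is nothing to
fix.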
If your function computations are even moderately expensive, then I
would be quite shocked if the benefits of memoizing them don't vastly
outweigh the costs of __getitem__ calls. Python function calls are
heavyweight, but not /that/ bad -- even my P2-233 can manage around
330,000 function calls/second.
If it does turn out that __getitem__ calls are too expensive, then a)
rethink whether caching the results is at all worthwhile, b) try 2
or 5 different prototypes in Python, and c) write a C extension type
for your program based on the fastest prototype.
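One such pure-Python prototype of the option-2 approach might be a
dict-like object that memoizes a derived value and invalidates it on
every write. The class name and the "expensive" computation (here just
a sum) are made up for illustration:

```python
from collections import UserDict

class SummingDict(UserDict):
    """Dict that memoizes the sum of its values; any write
    invalidates the cached result, so reads stay cheap."""
    def __init__(self, *args, **kwargs):
        self._cached_sum = None
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self._cached_sum = None   # writes invalidate the cache
        super().__setitem__(key, value)

    def __delitem__(self, key):
        self._cached_sum = None
        super().__delitem__(key)

    def total(self):
        # Recompute only when the cache has been invalidated.
        if self._cached_sum is None:
            self._cached_sum = sum(self.data.values())
        return self._cached_sum

d = SummingDict()
d["a"] = 1
d["b"] = 2
first = d.total()   # computed once
d["c"] = 4          # write invalidates the cache
second = d.total()  # recomputed after the write
```

Since only writes pay the invalidation cost, this variant is cheapest
exactly when reads heavily outnumber writes.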
Neel