Custom objects & performance

Neel Krishnaswami neelk at brick.cswv.com
Tue Oct 12 22:24:08 EDT 1999


On Tue, 12 Oct 1999 20:53:53 -0400, Eric Jacobs <x at x.x> wrote:
>Recently I've been working on some Python classes which I need to be
>fairly quick. Here's the conundrum I've been running into: in order
>to make the objects compute faster, I often use auxiliary lists and
>dictionaries as caches, so if a particular result is needed
>frequently, I don't have to recompute it every time. So far, so
>good; the trouble is invalidating these caches when the input info
>changes. Here are the three options I've been exploring:

[snip the bad ideas ;) ]

>2. UserList / UserDict type things. So in the example above,
>     foo.list would be an object which emulates a list and
>     provides special processing whenever a value is written
>     to it. That works great, except of course that it takes
>     a large performance hit especially when there are a lot
>     more reads than writes--and speed is the reason why I'm
>     trying to do this at all.

This is the Right Thing in terms of not fighting the language.  So the
first question is: have you written a prototype of this approach and
profiled it to make sure that __getitem__ calls are a dominant cost? 
(Python makes profiling dead easy; anytime you have even the slightest
curiosity about performance, profile your program before doing anything
else.)
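
Concretely, the shape I have in mind is something like this -- a rough
sketch, not your actual class, written against the collections.UserList
spelling (with the standalone UserList module the idea is identical):

    from collections import UserList

    class CachedList(UserList):
        """A list that forgets memoized results whenever it is written to."""

        def __init__(self, initlist=None):
            self._cache = {}                    # memoized results, keyed by name
            UserList.__init__(self, initlist)

        def __setitem__(self, i, item):
            self._cache.clear()                 # any write invalidates the cache
            UserList.__setitem__(self, i, item)

        # append(), extend(), __delitem__() etc. need the same treatment

        def total(self):                        # stand-in for an expensive result
            if 'total' not in self._cache:
                self._cache['total'] = sum(self.data)
            return self._cache['total']

Reads still pay a Python-level __getitem__ call, which is exactly the
cost you want the profiler to measure; every write funnels through
__setitem__, and that is the one hook where the cache gets flushed.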

If your function computations are even moderately expensive, then I
would be quite shocked if the benefits of memoizing them don't vastly
outweigh the costs of __getitem__ calls. Python function calls are
heavyweight, but not /that/ bad -- even my P2-233 can manage around
330,000 function calls/second.
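
If you want the corresponding figure for your own machine, a crude loop
is enough (a sketch; the function is deliberately trivial, so the loop
measures little besides call overhead):

    import time

    def f(x):
        return x

    n = 1000000
    start = time.time()
    for i in range(n):
        f(i)
    elapsed = time.time() - start
    print("%d calls in %.3fs -> %.0f calls/sec" % (n, elapsed, n / elapsed))

The absolute number matters much less than how it compares to the cost
of whatever computation you're trying to avoid recomputing.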

If it does turn out that __getitem__ calls are too expensive, then a)
rethink whether caching the results is at all worthwhile, b) try 2 or
3 different prototypes in Python, and c) write a C extension type for
your program based on the fastest prototype.
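
As for profiling being dead easy, it really is just a couple of lines
with the profile module (a sketch -- run_prototype here is only a
placeholder for whatever drives one of your candidate implementations):

    import profile

    def run_prototype():
        # placeholder workload; substitute one of your prototypes here
        data = list(range(1000))
        total = 0
        for x in data:
            total = total + x * x
        return total

    profile.run('run_prototype()')

The report lists per-function call counts and times, so it will tell you
directly whether __getitem__ is where the time goes; run it once per
prototype and compare.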


Neel



