Custom objects & performance

Tue Oct 12 20:53:53 EDT 1999

Recently I've been working on some Python classes which I need to be
fairly quick. Here's the conundrum I've been running into: in order
to make the objects compute faster, I often use auxiliary lists and
dictionaries as caches, so if a particular result is needed
frequently, I don't have to recompute it every time. So far, so
good; the trouble is invalidating these caches when the input info
changes. Here's the three options I've been exploring:

1. A non-standard interface to make changes to the object (or a
     property of the object.) So instead of

      foo.list[x] = y

     I might do

      foo.invalidate_cache()
      foo.list[x] = y

     or

      foo.list_setval(x,y)

   I don't really like either of these solutions. The first one
   is very error-prone (don't forget to invalidate the cache!).
   The second one is more like C or C++, where you have a
   different external interface wherever the internals might
   differ. One of the best aspects of Python is that the
   abstractions can remain the same even when the implementation
   is different. I know that there are some cases when I would
   like to write a function that either takes a normal list or
   a list that's a property of an object that uses a cache, 
   and I don't want to have to rewrite the function to do that.

2. UserList / UserDict type things. So in the example above,
     foo.list would be an object which emulates a list and
     provides special processing whenever a value is written
     to it. That works great, except of course that it takes
     a large performance hit especially when there are a lot
     more reads than writes--and speed is the reason why I'm
     trying to do this at all.

   Now some of you probably think that this is going to be
     a plug for letting Python code derive classes off of
     built-in types--and I do think that would be a great
     idea--but it still wouldn't solve all of these
     problems (although it would make it easier.)
     Two-dimensional lists and dictionaries still would
     be tough.

3. Trying to detect a change at a certain point and decide
     whether recalculation is necessary. This is tricky
     because it can get slow, and speed is the bottom line
     here. Lists and dictionaries are not hashable; even
     if they were, that speed might be significant.

Anyone have any ideas??