Speed: bytecode vz C API calls

Tue Dec 9 09:34:44 EST 2003

In article <tyfu14az6so.fsf at pcepsft001.cern.ch>,
Jacek Generowicz  <jacek.generowicz at cern.ch> wrote:
>aahz at pythoncraft.com (Aahz) writes:
>> In article <tyf7k171jbq.fsf at pcepsft001.cern.ch>,
>> Jacek Generowicz  <jacek.generowicz at cern.ch> wrote:
>>>
>>>I have a program in which I make very good use of a memoizer:
>>>
>>>  def memoize(callable):
>>>      cache = {}
>>>      def proxy(*args):
>>>          try: return cache[args]
>>>          except KeyError: return cache.setdefault(args, callable(*args))
>>>      return proxy
>>>
>>>I've got to the stage where my program is still not fast enough, and
>>>calls to the memoizer proxy are topping profiler output table. So I
>>>thought I'd try to see whether I can speed it up by recoding it in C.
>> 
>> I'm wondering whether you're tackling something the wrong way 
>
>There _is_ always that possibility, but I don't think that's what my
>problem is here (it may well be elsewhere in my program).
>
>> or perhaps you need to throw hardware at the problem.
>
>The definition of "not fast enough" in this case, is "much slower than
>a competing C++ implementation", so throwing hardware at it advances
>me not one bit, I'm afraid.
>
>> Dicts are one of the most highly optimized parts of Python,
>
>Which is exactly why I'm using them extensively.
>
>> and if your cache hit rate is higher than 90%, you'll be dealing
>> with exceptions only rarely,
>
>My cache hit rate is well over 90%.

All your comments taken as a set make no sense.  There's just no way
that a bunch of dict accesses in Python can be much slower than a C++
program.  (C program, *maybe* -- not C++)

>> so the speed of cache.setdefault() isn't where your time is going.
>> I'm wondering whether you're hitting some kind of memory issue or
>> something, or possibly you're having cache effects (in CPU/memory),
>> or maybe even your key is poorly constructed so that you're getting
>> hash duplicates.
>
>No, I think that my memoizer works just fine. The "problem" is that
>I've memoized a lot of functions, and quite a few of them are being
>called in my inner loops. If I want to compete on speed with C++, I'm
>simply going have to eliminate Python bytecode from my inner loops[*].
>
>[*] Unless my overall algorithm is completely wrong, of course.

Waitasec ....  okay, I see what your problem is, assuming you're
describing it correctly.  The problem is that you're calling Python
functions in an inner loop.  I think you're going to need to write a
solution that doesn't call functions (function unrolling, so to speak).
Instead of returning a memoizer function, just use a dict of dicts.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.