profile stats interpretation

Tue Aug 10 03:37:35 EDT 2004

Robert Brewer wrote:

> I've been working on optimizing some code (yes, it really needs it--it's
> too slow ;) -- decided to use hotshot. I'm assuming things about the
> output of hotshot.stats that I want to verify before I make decisions
> off of them.
> 
> Here's an example of output I'm getting. I coded the same function 3
> different ways--it's basically a type coercer. Each method results in
> different stats (for the same request):
> 
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>     17582    0.670    0.000    1.428    0.000 logic.py:133(coerce)
> 
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>     17582    0.509    0.000    1.829    0.000 logic.py:133(coerce)
> 
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>     17582    0.604    0.000    1.202    0.000 logic.py:133(coerce)
> 
> The question is: which of these three should I keep? Is "tottime" the
> time of the code within coerce(), without regard to functions called
> from coerce()? If so, it seems method #2 is superior. Finally, why might

Judging from http://docs.python.org/lib/module-profile.html I would think
so. However, you should always pick a function based on its cumulative time.
If tottime is low in relation to cumtime, that is merely a hint that
you should optimize the called functions (not an option in your
example) or replace them, which may require other changes to the caller
(which is what you did).
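
To make the tottime/cumtime distinction concrete, here is a minimal sketch.
It uses cProfile and pstats from the standard library as a stand-in for
hotshot (an assumption on my part; the columns have the same names), and
helper() and caller() are hypothetical functions invented for illustration.

```python
import cProfile
import io
import pstats


def helper(n):
    # Hypothetical callee: the real work happens here, so the time
    # counts toward helper's tottime.
    total = 0
    for i in range(n):
        total += i * i
    return total


def caller(n):
    # Hypothetical caller: it does almost nothing itself, so its tottime
    # stays small while its cumtime includes everything helper does.
    return helper(n)


profiler = cProfile.Profile()
profiler.enable()
for _ in range(200):
    caller(1000)
profiler.disable()

# Collect the report in a string so it can be inspected or printed.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats("cumulative").print_stats()
report = buf.getvalue()
print(report)
```

In the printed table, caller should show a small tottime next to a cumtime
that is close to helper's, which is the pattern described above.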

> #3 have a much lower cumtime but higher tottime than #2, given that I
> didn't change any other code? Hmmm.
> 
> 
> FWIW, here's the function.
> 
> Method #1:
>     
>     def coerce(self, value, valuetype=None):
>         if valuetype is None:

Instead of

>             valuetype = type(value)

              valuetype = value.__class__

might work, too.

>         try:
>             xform = self.processors[valuetype]
>         except KeyError:
>             xform = self.default_processor

              # Assuming many values share only a few types, the following
              # line could drastically increase your hit rate, because
              # the KeyError is then raised only once per unseen type.
              self.processors[valuetype] = xform

>         return xform(value)
> 
> Method #2:
>     
>     def coerce(self, value, valuetype=None):
>         if valuetype is None:
>             valuetype = type(value)
>         xform = self.processors.get(valuetype, self.default_processor)
>         return xform(value)
> 
> Method #3:
> 
>     def coerce(self, value, valuetype=None):
>         if valuetype is None:
>             valuetype = type(value)
>         if valuetype in self.processors:
>             xform = self.processors[valuetype]
>         else:
>             xform = self.default_processor
>         return xform(value)
> 
> 
> Any advice would be appreciated.
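
The two inline suggestions for Method #1 above could be combined roughly as
follows. This is only a sketch: the class name CachingCoercer and its
constructor are hypothetical, since the original class is not shown.

```python
class CachingCoercer:
    """Hypothetical container for the coerce() method quoted above."""

    def __init__(self, processors, default_processor):
        # Copy the mapping so caching does not modify the caller's dict.
        self.processors = dict(processors)
        self.default_processor = default_processor

    def coerce(self, value, valuetype=None):
        if valuetype is None:
            valuetype = value.__class__
        try:
            xform = self.processors[valuetype]
        except KeyError:
            xform = self.default_processor
            # Remember the default for this type, so the exception is
            # raised only the first time the type is seen.
            self.processors[valuetype] = xform
        return xform(value)


coercer = CachingCoercer({int: float}, str)
print(coercer.coerce(3))
print(coercer.coerce(2.5))
print(float in coercer.processors)
```

One caveat: once a type is cached this way, later changes to
default_processor no longer affect it.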

Once you have spotted a single slow function in heavy use, you can resort
to a micro-optimization tool like timeit. A function is "slow" when your
app spends a long time in it and the subroutine calls are necessary and
cannot be optimized themselves. Then picking the variant with the smallest
cumulative time should be a no-brainer.
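
A timeit comparison of the three variants might look like the sketch below.
The Coercer class, the processors it registers, and the sample data are all
hypothetical. The ranking will depend on how often the lookup misses, so
the sample should resemble your real data.

```python
import timeit


class Coercer:
    """Hypothetical holder for the three coerce() variants."""

    def __init__(self):
        self.processors = {int: float, str: str.strip}
        self.default_processor = lambda value: value

    def coerce_try(self, value, valuetype=None):  # Method #1
        if valuetype is None:
            valuetype = type(value)
        try:
            xform = self.processors[valuetype]
        except KeyError:
            xform = self.default_processor
        return xform(value)

    def coerce_get(self, value, valuetype=None):  # Method #2
        if valuetype is None:
            valuetype = type(value)
        xform = self.processors.get(valuetype, self.default_processor)
        return xform(value)

    def coerce_in(self, value, valuetype=None):  # Method #3
        if valuetype is None:
            valuetype = type(value)
        if valuetype in self.processors:
            xform = self.processors[valuetype]
        else:
            xform = self.default_processor
        return xform(value)


coercer = Coercer()
# Assumed mix of values: half of them have no registered processor.
sample = [1, " a ", 2.5, None] * 250


def run(method):
    for value in sample:
        method(value)


timings = {}
for name in ("coerce_try", "coerce_get", "coerce_in"):
    method = getattr(coercer, name)
    # Take the best of several repeats to reduce noise from other processes.
    timings[name] = min(timeit.repeat(lambda: run(method), number=20, repeat=3))
    print(name, timings[name])
```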

Put another way, hotshot is useful for finding the hotspots, i.e. the
functions that need optimizing, but not for doing the optimization itself.

(Silly disclaimer: I have not yet worked with hotshot, so take this with
caution - the experts all seem to be redecorating :-)

Peter