python profiling, hotspot and strange execution time

cournape at gmail.com cournape at gmail.com
Thu Sep 8 03:45:06 EDT 2005



> OK - first of all, as someone else has asked, what platform are you
> running?  I'm assuming it's windows since you're referring to
> time.clock() and then later saying "wall clock".

Actually, no. I am working on a x86 linux (HT disabled for this
testing, as I thought it may introduce some subtilities). I am not sure
aht you mean by wall clock ?
>
> Next, what are you hoping that the profiler will give you?  If you're
> expecting it to give you the big picture of your application's
> performance and give you "real runtime numbers", you may be
> disappointed.  It is a deterministic profiler and will give you CPU time
> spent in different areas of code rather than and overall "how long did
> this thing take to run?".

I am not sure to understand the big difference between "time spent in
different areas of code" and "how long did this thing take to run?".
Looking at python doc for deterministic profiling, I understand the
implementation difference, and the performance implications, but I
don't see why deterministic profiling would not give me an overall
picture ?

> Well, let's just add more confusion to the pot, shall we?  Look at this
> example (a slight hack from yours)::
>
> import time
> import hotshot
> import hotshot.stats
>
>
> def run_foo():
>     print time.clock()
>     print time.time()
>
>     time.sleep(5)
>
>     print time.clock()
>     print time.time()
>
> prof = hotshot.Profile("essai.prof")
> benchtime= prof.runcall(run_foo)
> prof.close()
> stats = hotshot.stats.load("essai.prof")
> stats.strip_dirs()
> stats.sort_stats('time', 'calls')
> stats.print_stats(20)
>
> and the output::
>
> 0.24
> 1126011669.55
> 0.24
> 1126011674.55
>          1 function calls in 0.000 CPU seconds
>
>    Ordered by: internal time, call count
>
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>         1    0.000    0.000    0.000    0.000 tmphQNKbq.py:6(run_foo)
>         0    0.000             0.000          profile:0(profiler)
>
>
>
> I inserted a time.time() call since I'm on Linux and time.clock()
> returns a process's CPU time and wanted to show the "wall clock time" as
> it were.  So, the stats show 0 time taken, whereas time.time() shows 5
> seconds.  It's because the time.sleep() took a negligable amount of CPU
> time which is what the profiler looks at.

Well, your example make actually more sense to me :) I understand the
difference between CPU time and time spent in the python code (even if
I was not clear in my previous post about it...). But this case does
not apply to my code, as my code is never "idled", takes 100 % of the
cpu, with no other CPU consuming task

> I would attribute the wall clock and profile time difference to the
> overhead of hotshot.  While hotshot is miles better than the "regular"
> profiler, it can still take a little time to profile code.

Well, if hotshot reported a timing which is longer than the execution
time without it, I would have considered that to be normal. Even in C,
using gprof has a non negligeable overhead, most of the time.

What I don't understand is why hotshot reports that do_foo is executed
in 2 seconds whereas it effectively takes more than 10 seconds ? Is it
because I don't understand what deterministic profiling is about ?

David




More information about the Python-list mailing list