[SciPy-User] line_profiler and for-loops !?

Dag Sverre Seljebotn dagss at student.matnat.uio.no
Tue Mar 16 04:49:11 EDT 2010


Sebastian Haase wrote:
> Hi,
>
> I was starting to use Robert's line_profiler. It seems to work great,
> and I already found one easy way to halve my execution time.
> But now it claims that 33% of the time is spent (directly) in the
> "for"-line and another 36% in a very simple "if"-line. See parts of
> the output here:
> <snip>
> Function: doTracing at line 1135
> Total time: 23.9171 s
>
> Line #      Hits         Time  Per Hit   % Time  Line Contents
> ==============================================================
> <snip>
>   1185                                                       # iterate over all tracks, and find close points
>   1186   3853362      8024186      2.1     33.5              for tracki,track in enumerate(self.tracks):
>   1187   3853063      8639273      2.2     36.1                  if self.tracks_tLast[tracki] == t-1:
>   1188                                                               # track went on until t-1 (so far)
>   1189                                                               #    --  otherwise, skip "old" tracks
>   1190     62150       130277      2.1      0.5                      pi_t_1 = track[-1] # index in last time section
> <snip>
>
> (The object of the function is to connect closest points found in an
> image sequence into tracks connecting the points by shortest steps.)
>
> Anyhow, my question is, is this just an artifact of line_profiler, or
> is the fact that those two lines are hit almost 4e6 times really
> resulting in more than 50% of the time being spent here !?
> (Calculating the actual Euclidean distance matrix over all point pairs
> takes supposedly only 15% of the time, for comparison).
> I tried to separate out the "enumerate(self.tracks)" into a separate
> line before the "for"-line, but the time spent was still unchanged on
> the "for".
> Does this mean "python is slow" here - and I should try cython (which
> I have never done so far ...) ?
>   
Well, that for-line contains enumerate, which is far from trivial (an
iterator that has to produce an (index, item) pair on every pass, which
the loop then unpacks), so each iteration does more work than the line
suggests. And that if-test line contains self.tracks_tLast[...], which
is rather costly: one attribute lookup to get tracks_tLast (which you
can put in a function-local variable for a minor speedup), and then the
element indexing certainly isn't free either.

At least for the latter, put the value lookup into a temporary variable 
on a separate line to get better profiling results.
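
A minimal sketch of both suggestions might look like the following. Only
the names tracks, tracks_tLast, tracki, track and t come from the quoted
code; the Tracker class, the count_active method and the sample data are
hypothetical stand-ins made up for illustration.

```python
class Tracker:
    """Hypothetical stand-in; only tracks and tracks_tLast come from the post."""

    def __init__(self, tracks, tracks_tLast):
        self.tracks = tracks
        self.tracks_tLast = tracks_tLast

    def count_active(self, t):
        # Hoist the attribute lookup out of the loop into a local variable.
        tracks_tLast = self.tracks_tLast
        active = 0
        for tracki, track in enumerate(self.tracks):
            # Do the indexing on its own line, so the line profiler reports
            # its cost separately from the comparison below.
            t_last = tracks_tLast[tracki]
            if t_last == t - 1:
                active += 1
        return active


# Made-up data: three tracks, two of which were last seen at t - 1 == 4.
tracker = Tracker(tracks=[[0, 3], [1], [2, 5, 7]], tracks_tLast=[4, 2, 4])
print(tracker.count_active(5))
```

With the lookup split out like this, the profiler output should show how
much of the 36% belongs to the indexing and how much to the comparison.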

But at some point the per-line running times are so low that line
profiling gets very inaccurate, and if you still need more speed, that's
when you want to use Cython. If these lookups indeed dominate, Cython
should give good results.
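
One way to cross-check the profiler at this scale is to time whole loop
variants with the standard timeit module, which does not add per-line
tracing overhead. The sketch below is only an illustration: the Holder
class, the two loop functions and the data sizes are assumptions, not
the code from the original post, and the timings will differ from
machine to machine.

```python
import timeit


class Holder:
    """Hypothetical container mimicking the attributes used in the post."""

    def __init__(self, n):
        self.tracks = [[i] for i in range(n)]
        self.tracks_tLast = [i % 7 for i in range(n)]


def loop_attribute(h, t):
    # Looks up h.tracks_tLast on every iteration, as in the quoted code.
    hits = 0
    for tracki, track in enumerate(h.tracks):
        if h.tracks_tLast[tracki] == t - 1:
            hits += 1
    return hits


def loop_local(h, t):
    # Same loop, with the attribute lookup hoisted into a local variable.
    tracks_tLast = h.tracks_tLast
    hits = 0
    for tracki, track in enumerate(h.tracks):
        if tracks_tLast[tracki] == t - 1:
            hits += 1
    return hits


holder = Holder(100_000)
for fn in (loop_attribute, loop_local):
    # Take the best of three repeats to reduce noise from other processes.
    best = min(timeit.repeat(lambda: fn(holder, 4), number=10, repeat=3))
    print(f"{fn.__name__}: {best:.4f} s for 10 runs")
```

If the two variants differ only slightly, the attribute lookup is not
the bottleneck and the per-line percentages are mostly interpreter and
tracing overhead spread over millions of very cheap iterations.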

Dag Sverre


