[SciPy-User] line_profiler and for-loops !?
Dag Sverre Seljebotn
dagss at student.matnat.uio.no
Tue Mar 16 04:49:11 EDT 2010
Sebastian Haase wrote:
> Hi,
>
> I was starting to use Robert's line_profiler. It seems to work great,
> and I already found one easy way to halve my execution time.
> But now it claims that 33% of the time is spent (directly) in the
> "for"-line and another 36% in a very simple "if"-line. See parts of
> the output here:
> <snip>
> Function: doTracing at line 1135
> Total time: 23.9171 s
>
> Line #      Hits         Time  Per Hit   % Time  Line Contents
> ==============================================================
> <snip>
>   1185                                           # iterate over all tracks, and find close points
>   1186   3853362      8024186      2.1     33.5  for tracki,track in enumerate(self.tracks):
>   1187   3853063      8639273      2.2     36.1      if self.tracks_tLast[tracki] == t-1:
>   1188                                                   # track went on until t-1 (so far)
>   1189                                                   # -- otherwise, skip "old" tracks
>   1190     62150       130277      2.1      0.5          pi_t_1 = track[-1] # index in last time section
> <snip>
>
> (The object of the function is to connect closest points found in an
> image sequence into tracks connecting the points by shortest steps.)
>
> Anyhow, my question is, is this just an artifact of line_profiler, or
> is the fact that those two lines are hit almost 4e6 times really
> resulting in more than 50% of the time being spent here !?
> (Calculating the actual Euclidean distance matrix over all point pairs
> takes supposedly only 15% of the time, for comparison).
> I tried to separate out the "enumerate(self.tracks)" into a separate
> line before the "for"-line, but the time spent was still unchanged on
> the "for".
> Does this mean "python is slow" here - and I should try cython (which
> I have never done so far ...) ?
>
Well, that for-line contains enumerate, which is a non-trivial
thing (an iterator) and results in several function calls being made.
And that if-test line contains self.tracks_tLast[...] which is rather
costly (one attribute lookup to get tracks_tLast (which you can put in a
function local variable for a minor speedup), and then element indexing
certainly isn't free either).
At least for the latter, put the value lookup into a temporary variable
on a separate line to get better profiling results.
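
A minimal sketch of both suggestions, for illustration only. The Tracker
class, its data and the two count_active_* methods are hypothetical
stand-ins, not the original code. The attributes are hoisted into locals
before the loop, and the indexing gets its own line so that the profiler can
report its cost separately from the comparison.

```python
class Tracker:
    """Hypothetical stand-in for the poster's object; not the original code."""

    def __init__(self, tracks, tracks_tLast):
        self.tracks = tracks              # list of tracks (lists of point indices)
        self.tracks_tLast = tracks_tLast  # last time index seen for each track

    def count_active_original(self, t):
        # Same shape as the profiled loop: attribute lookup, indexing and
        # comparison all happen on the "if" line.
        n = 0
        for tracki, track in enumerate(self.tracks):
            if self.tracks_tLast[tracki] == t - 1:
                n += 1
        return n

    def count_active_split(self, t):
        # Look the attributes up once, outside the loop.
        tracks = self.tracks
        tracks_tLast = self.tracks_tLast
        t_prev = t - 1
        n = 0
        for tracki, track in enumerate(tracks):
            t_last = tracks_tLast[tracki]  # indexing on its own line for profiling
            if t_last == t_prev:
                n += 1
        return n


tracker = Tracker(tracks=[[0, 3], [1], [2, 4, 5]], tracks_tLast=[4, 1, 4])
result_original = tracker.count_active_original(5)
result_split = tracker.count_active_split(5)
```

Both methods should return the same count. Only the way the work is spread
over lines differs, which is what changes the per-line profile.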
But at some point the running times per line are so low that the profiler's
own per-line overhead becomes comparable to the work being measured, so
line-profiling gets very inaccurate. If you still need more speed, that's
when you want to use Cython. If these lookups indeed dominate, Cython should
give good results.
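
One way to cross-check such small per-line timings without the tracing
overhead is to time whole loop variants with the standard timeit module. The
sketch below is an assumption-laden illustration: the data, the function
names and the zip-based variant are hypothetical and not taken from the
original code, and zip is an alternative of mine rather than something
suggested above (it assumes Python 3, where zip is lazy).

```python
import timeit

# Hypothetical data standing in for self.tracks / self.tracks_tLast.
tracks = [[i] for i in range(1000)]
tracks_tLast = [i % 50 for i in range(1000)]
t = 10


def loop_enumerate():
    # Same shape as the profiled loop: enumerate plus indexing.
    n = 0
    for tracki, track in enumerate(tracks):
        if tracks_tLast[tracki] == t - 1:
            n += 1
    return n


def loop_zip():
    # Alternative: iterate over both lists in parallel, no indexing.
    n = 0
    t_prev = t - 1
    for track, t_last in zip(tracks, tracks_tLast):
        if t_last == t_prev:
            n += 1
    return n


# Take the best of a few repeats; the minimum is the least noisy estimate.
timings = {
    f.__name__: min(timeit.repeat(f, number=200, repeat=3))
    for f in (loop_enumerate, loop_zip)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds / 200 * 1e6:.1f} us per call")
```

The absolute numbers depend on the machine and the Python version, so only
the relative difference between the variants is meaningful.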
Dag Sverre