[Numpy-discussion] profiling line by line

Thu Sep 18 06:46:29 EDT 2008

On Thu, Sep 18, 2008 at 1:29 AM, Robert Kern <robert.kern at gmail.com> wrote:
> On Wed, Sep 17, 2008 at 18:09, Ondrej Certik <ondrej at certik.cz> wrote:
>> On Wed, Sep 17, 2008 at 3:56 AM, Robert Kern <robert.kern at gmail.com> wrote:
>>> On Mon, Sep 15, 2008 at 11:13, Arnar Flatberg <arnar.flatberg at gmail.com> wrote:
>>>> That would make me an extremely happy user, I've been looking for this for
>>>> years!
>>>> I can't imagine I'm the only one who profiles some hundred lines of code and
>>>> ends up with 90% of total time in the dot-function
>>>
>>> For the time being, you can grab it here:
>>>
>>> http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/line_profiler/
>>>
>>> It requires Cython and a C compiler to build. I'm still debating
>>> myself about the desired workflow for using it, but for now, it only
>>> profiles functions which you have registered with it. I have made the
>>> profiler work as a decorator to make this easy. E.g.,
>>>
>>>  from line_profiler import LineProfiler
>>>
>>>  profile = LineProfiler()
>>>
>>>  @profile
>>>  def my_slow_func():
>>>      ...
>>>
>>>  profile.dump_stats('my_slow_func.lprof')
>>>
>>> This is kind of inconvenient, so I have a little script that handles
>>> everything except for the @profile itself. It started as a script to
>>> run cProfile nicely, so it actually does that by default.
>>>
>>> I took pystone.py from the Python source and added a couple of
>>> @profile decorators to demonstrate:
>>>
>>>
>>> [line_profiler]$ ./kernprof.py --help
>>> Usage: ./kernprof.py [-s setupfile] [-o output_file_path] scriptfile [arg] ...
>>>
>>> Options:
>>>  -h, --help            show this help message and exit
>>>  -l, --line-by-line    Use the line-by-line profiler from the line_profiler
>>>                        module instead of cProfile. Implies --builtin.
>>>  -b, --builtin         Put 'profile' in the builtins. Use 'profile.enable()'
>>>                        and 'profile.disable()' in your code to turn it on and
>>>                        off, or '@profile' to decorate a single function, or
>>>                        'with profile:' to profile a single section of code.
>>>  -o OUTFILE, --outfile=OUTFILE
>>>                        Save stats to <outfile>
>>>  -s SETUP, --setup=SETUP
>>>                        Code to execute before the code to profile
>>>
>>> [line_profiler]$ ./kernprof.py -l pystone.py
>>> Pystone(1.1) time for 50000 passes = 11.34
>>> This machine benchmarks at 4409.17 pystones/second
>>> Wrote profile results to pystone.py.lprof
>>>
>>> [line_profiler]$ ./view_line_prof.py pystone.py.lprof
>>> Timer unit: 1e-06 s
>>>
>>> File: pystone.py
>>> Function: Proc0 at line 79
>>> Total time: 8.46516 s
>> [...]
>>
>> This is what I am getting:
>>
>> $ ./kernprof.py -l pystone.py
>> Wrote profile results to pystone.py.lprof
>> $ ./view_line_prof.py pystone.py.lprof
>> Timer unit: 1e-06 s
>>
>> $
>>
>> So I think you meant:
>>
>> $ ./kernprof.py -l mystone.py
>> 20628
>> Wrote profile results to mystone.py.lprof
>> $ ./view_line_prof.py mystone.py.lprof
>> Timer unit: 1e-06 s
>>
>> File: pystone.py
>> Function: Proc0 at line 79
>> Total time: 13.0803 s
>> [...]
>>
>> Now it works.
>
> No, I meant pystone.py. My script-finding code may have (incorrectly)
> found a different, uninstrumented pystone.py file somewhere else,
> though. Try with "./pystone.py".
>
>> This is an excellent piece of software! Nice job. Just today I needed
>> such a thing!
>>
>>
>> How do you easily install it?
>
> "python setup.py install" should have installed the module. I haven't
> done anything with the scripts, yet.
>
>> I usually do "python setup.py install
>> --home=~/lib" and I have the PYTHONPATH + PATH setup in my .bashrc,
>> but I then need to manually remove the stuff from my ~/lib if I want
>> to uninstall, which sucks. So this time I just did "python setup.py
>> build" and moved the .so file manually to the current dir. But there
>> must be a better way. What is your workflow?
>
> For things I am developing on, I build them in-place, using the
> setuptools "develop" command to add the appropriate path to the
> easy-install.pth file. To remove, I would just edit that file.
>
> For thing I'm not developing on, I usually build and install an egg if
> at all possible.
>
> But then, I'm typically on a single-user box where I'm root, so I
> sometimes do nasty and unsanitary things like chown -R rkern:rkern
> /usr/local/.
>
>> Anyway, so I used it on my code and here is what I got:
>>
>> File: hermes1d/mesh.py
>> Function: integrate_function at line 119
>> Total time: 0.647412 s
>>
>> Line #      Hits         Time   % Time  Line Contents
>> =====================================================
>>   119                                      @profile
>>   120                                      def integrate_function(self, f):
>>   121                                          """
>>   122                                          Integrate the function
>> "f" on the element.
>>   123                                          """
>>   124        96         1091      0.2          from numpy import array
>>   125        96       461070     71.2          from scipy.integrate
>> import quadrature
>>   126        96          496      0.1          a, b =
>> self.nodes[0].x, self.nodes[1].x
>>   127        96          418      0.1          def func(x):
>>   128                                              #print x
>>   129                                              #print
>> array([f.f(y) for y in x])
>>   130                                              return
>> array([f.f(y) for y in x])
>>   131        96       183903     28.4          val, err =
>> quadrature(func, a, b)
>>   132                                          #print val, a, b
>>   133                                          #stop
>>   134        96          434      0.1          return val
>>
>>
>>
>> This is interesting, because 70% of time is spent on "from
>> scipy.integrate import quadrature". So this is not good.
>>
>> So I moved it out and voila! My code is 3.4x faster. I really didn't
>> know that importing (scipy.integrate) is *so* slow.
>
> It should only be slow the first time. If it's not, then there's a
> problem somewhere (and I wouldn't put it past being in the profiler;
> it might screw up something in the import mechanism). Can you do the
> import both outside the function *and* inside to see if that changes
> the result?

Sure. "b.py":

from scipy.integrate import quadrature

---------------
def a():
    b = 1

for x in range(1000000):
    a()
---------------

"a.py":

---------------
def a():
    from scipy.integrate import quadrature
    b = 1

for x in range(1000000):
    a()
---------------

and:

ondra at pc232:~/repos/hermes1d$ time python a.py

real	0m2.655s
user	0m2.612s
sys	0m0.044s

ondra at pc232:~/repos/hermes1d$ time python b.py

real	0m0.620s
user	0m0.540s
sys	0m0.080s

So I guess that speaks for itself. Don't have time to dig more, but I
think it's this line in scipy/integrate/__init__:

__all__ = filter(lambda s:not s.startswith('_'),dir())

But maybe also something else, I don't know.

Ondrej

Ondrej