[Numpy-discussion] Speeding up Numeric
Rory Yorke
ryorke at telkomsa.net
Mon Jan 24 10:54:16 EST 2005
Todd Miller <jmiller at stsci.edu> writes:
> This looked fantastic so I tried it over the weekend. On Fedora Core 3,
> I couldn't get any information about numarray runtime (in the shared
> libraries), only Python. Ditto with Numeric, although from your post
> you apparently got great results including information on Numeric .so's.
> I'm curious: has anyone else tried this for numarray (or Numeric) on
> Fedora Core 3? Does anyone have a working profile script?
I think you need to have --separate=lib when invoking opcontrol. (See
later for an example.)
Some comments on oprofile:
- I think the oprofile tools (opcontrol, opreport etc.) are separate
from the oprofile module, which is part of the kernel. I installed
oprofile-0.8.1 from source, and it works with my standard Ubuntu
kernel. It is easy to install it in a non-standard location
($HOME/usr on my system).
- I think opstack is part of oprofile 0.8 (or maybe 0.8.1) -- it
wasn't in the 0.7.1 package available for Ubuntu. Also, to actually
get callgraphs (from opstack), you need a patched kernel; see here:
http://oprofile.sf.net/patches/
- I think you probably *shouldn't* compile with -pg if you use
oprofile, but you should use -g.
To profile shared libraries, I also tried the following:
- sprof. Some sort of dark art glibc tool. I couldn't get this to work
with dlopen()'ed libraries (in which class I believe Python C
extensions fall).
- qprof (http://www.hpl.hp.com/research/linux/qprof/). Almost worked,
but I couldn't get it to identify symbols in shared libraries. Their
page has a list of other profilers.
I also tried the Python 2.4 profile module; it does support
C-extension functions as advertised, but it seemed to miss object
instantiation calls (_numarray._numarray's instantiation, in this
case).
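For comparison, here is a minimal sketch of the profile-module approach. It assumes a current Python 3 and `cProfile` rather than the Python 2.4 `profile` module used above, and `add_lists` is a hypothetical pure-Python stand-in for the numarray addition, so it says nothing about how C-extension calls are reported:

```python
import cProfile
import io
import pstats


def add_lists(a, b):
    """Hypothetical stand-in for numarray's a + b: elementwise sum of two lists."""
    return [x + y for x, y in zip(a, b)]


def workload(n_iter=1000, size=1000):
    """Repeat the addition, mirroring the loop in longadd.py at a smaller scale."""
    a = [float(i) for i in range(size)]
    b = [float(i) for i in range(size)]
    result = None
    for _ in range(n_iter):
        result = add_lists(a, b)
    return result


def profile_workload():
    """Run workload() under cProfile and return the text report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    workload()
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Sort by cumulative time and keep only the top ten entries.
    stats.sort_stats("cumulative").print_stats(10)
    return buf.getvalue()


if __name__ == "__main__":
    print(profile_workload())
```

The report lists each Python-level function with call counts and timings; whether object instantiation inside an extension module shows up is exactly the gap noted above.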
Sample oprofile usage on my Ubuntu box:
rory@foo:~/hack/numarray/profile $ cat longadd.py
import numarray as na
a = na.arange(1000.0)
b = na.arange(1000.0)
for i in xrange(1000000):
    a + b
rory@foo:~/hack/numarray/profile $ sudo modprobe oprofile
Password:
rory@foo:~/hack/numarray/profile $ sudo ~/usr/bin/opcontrol --start --separate=lib
Using 2.6+ OProfile kernel interface.
Using log file /var/lib/oprofile/oprofiled.log
Daemon started.
Profiler running.
rory@foo:~/hack/numarray/profile $ sudo ~/usr/bin/opcontrol --reset
Signalling daemon... done
rory@foo:~/hack/numarray/profile $ python2.4 longadd.py
rory@foo:~/hack/numarray/profile $ sudo ~/usr/bin/opcontrol --shutdown
Stopping profiling.
Killing daemon.
rory@foo:~/hack/numarray/profile $ opreport -t 2 -l $(which python2.4)
CPU: Athlon, speed 1836.45 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name        symbol name
47122    11.2430  _ufuncFloat64.so  add_ddxd_vvxv
26731     6.3778  python2.4         PyEval_EvalFrame
24122     5.7553  libc-2.3.2.so     memset
21228     5.0648  python2.4         lookdict_string
10583     2.5250  python2.4         PyObject_GenericGetAttr
 9131     2.1786  libc-2.3.2.so     mcount
 9026     2.1535  python2.4         PyDict_GetItem
 8968     2.1397  python2.4         PyType_IsSubtype
(The idea wasn't really to discuss the results, but anyway: The
prominence of memset is a little odd -- are destination arrays zeroed
before being assigned the sum result?)
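To see how the listed samples split between the ufunc, the interpreter and libc, the report rows can be summed per image. The following is a rough sketch only: it assumes each data row has exactly four whitespace-separated fields, as in the listing above, and `samples_per_image` is a hypothetical helper, not part of oprofile.

```python
from collections import defaultdict

# Data rows copied from the opreport listing above.
SAMPLE_REPORT = """\
samples % image name symbol name
47122 11.2430 _ufuncFloat64.so add_ddxd_vvxv
26731 6.3778 python2.4 PyEval_EvalFrame
24122 5.7553 libc-2.3.2.so memset
21228 5.0648 python2.4 lookdict_string
10583 2.5250 python2.4 PyObject_GenericGetAttr
9131 2.1786 libc-2.3.2.so mcount
9026 2.1535 python2.4 PyDict_GetItem
8968 2.1397 python2.4 PyType_IsSubtype
"""


def samples_per_image(report_text):
    """Sum the sample counts of an 'opreport -l' style listing per image name."""
    totals = defaultdict(int)
    for line in report_text.splitlines():
        fields = line.split()
        # Skip the header and anything else that is not a four-field data row.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        samples, _percent, image, _symbol = fields
        totals[image] += int(samples)
    return dict(totals)


if __name__ == "__main__":
    totals = samples_per_image(SAMPLE_REPORT)
    for image, count in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{count:8d}  {image}")
```

Only symbols above the 2% threshold appear in the listing, so these sums cover the visible rows, not the whole run.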
To get the libc symbols you need a libc with debug symbols -- on
Ubuntu this is the libc-dbg package; I don't know what it'll be on
Fedora or other systems. Set the LD_LIBRARY_PATH variable to force
these debug libraries to be loaded:
export LD_LIBRARY_PATH=/usr/lib/debug
This is probably not all that useful -- I suppose it might be
interesting if one generates callgraphs. I don't (yet) have a modified
kernel, so I haven't tried this.
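The whole session above could be collected into a small driver script. This is a sketch under assumptions: it uses only the commands and flags shown in this message, the function names are hypothetical, `opcontrol` is assumed to be on the PATH (on my box it lives in ~/usr/bin), and sudo may not pass LD_LIBRARY_PATH through to the commands it runs.

```python
import os
import shutil
import subprocess


def debug_env(debug_lib_dir="/usr/lib/debug"):
    """Copy the current environment and point LD_LIBRARY_PATH at the debug libs."""
    env = dict(os.environ)
    env["LD_LIBRARY_PATH"] = debug_lib_dir
    return env


def build_session(script, opcontrol="opcontrol", python="python2.4"):
    """Return the command lines of the profiling session, in order."""
    # Equivalent of $(which python2.4); fall back to the bare name if not found.
    python_path = shutil.which(python) or python
    return [
        ["sudo", "modprobe", "oprofile"],
        ["sudo", opcontrol, "--start", "--separate=lib"],
        ["sudo", opcontrol, "--reset"],
        [python, script],
        ["sudo", opcontrol, "--shutdown"],
        ["opreport", "-t", "2", "-l", python_path],
    ]


def run_session(commands, env=None):
    """Run each command in turn, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True, env=env)
```

Calling `run_session(build_session("longadd.py"), env=debug_env())` would then repeat the steps above; it needs root and a loaded oprofile module, so the sketch only builds the commands until asked to run them.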
Have fun,
Rory