Question about scientific calculations in Python
Martin Kaufmann
martinkaufmann at yahoo.com
Wed Mar 13 06:40:38 EST 2002
On Tue, 12 Mar 2002 22:15:43 -0700, "Andrew Dalke"
<dalke at dalkescientific.com> wrote:
Thanks a lot for your long answer!
>Here's some performance improvements you can make
[lots of very helpful stuff snipped...]
>Some of these changes are only faster for sufficiently long arrays,
>so you should test them out yourself. The biggest trick is to
>precompute as much as possible (Python doesn't optimize constant
>expressions). This includes precomputing lookups of functions
>references, like the "local_sin".
Yes, I've heard about this "local" trick, but I haven't tried it yet.
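For reference, the "local" trick is just binding a frequently called function to a local name once, outside the loop, so Python skips the repeated module-namespace lookup on every iteration. A minimal sketch (function names here are made up for illustration):

```python
import math

def slow(xs):
    # math.sin is looked up in the math module on every iteration
    return [math.sin(x) for x in xs]

def fast(xs):
    # bind the function to a local name once, before the loop
    local_sin = math.sin
    return [local_sin(x) for x in xs]

xs = [i * 0.001 for i in range(100000)]
assert slow(xs) == fast(xs)  # identical results, fewer name lookups
```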
>That code is almost identical with the previous code, so you can do
>the same optimization tweaks. You could also precompute which hist
>elements have entry[1] == 0 so you don't do that check every time
>in the loop. (As it is now, the "if" check is probably more expensive
>than the calculation, even if you know it's zero, unless there are a
>lot of zeros.)
I didn't quite understand this part. How should I precompute the zero
entries? How do I know where they are? Or should I just cut them out
beforehand and access the histogram differently?
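One way to read the suggestion (a sketch only, assuming each hist entry is a pair whose second element is the count, as the quoted "entry[1] == 0" implies): filter out the zero-count entries once, before the loop, so the per-iteration "if" test disappears. Entries with a zero count contribute nothing to the sum anyway.

```python
# Hypothetical histogram: each entry is a (bin_value, count) pair.
hist = [(0.5, 3), (1.0, 0), (1.5, 7), (2.0, 0), (2.5, 1)]

# Precompute once: keep only the entries whose count is nonzero,
# instead of testing entry[1] == 0 inside the hot loop.
nonzero = [entry for entry in hist if entry[1] != 0]

# Any sum over the nonzero entries equals the sum over all entries,
# since zero-count bins contribute nothing.
total = sum(count for _, count in nonzero)
assert total == sum(count for _, count in hist)
```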
>If your lists are really long (O(100,000) or more?) then you might find
>that the append starts taking too long -- append is asymptotically O(n**2)
>in Python -- so that precomputing
>
> i_vector = [None] * (len(s_vector) - 1)
>
>then keeping an index 'n' to replace
> i_vector.append( ...)
>with
> i_vector[n] = ...
>is faster. But this is unlikely.
The lists are not that long, probably only a couple of thousand
entries.
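For what it's worth, the preallocation pattern from the quoted text looks like this in full (a sketch; the squaring is just a placeholder for the real per-element computation, and s_vector here is made-up input):

```python
s_vector = [0.1 * k for k in range(1, 1001)]  # hypothetical input data

# Preallocate the result list once instead of growing it with append().
i_vector = [None] * (len(s_vector) - 1)
n = 0
for s in s_vector[1:]:
    i_vector[n] = s * s  # placeholder for the real calculation
    n += 1

assert len(i_vector) == len(s_vector) - 1
assert None not in i_vector  # every preallocated slot was filled
```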
>Probably the best optimization you can do, if PyInline is an acceptable
>approach, is to put that sinc sum into C code. It reduces to
>taking two lists and returning a float.
I tried to use weave.inline. Here is my code. But put your coffee down
before you read it (you might otherwise spill it over your keyboard),
and _please_ don't laugh out loud... I really didn't know what I was
doing, but it somehow worked...
----------- code begin -------------
code = """
PyObject *intens=PyList_New(s_length);
double sum, x, s, h, f_s, i;
for (long t=1;t<s_length;t++) {
s=py_to_float(PyList_GetItem(s_vector.ptr(),t),"s");
sum=0;
for (long r=1;r<hist_length;r++) {
x=2*PI*s*r/hist_mult;
h=py_to_int(PyList_GetItem(hist.ptr(),r),"h");
sum=sum+(2*h*sin(x)/x);
}
f_s=py_to_float(PyList_GetItem(f_s_vector.ptr(),t),"f_s");
i=atoms*f_s*f_s*(1+sum/atoms);
PyList_SetItem(intens, t, PyFloat_FromDouble(i));
}
return_val = Py::new_reference_to(intens);
"""
intens = weave.inline(code, ['s_length', 's_vector',
'hist_length', 'hist', 'hist_mult', 'PI','atoms',
'f_s_vector'],
verbose=2, force=0, compiler = 'gcc')
------------- code end ------------
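For comparison (and to spell out what the C snippet computes), here is a pure-Python version of the same double loop. The function name and the sample inputs are made up; the variable names follow the weave code. Note that, like the C code, it leaves index 0 of the result unfilled:

```python
from math import sin, pi

def intensity(s_vector, f_s_vector, hist, hist_mult, atoms):
    # Pure-Python equivalent of the weave.inline double loop.
    # Like the C version, index 0 of the result is never assigned.
    intens = [None] * len(s_vector)
    for t in range(1, len(s_vector)):
        s = s_vector[t]
        total = 0.0
        for r in range(1, len(hist)):
            x = 2 * pi * s * r / hist_mult
            total += 2 * hist[r] * sin(x) / x
        f_s = f_s_vector[t]
        intens[t] = atoms * f_s * f_s * (1 + total / atoms)
    return intens

# Tiny made-up example: 3 s-values, 3 histogram bins
result = intensity([0.0, 0.5, 1.0], [0.0, 2.0, 1.5], [0, 4, 1], 10.0, 56)
```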
Today I ran a couple of tests, but only on small clusters. The timing
routine measured not only the double loop but also the radius
calculation before it. Here are the results:
------------- results
56 Atom Bi Cluster
running calculate_nohist 3 times took 26.393 s (8.798 s per run)
running calculate_hist 3 times took 17.300 s (5.767 s per run)
running calculate_hist_sci 3 times took 33.052 s (11.017 s per run)
running calculate_hist_weave 3 times took 3.101 s (1.034 s per run)
124 Atom Bi Cluster
running calculate_nohist 3 times took 126.595 s (42.198 s per run)
running calculate_hist 3 times took 21.728 s (7.243 s per run)
running calculate_hist_sci 3 times took 58.345 s (19.448 s per run)
running calculate_hist_weave 3 times took 4.247 s (1.416 s per run)
------------
The "nohist" is the first solution (slow), the "hist" is my second
solution with histograms, the "hist_sci" is the solution using
Scientific Python histograms, and the "hist_weave" is the solution
presented above. The speed of the weave solution is quite incredible!
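For anyone who wants to reproduce this kind of comparison, the "N times took X s (Y s per run)" report can come from a small helper like this (a sketch; the helper name is made up, and time.perf_counter is used here rather than whatever clock the original timing routine used):

```python
import time

def time_runs(func, runs=3):
    # Run func several times and report total and per-run wall time,
    # in the same format as the results above.
    start = time.perf_counter()
    for _ in range(runs):
        func()
    total = time.perf_counter() - start
    print("running %s %d times took %.3f s (%.3f s per run)"
          % (func.__name__, runs, total, total / runs))
```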
Thanks again,
Martin