Question about scientific calculations in Python

Andrew Dalke dalke at dalkescientific.com
Wed Mar 13 00:15:43 EST 2002


Martin Kaufmann:
>    for s in s_vector[1:]:
>        sum = 0
>        for r in r_vector[1:]:
>            x = 2 * pi * r * s
>            sum = sum + (2 * (sin(x))/x)
>        intensity = atoms * pow(f_s_vector[n], 2) * (1 + sum / atoms)
>        i_vector.append(intensity)
>        n = n + 1

Here are some performance improvements you can make:

def ...
  # Precompute the 2*pi*r, which is a constant inside of the s loop
  precomputed_r1_vector = [2 * pi * r for r in r_vector[1:]]
  local_sin = math.sin  # local variables have fast lookup

  # using the zip gets rid of the extra n+1 and [n], at the expense
  # of making a new list.  I suspect it's faster, but I'm not sure
  for s, f_s in zip(s_vector[1:], f_s_vector[1:]):
    sum = 0.0
    for scaled_r in precomputed_r1_vector:
      x = scaled_r * s
      sum = sum + 2 * (local_sin(x)/x)

    # Did some algebra here to merge the first and last terms,
    # which gets rid of the division.  Also replaced the function call
    # to 'pow' with the equivalent use of **2
    i_vector.append((f_s**2) * (atoms + sum))

Some of these changes are only faster for sufficiently long arrays,
so you should test them out yourself.  The biggest trick is to
precompute as much as possible (Python doesn't optimize constant
expressions).  This includes precomputing lookups of function
references, like the "local_sin".  Also, you can precompute the
lookup of i_vector.append, as in
  i_vector_append = i_vector.append
  ...
  for ...
    i_vector_append((f_s_vector**2) * (atoms + sum))
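Pulling those tweaks together, here's a minimal runnable sketch.  The
function name, argument order, and sample data are illustrative -- the
quoted loop didn't show the full function definition:

```python
from math import pi, sin

def intensities(s_vector, r_vector, f_s_vector, atoms):
  # Precompute the constant 2*pi*r factors once, outside the s loop.
  precomputed_r1_vector = [2 * pi * r for r in r_vector[1:]]
  local_sin = sin                    # local name: faster lookup
  i_vector = []
  i_vector_append = i_vector.append  # precomputed bound-method lookup
  for s, f_s in zip(s_vector[1:], f_s_vector[1:]):
    sum = 0.0
    for scaled_r in precomputed_r1_vector:
      x = scaled_r * s
      sum = sum + 2 * (local_sin(x) / x)
    # Merged first and last terms: atoms*f**2*(1 + sum/atoms)
    # simplifies to f**2 * (atoms + sum).
    i_vector_append((f_s ** 2) * (atoms + sum))
  return i_vector
```

With trivial inputs (one s value, one r value) this returns a
one-element list, so you can sanity-check the algebra before timing it
on real data.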

>Now the histogram code where the radii are binned in the histogram
>hist and only the bins with a value are used:

That code is almost identical to the previous code, so you can do
the same optimization tweaks.  You could also precompute which hist
elements have entry[1] == 0 so you don't do that check every time
in the loop.  (As it is now, the "if" check is probably more expensive
than the calculation, even if you know it's zero, unless there are a
lot of zeros.)
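For example, a sketch of that precomputation, assuming hist holds
(radius, count) pairs -- the exact layout of your hist isn't shown
here, so the names are illustrative:

```python
# Illustrative histogram: (radius, count) pairs.  The real layout
# may differ; this just shows the idea.
hist = [(0.5, 3), (1.0, 0), (1.5, 7), (2.0, 0)]

# Filter out the empty bins once, before the expensive s loop,
# so the inner loop never tests "entry[1] == 0".
nonzero_bins = [entry for entry in hist if entry[1] != 0]
```

The inner loop then iterates over nonzero_bins directly, with no
conditional at all.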

If your lists are really long (O(100,000) or more?) then you might find
that the repeated appends start taking too long -- growing a list one
element at a time forces occasional reallocations -- so that precomputing

  i_vector = [None] * (len(s_vector) - 1)

then keeping an index 'n' to replace
  i_vector.append( ...)
with
  i_vector[n] = ...
is faster.  But this is unlikely.
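In sketch form, with stand-in data and a stand-in calculation in place
of the real intensity formula:

```python
s_vector = [0, 1, 2, 3]          # stand-in input data

# Preallocate the result list instead of growing it with append.
i_vector = [None] * (len(s_vector) - 1)
n = 0
for s in s_vector[1:]:
  i_vector[n] = s * s            # stand-in for the real intensity
  n = n + 1
```

As noted, this only helps for very long lists, so time it before
committing to the extra index bookkeeping.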

Probably the best optimization you can do, if PyInline is an acceptable
approach, is to move that sinc sum into C code.  It reduces to
taking two lists and returning a float.

                    Andrew
                    dalke at dalkescientific.com
