Use of Python with GDAL. How to speed up ?

Serge Orlov Serge.Orlov at gmail.com
Sat Mar 25 00:03:25 EST 2006


Julien Fiore wrote:
> Thank you Serge for this generous reply,
>
> Vectorized code seems a great choice to compute the distance. If I
> understand well, vectorized code can only work when you don't need to
> access the values of the array, but only need to know the indices.

You can access array elements in any order once you wrap your head around
how indexing an array with arrays works. Transposing a matrix on the fly,
for example, can be done that way.
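
A minimal sketch of what indexing an array with arrays can look like for an on-the-fly transpose. The array `a` and the names `i_idx`, `j_idx` and `transposed` are made up for illustration, and numpy is assumed:

```python
import numpy

a = numpy.arange(6).reshape(2, 3)      # 2 rows, 3 columns

# Two index arrays shaped like the result (3, 2):
# i_idx[i, j] == i and j_idx[i, j] == j.
i_idx, j_idx = numpy.indices((3, 2))

# Element [i, j] of the result is a[j_idx[i, j], i_idx[i, j]] == a[j, i],
# so the values are read in transposed order without an explicit loop.
transposed = a[j_idx, i_idx]
```

The same pattern works for any read order: build index arrays with the shape of the result and use them as subscripts.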

> This
> works well for the distance, but I need to access the array values in
> the openness function.

What really kills vectorization is conditional control flow
constructs (if, break, continue). Some of them can still be
converted if you change the algorithm, but some kill vectorization. For
example, the first continue, where you exclude edges, is avoidable: you need
to break your main loop in two. The first loop will initialize the edges, the
second loop will work on the main image. But the second continue is a
killer. Another example of an algorithm change is small loops inside the
main loop:

E = 180.0
angle = numpy.min(90.0 -
    numpy.arctan((win[R+1:2*R][R] - win[R][R]) / dist[R+1:2*R][R]))
if angle < E: E = angle

Though maybe it will be slower for small R. You need to profile it.
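
To make the comparison concrete, here is a hedged sketch of a small inner loop next to a vectorized form. The original loop is not shown in this message, so the window shape (2R+1 by 2R+1), the loop bounds, the function names and the test data are all assumptions. The sketch uses the comma form `win[a:b, R]` to select a column, because the chained form `win[a:b][R]` picks row R of the sliced block instead:

```python
import numpy

def min_angle_loop(win, dist, R):
    """Scalar version: one arctan call per cell below the centre."""
    centre = win[R, R]
    E = 180.0
    for n in range(R + 1, 2 * R + 1):
        # arctan returns radians; the expression is kept as posted.
        angle = 90.0 - numpy.arctan((win[n, R] - centre) / dist[n, R])
        if angle < E:
            E = angle
    return E

def min_angle_vectorized(win, dist, R):
    """Same computation with one arctan call on a whole column slice."""
    centre = win[R, R]
    # win[a:b, R] selects column R of rows a..b-1.
    angles = 90.0 - numpy.arctan((win[R + 1:2 * R + 1, R] - centre)
                                 / dist[R + 1:2 * R + 1, R])
    return min(180.0, float(angles.min()))

# Hypothetical test data: a smooth surface and distances from the centre cell.
R = 3
y, x = numpy.indices((2 * R + 1, 2 * R + 1))
win = numpy.sin(y) + numpy.cos(x)
dist = numpy.hypot(y - R, x - R)
```

Both functions should return the same value for the same window. Timing them with the standard timeit module for the R you actually use will tell you which one wins.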

Another thing, not related to vectorization, is that Python does almost
none of the optimizations you would expect if this code were written in
C++ or Fortran:
1. Common subexpression elimination:
win[R][R] is evaluated many times in your code; assign it to a temp
variable before use.
R + 1, R - 1, 2 * R should be computed before all the loops.

2. To get the value of numpy.arctan, Python has to do two dictionary
lookups in your hottest loops. Assign it to a local variable at the top of
your function:
from numpy import arctan
If you are going to use min from my example, import it as well.
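
Both points can be sketched together as follows. The function name `column_angles` and the loop bounds are hypothetical, since the original function is not shown here; the point is only where the repeated work is moved:

```python
import numpy

def column_angles(win, dist, R):
    """Angles for the cells below the centre, with repeated work hoisted."""
    arctan = numpy.arctan       # local name: no global + attribute lookup per call
    centre = win[R][R]          # common subexpression, evaluated once
    first, stop = R + 1, 2 * R + 1   # loop bounds computed before the loop
    angles = []
    for n in range(first, stop):
        angles.append(90.0 - arctan((win[n][R] - centre) / dist[n][R]))
    return angles
```

The result is the same as with numpy.arctan and win[R][R] written out inside the loop. Only the number of lookups per iteration changes, so the gain has to be measured rather than assumed.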

>
> As regards array.array, it seems a bit complicated to reduce all my 2D
> arrays to 1D arrays because I am working with the gdal library, which
> works only with 'numeric' arrays.

You can try converting the numeric array into an array.array.
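
A rough sketch of such a conversion, assuming a numpy array rather than a Numeric one. The names `grid`, `flat` and `at` are made up for illustration:

```python
import array
import numpy

grid = numpy.arange(12, dtype=numpy.float64).reshape(3, 4)
ncols = grid.shape[1]

# Flatten row by row into a 1D array.array of C doubles.
flat = array.array('d', grid.ravel().tolist())

def at(row, col):
    """Read element (row, col) of the original 2D layout from the flat array."""
    return flat[row * ncols + col]
```

The 2D array stays available for gdal, while the flat copy is what the pure-Python loops index into.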

  Serge.



