[Numpy-discussion] Fwd: GPU Numpy

Sturla Molden sturla at molden.no
Sat Aug 22 05:50:28 EDT 2009


Erik Tollerud skrev:
>> Putting NumPy arrays in GPU memory is an easy task. But then I would
>> have to write the computation in OpenCL's dialect of C99?
> This is true to some extent, but also probably difficult to do given
> the fact that parallelizable algorithms are generally more difficult
> to formulate in straightforward ways.
Then you have misunderstood me completely. Creating an ndarray that has
a buffer in graphics memory is not too difficult, given that graphics
memory can be memory mapped. This has nothing to do with whether an
algorithm is parallelizable. It is just memory management. We could make
an ndarray subclass that quickly puts its content in a buffer accessible
to the GPU. That is not difficult. But then comes the question of what
you do with it.

I think many here misunderstand the issue:

The teraflops peak performance of modern GPUs is impressive. But NumPy 
cannot easily benefit from it. In fact, there is little or nothing to 
gain from optimising at that end. For a GPU to help, computation must 
be the time-limiting factor. It is not. There is no more to say about 
using GPUs in NumPy right now.

Take a look at the timings here: http://www.scipy.org/PerformancePython 
They show that computing with NumPy is more than ten times slower than 
using plain C. This is despite NumPy being written in C. The NumPy code 
does not incur ten times more floating point operations than the C code. 
The floating point unit does not run in turtle mode when using NumPy. 
NumPy's relative slowness compared to C has nothing to do with floating 
point computation. It is due to inferior memory use (temporary buffers, 
multiple buffer traversals) and slow memory access. Moving computation 
to the GPU can only make this worse.
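
To make the memory traffic concrete (the arrays and sizes here are just
for illustration):

    import numpy as np

    n = 1000000
    a = np.random.rand(n)
    b = np.random.rand(n)
    c = np.random.rand(n)

    # y = a*b + c makes NumPy traverse memory several times:
    #   t = a*b    -> one pass, one temporary array allocated
    #   y = t + c  -> a second pass, and a second allocation
    y = a*b + c

    # A C loop computes the same thing in a single pass over a, b
    # and c, with no temporaries:
    #   for (i = 0; i < n; i++) y[i] = a[i]*b[i] + c[i];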

Improved memory usage - e.g. through lazy evaluation and JIT compilation 
of expressions - can give up to a tenfold increase in performance. That 
is where we must start optimising to get a faster NumPy. Incidentally, 
this will also make it easier to leverage modern GPUs.
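
The third-party numexpr package already demonstrates the idea: it
compiles a whole expression and evaluates it blockwise, in one pass,
without the large temporaries. A small example of my own (untested):

    import numpy as np
    import numexpr as ne

    a = np.random.rand(1000000)
    b = np.random.rand(1000000)
    c = np.random.rand(1000000)

    y1 = a*b + 2.0*c                 # plain NumPy: temporaries, many passes
    y2 = ne.evaluate("a*b + 2.0*c")  # one compiled, blocked pass

    assert np.allclose(y1, y2)

On large arrays this kind of evaluation tends to be considerably
faster, for exactly the memory reasons above.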

Sturla Molden


