[Numpy-discussion] Accelerating NumPy computations [Was: GPU Numpy]

Matthew Brett matthew.brett at gmail.com
Fri Aug 21 14:51:50 EDT 2009


Hi,

> Indeed. In the future, if OpenCL is the way to go, it may even be
> helpful to have Numpy using OpenCL directly, as AMD provides an SDK
> for OpenCL, and with Larrabee approaching, Intel will surely provide
> one of its own.

I was just in a lecture by one of the Intel people about OpenCL:

http://parlab.eecs.berkeley.edu/bootcampagenda
http://parlab.eecs.berkeley.edu/sites/all/parlab/files/OpenCL_Mattson.pdf

He offered no schedule for an Intel OpenCL implementation, but said
that they were committed to it.

The lectures in general were effective in pointing out what a
time-consuming effort it can be to move algorithms into the
parallel world - including GPUs.  The most recent lecture cited the
example of a CUDA-based BLAS implementation that ran slower on the
GPU than the CPU version.  Making BLAS go faster required a lot of
work to find good strategies for blocking, for transfers between CPU
memory, GPU shared memory, and GPU registers, for vector sizes, and
so on - all tuned to one specific NVIDIA architecture.
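A quick back-of-envelope model shows why a naive GPU port can lose to
the CPU: the PCIe transfer cost scales with the data (n^2) while the
arithmetic scales as n^3, so small problems never amortize the copy.
The bandwidth, latency, and throughput figures below are made-up
round numbers for illustration, not measurements of any real card:

```python
def gpu_offload_speedup(n, pcie_gb_s=4.0, latency_s=10e-6,
                        cpu_gflops=10.0, gpu_gflops=100.0):
    """Rough estimated speedup of offloading an n x n float64 matrix
    multiply to a GPU, counting the cost of copying A and B over the
    bus and C back.  All hardware numbers are assumed, not measured."""
    flops = 2.0 * n ** 3                 # multiply-adds in C = A @ B
    bytes_moved = 3 * n * n * 8          # A, B over, C back (float64)
    cpu_time = flops / (cpu_gflops * 1e9)
    gpu_time = (flops / (gpu_gflops * 1e9)
                + bytes_moved / (pcie_gb_s * 1e9)
                + 3 * latency_s)         # three separate bus transfers
    return cpu_time / gpu_time

# Small matrices lose (transfer and latency dominate); big ones win.
print(gpu_offload_speedup(32))    # well below 1: slower than the CPU
print(gpu_offload_speedup(4000))  # well above 1: the GPU pays off
```

Under this toy model the break-even size is a few dozen; in practice
the blocking and register-allocation work mentioned above shifts the
GPU curve, but the shape of the trade-off is the same.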

I can imagine Numpy being useful for scripting in this
C-and-assembler-centric world, making it easier to write automated
testers, or even generate C code.
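As a sketch of the "automated tester" idea: NumPy's own dot product
makes a convenient reference oracle for checking a hand-tuned C or
GPU kernel over awkward sizes.  The `naive_matmul` stand-in below is
hypothetical - in real use you would wrap the candidate C/CUDA
routine instead:

```python
import numpy as np

def check_matmul(candidate, sizes=(3, 16, 65), rtol=1e-6, seed=0):
    """Compare a candidate matrix-multiply routine against NumPy's
    reference np.dot over a few awkward (odd, non-block-aligned)
    sizes with reproducible random inputs."""
    rng = np.random.RandomState(seed)
    for n in sizes:
        a = rng.randn(n, n)
        b = rng.randn(n, n)
        if not np.allclose(candidate(a, b), np.dot(a, b), rtol=rtol):
            return False
    return True

# Hypothetical stand-in for the kernel under test; deliberately slow.
def naive_matmul(a, b):
    n = a.shape[0]
    c = np.zeros_like(a)
    for i in range(n):
        for j in range(n):
            c[i, j] = np.sum(a[i, :] * b[:, j])
    return c

print(check_matmul(naive_matmul))   # True
```

The same harness could time the candidate across sizes, which is the
kind of scripting glue that is painful to write in C.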

Is anyone out there working on this kind of stuff?  I ask only because
there seems to be considerable interest here on the Berkeley campus.

Best,

Matthew
