[SciPy-User] Using GEMM with specified output-array
Sturla Molden
sturla.molden at gmail.com
Tue Mar 11 14:23:27 EDT 2014
Thomas Unterthiner <thomas_unterthiner at web.de> wrote:
> Hi there!
>
> I would like to use *GEMM without allocating an output-array. So I used
> the functions from scipy.linalg.blas, however it seems as if the
> 'overwrite_c' argument gets completely ignored:
>
> from scipy.linalg.blas import sgemm
> X = np.random.randn(30, 50).astype(np.float32)
> Y = np.random.randn(50, 30).astype(np.float32)
> Z = np.zeros((X.shape[0], Y.shape[1]), dtype=np.float32)
> res = f(1.0, X, Y, c=Z, trans_a=0, trans_b=0, overwrite_c=1)
> assert res == np.dot(X, Y).all()
> print res is Z
>
> prints "False", meaning a new array is allocated for the result.
> However, I want my result to be written into 'Z'. Before anyone asks: I
> can't use np.dot's 'out' argument since I also want to specify 'alpha'.
>
> Is there anything I'm missing here? (I'm using scipy 0.13.3 / numpy
> 1.8.0 with OpenBLAS on Ubuntu 13.10).
You are passing C contiguous arrays to Fortran. f2py will copy and
transpose them. If you want to avoid this, always pass Fortran order arrays
to scipy.linalg.* functions. You create a Fortran order view of a C array
by taking .T (transpose is O(1) in NumPy) or create the array with keyword
argument order='F'. Check that .flags['F_CONTIGUOUS'] is True for all the
arrays you pass to scipy.linalg.
The "overwrite_c" option in GEMM is just a suggestion. It is not enforced.
If f2py has to copy and transpose your C input array it does nothing.
Sturla
More information about the SciPy-User
mailing list