[Numpy-discussion] NumPy-Discussion OpenBLAS and dotblas

Nathaniel Smith njs at pobox.com
Tue Aug 12 10:26:09 EDT 2014


Hi Matt,

On Mon, Aug 11, 2014 at 10:46 PM, Matti Picus <matti.picus at gmail.com> wrote:
> Hi Nathaniel.
> Thanks for your prompt reply. I think numpy is a wonderful project, and you
> all do a great job moving it forward.
> If you ask what would my vision for maturing numpy, I would like to see a
> grouping of linalg matrix-operation functionality into a python level
> package, exactly the opposite of more tightly tying linalg into the core of
> numpy.

As I understood it (though I admit Chuck was pretty terse, maybe he'll
correct me :-)), what he was proposing was basically just a build
system reorganization -- it's much easier to call between C functions
that are in the same Python module than C functions that are in
different modules, so we end up with lots of boilerplate gunk for the
latter. I don't think it would involve any tighter coupling than we
already have in practice.

> The orthogonality would allow groups like PyOpenCL to reuse the matrix
> operations on data located off the CPU's RAM, just to give one example; and
> make it easier for non-numpy developers to create a complete replacement of
> lapack with other implementations.

I guess I don't really understand what you're suggesting. If we have a
separate package that is the same as current np.linalg, then how does
that allow PyOpenCL to suddenly run the np.linalg code on the GPU?
What kind of re-use are you envisioning? The important kind of re-use
that comes to mind for me is that I should be able to write code that
can accept either a RAM matrix or a GPU matrix and work the same. But
the key feature to enable this is that there should be a single API
that works on both types of objects -- e.g. np.dot(a, b) should work
even if a, b are on the GPU. But this is exactly what __numpy_ufunc__
is designed to enable, and that has nothing to do with splitting
linalg off into a separate package...
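
To make that concrete, here's a rough sketch (assuming the __numpy_ufunc__
signature as currently proposed -- (self, ufunc, method, i, inputs,
**kwargs) -- and using made-up names GPUArray and launch_gpu_kernel purely
for illustration, not anything PyOpenCL actually ships):

def launch_gpu_kernel(name, inputs, **kwargs):
    # Stand-in for a real dispatch to a GPU kernel (e.g. via PyOpenCL).
    raise NotImplementedError("no GPU kernel wired up for %s" % name)

class GPUArray(object):
    """Hypothetical container for data living in GPU memory."""

    def __init__(self, gpu_buffer, shape, dtype):
        self.gpu_buffer = gpu_buffer
        self.shape = shape
        self.dtype = dtype

    def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
        # Under the proposal, numpy calls this instead of running its own
        # inner loops whenever a ufunc sees a GPUArray among its inputs:
        # `ufunc` is the ufunc object (e.g. np.add), `method` is
        # "__call__", "reduce", etc., and `i` is the position of self in
        # `inputs`.
        if method != "__call__":
            return NotImplemented
        return launch_gpu_kernel(ufunc.__name__, inputs, **kwargs)

Once dot itself is routed through the ufunc machinery, the same hook covers
np.dot(a, b) too -- no separate linalg package required.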

And of course if someone has a better idea about how to implement
lapack, then they should do that work in the numpy repo so everyone
can benefit, not go off and reimplement their own version from scratch
that no-one will use :-).

> Much of the linalg package would of
> course be implemented in c or fortran, but the interface to ndarray would
> use the well-established idea of contiguous matrices with shapes, strides,
> and a single memory store, supporting only numeric data types.

It's actually possible today for third-party users to add support for
third-party dtypes to most linalg operations, because most linalg
operations are implemented using the numpy ufunc machinery.
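
For example (a quick sketch -- _umath_linalg is a private implementation
detail, so the module name and attributes shouldn't be treated as a stable
API), det is just a generalized ufunc:

import numpy as np
from numpy.linalg import _umath_linalg

# det is a gufunc with core signature "(m,m)->()", so it broadcasts over
# stacks of matrices like any other ufunc.
print(type(_umath_linalg.det))      # -> numpy.ufunc
print(_umath_linalg.det.signature)  # -> (m,m)->()

stack = np.random.rand(5, 3, 3)
print(_umath_linalg.det(stack).shape)  # -> (5,)

A third-party dtype would plug its own inner loops into these gufuncs at
the C level via PyUFunc_RegisterLoopForType, just as for any other ufunc.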

> I suggested cffi since it provides a convenient and efficient interface to
> ndarray. Thus python could remain as a thin wrapper over the calls out to
> c-based libraries, much like lapack_lite does today, but at the python level
> rather than the capi level.
> Yes, a python-based interface would slow the code down a bit, but I would
> argue that
> 1. the current state of lapack_litemodule.c and umath_linalg.c.src, with its
> myriad of compile-time macros and complex code paths, scares people away
> from contributing to the ongoing maintenance of the library while tying the
> code very closely to the lapack routines, and

I agree that simple is better than complex, but I don't see how moving
those macros and code paths into a separate package decreases
complexity. If anything, it would increase complexity, because now we
have two repos instead of one, two release schedules instead of one,
and n^2 combinations of (linalg version, numpy version) to test
against.

-n

> 2. matrices larger than 3x3 or so should be spending most of the computation
> time in the underlying lapack/blas library regardless of whether the
> interface is python-based or capi-based.
> Matti
>
> On 10/08/2014 8:00 PM, numpy-discussion-request at scipy.org wrote:
>>
>>
>> Date: Sat, 9 Aug 2014 21:11:19 +0100
>> From: Nathaniel Smith <njs at pobox.com>
>> Subject: Re: [Numpy-discussion] NumPy-Discussion OpenBLAS and dotblas
>> To: Discussion of Numerical Python <numpy-discussion at scipy.org>
>>
>>
>> On Sat, Aug 9, 2014 at 8:35 PM, Matti Picus <matti.picus at gmail.com> wrote:
>>>
>>> Hi. I am working on numpy in pypy. It would be much more challenging for
>>> me if you merged more code into the core of numpy,
>>
>> Hi Matti,
>>
>>
>> I can definitely see how numpy changes cause trouble for you, and
>> sympathize. But, can you elaborate on what kind of changes would make
>> your life easier *that also* help make numpy proper better in their
>> own right? Because unfortunately, I don't see how we can reasonably
>> pass up on improvements to numpy if the only justification is to make
>> numpypy's life easier. (I'd also love to see pypy become usable for
>> general numerical work, but not only is it not there now, I don't see
>> how numpypy will ultimately get us there even if we do help it along
>> -- almost none of the ecosystem can get by on numpy's python-level APIs
>> alone.) But obviously if there are changes that are mutually
>> beneficial, well then, that's a lot easier to justify :-)
>>
>> -n
>
>



-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


