[Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?)

David Cournapeau cournape at gmail.com
Tue Oct 6 14:52:11 EDT 2015


On Tue, Oct 6, 2015 at 7:30 PM, Nathaniel Smith <njs at pobox.com> wrote:

> [splitting this off into a new thread]
>
> On Tue, Oct 6, 2015 at 3:00 AM, David Cournapeau <cournape at gmail.com>
> wrote:
> [...]
> > I also agree the current situation is not sustainable -- as we discussed
> > privately before, cythonizing numpy.core is made quite more complicated
> > by this. I have myself quite a few issues w/ cythonizing the other parts
> > of umath. I would also like to support the static link better than we do
> > now (do we know some static link users we can contact to validate our
> > approach?)
> >
> > Currently, what we have in numpy core is the following:
> >
> > numpy.core.multiarray -> compilation units in numpy/core/src/multiarray/
> >   + statically link npymath
> > numpy.core.umath -> compilation units in numpy/core/src/umath
> >   + statically link npymath/npysort + some shenanigans to use things in
> >   numpy.core.multiarray
>
> There are also shenanigans in the other direction - supposedly umath
> is layered "above" multiarray, but in practice there are circular
> dependencies (see e.g. np.set_numeric_ops).
>

Indeed, I am not arguing about merging umath and multiarray.


> > I would suggest to have a more layered approach, to enable both 'normal'
> > build and static build, without polluting the public namespace too much.
> > This is an approach followed by most large libraries (e.g. MKL), and is
> > fairly flexible.
> >
> > Concretely, we could start by putting more common functionalities (aka
> > the 'core' library) into its own static library. The API would be
> > considered private to numpy (no stability guaranteed outside numpy), and
> > every exported symbol from that library would be decorated appropriately
> > to avoid potential clashes (e.g. '_npy_internal_').
>
> I don't see why we need this multi-layered complexity, though.
>

For several reasons:

 - when you want to cythonize either extension, it is much easier to
separate it into Cython for the CPython API and C for the rest.
 - if numpy.core.multiarray.so is built as a Cython-based .o + a 'large' C
static library, it should become much simpler to support static linking.
 - maybe that's just personal, but I find the whole multiarray + umath
combination beyond manageable in terms of intertwined complexity. You may
argue it is not that big, and we all have different preferences in terms of
organization, but if I look at the binary size of multiarray + umath, it is
considerably larger than the median size of the .so files I have in my
/usr/lib.
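
To make the first two points a bit more concrete, here is a rough sketch of
what I have in mind. All the function names below are hypothetical (nothing
like this exists in numpy today); only the '_npy_internal_' prefix comes from
my earlier mail. The idea is that the 'core' static library is plain C with no
Python.h, every symbol it exports carries the prefix, and the extension module
(Cython or C) is a thin layer on top of it:

```c
/* Hypothetical sketch: none of these functions exist in numpy. */
#include <stddef.h>

/* --- private 'core' static library: plain C, no Python.h --- */

/* Exported from the static library, so it carries the _npy_internal_
 * prefix to avoid clashing with symbols of whatever we get linked into. */
double _npy_internal_sum_f64(const double *data, size_t n)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        total += data[i];
    }
    return total;
}

/* Helpers that never leave their compilation unit stay 'static' and
 * therefore need no prefix at all. */
static int is_empty(size_t n)
{
    return n == 0;
}

/* Entry point the extension layer would call. Errors are reported through
 * a plain int return code (0 on success, -1 on failure), so this layer
 * does not depend on the CPython API. */
int _npy_internal_mean_f64(const double *data, size_t n, double *out)
{
    if (is_empty(n)) {
        return -1;
    }
    *out = _npy_internal_sum_f64(data, n) / (double)n;
    return 0;
}
```

The extension module would then only convert Python objects to C buffers, call
into functions like these, and turn the return code into an exception. For a
static build, the same library is linked as is, and the prefix is what keeps
the namespace clean.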

I am also hoping that splitting numpy.core into separate elements that
communicate through internal APIs would make contributing to numpy
easier.

We could also turn the argument around: assuming it does not make the build
more complex, and that it does help static linking, why not do it?

David