[SciPy-Dev] Numba as a dependency for SciPy?

Charles R Harris charlesr.harris at gmail.com
Tue Mar 6 20:00:25 EST 2018


On Mon, Mar 5, 2018 at 9:06 PM, Ralf Gommers <ralf.gommers at gmail.com> wrote:

> Hi all,
>
> Goal of this email: start a discussion to decide whether we'd be okay with
> relying on Numba as a dependency, now or in 1-2 years' time.
>
> Context: in https://github.com/pydata/sparse/issues/126 a discussion is
> ongoing about whether to adopt Cython or Numba, with Numba being preferred
> by the majority. That `sparse` package is meant to provide sparse *arrays*
> that down the line should either be replacing our current sparse *matrices*
> or at least be integrated in scipy.sparse in addition to them. See
> https://github.com/scipy/scipy/issues/8162 and
> https://github.com/hameerabbasi/sparse-ndarray-protocols for more details on that.
>
> Also related is the question from Serge Guelton some weeks ago about
> whether we'd want to rely on Pythran:
> https://mail.python.org/pipermail/scipy-dev/2018-January/022325.html
>
> On that Pythran thread I commented that we'd want to take these aspects
> into account:
> - portability
> - performance
> - maturity
> - maintenance status (active devs, how quick do bugs get fixed after a
> release with an issue)
> - ease of use (@jit vs. Pythran comments vs. translate to .pyx syntax)
> - size of generated binaries
> - templating support for multiple dtypes
> - debugging and optimization experience/tools
>
> Debugging is one aspect where I'd say Numba is still worse than
> Cython; however, that's being resolved as we speak:
> https://github.com/numba/numba/issues/2788
>
> One thing I missed in the above list is dependencies: while our use of
> Cython only adds a build-time dependency, Numba would add a run-time
> dependency. Given that binary wheels and conda packages for all major
> platforms are available, that's not a showstopper, but it matters.
>
> Overall I'd say that:
> - Numba is better than Cython at: performance, ease of use, size of
> generated binaries, and templating support for multiple dtypes. Possibly
> also maintenance status right now.
> - Numba and Cython are about equally good at portability (I think, not
> much data about exotic platforms for Numba).
> - Cython is better than Numba at: maturity, debugging (but not for long
> anymore probably), dependencies.
>
> I'm usually pretty conservative in these things, but considering the above
> I'm leaning towards saying use of Numba should be allowed in the future.
> The added run-time dependency is the one major downside that's going to
> stay; however, compared to our Fortran headaches, that's a relatively small
> issue.
>

I like the idea of using Numba, but remain a bit skeptical about the
dependencies and long-term maintenance. I suppose the same could have been
said about NumPy and SciPy ten years ago; the continued maintenance and
availability of both was not a foregone conclusion. It is probably best to
wait a bit to see how things shake out, but I'm not opposed to the use of
either Pythran or Numba on technical grounds. There have been other such
attempts: weave and that other tensor code -- I forget the name -- were
both present in early releases and have since disappeared.

Chuck