[SciPy-Dev] Boost for stats

Evgeni Burovski evgeny.burovskiy at gmail.com
Sat Feb 13 14:32:03 EST 2021


Hi,

Borrowing from Boost.Math sounds great indeed. (Even better if the
Boost devs see it as advantageous, too.)
There is really no reason to keep using parts of e.g. cdflib which are
superseded by Boost.Math.

However, playing devil's advocate somewhat:

- does the SciPy PR need the whole of Boost.Math? If it only needs a
select subset (e.g., do we need root-finding etc.?), then maybe the
size can be reduced.
- do we need it for all types? E.g., ufunc loops only need a select
subset of types.
- if we do go the route of taking parts / applying SciPy-specific
patches, which is easier to do or better maintenance-wise: vendoring
the original code + patches, or doing the work once by porting the
relevant parts to a standalone C or C++ subset?

Obviously, all of these should be weighed against the other implications
of adding a dependency. The immediate concerns are distribution size and
build times.

Cheers,

Evgeni

On Sat, Feb 13, 2021 at 4:06 PM Hans Dembinski <hans.dembinski at gmail.com> wrote:
>
> Hi Nicholas,
>
> as a Boost developer (I wrote Boost.Histogram and contributed to several other Boost libs), I think it would be great to build SciPy on Boost.Math, it is a win-win.
>
> > On 12. Feb 2021, at 22:50, Nicholas McKibben <nicholas.bgp at gmail.com> wrote:
> >
> > The initial PR includes the zipped Boost headers only (~24MB zipped), but adding Boost as a submodule might be a more maintainable approach if changes to Boost need to be made in the future.
>
> Including it as a submodule seems like a good approach.
>
> > Inclusion of the entire Boost library is a virtual necessity for the Boost.Math module. Manual attempts to strip away unnecessary files and bcp (Boost's utility to provide stripped down installations) fail to create smaller sizes.
>
> I was a bit shocked to hear this, but you are right:
> https://pdimov.github.io/boostdep-report/master/math.html
> Math depends on everything.
>
> We have a long-term goal to reduce the coupling between Boost libs, but this also incurs costs. Library maintainers then have to copy the relevant bits from other Boost libraries to not depend on them, which is actually a terrible idea: you lose the synergies offered by a rich shared code base. In my view, the coupling is not a bug, it is a feature.
>
> It is impressive to see how you use generators to create the binding code in Cython. I had a lot of trouble with Cython as it does not support all C++ features. The best way to wrap (modern) C++ is pybind11, which is a painless experience. It does the code generation at compile time with template metaprogramming (TMP).
>
> Best regards,
> Hans