[SciPy-Dev] Boost for stats

Ralf Gommers ralf.gommers at gmail.com
Mon Feb 22 04:35:36 EST 2021


On Mon, Feb 22, 2021 at 5:36 AM Nicholas McKibben <nicholas.bgp at gmail.com>
wrote:

> Local testing of inclusion of Boost as a submodule has revealed some
> undesirable side effects:
> - all sources, documentation, etc. regardless of relevance to SciPy must
> be fetched
> - recursive submodule initialization can take quite a while (~10 minutes
> on my machine and internet connection)
> - lots of churn when running commands like `git status`
>

Ouch, that's a lot slower than I expected. I'm not sure I understand it,
though: there should be no `git status` churn at all (unless the build
process modifies files in place?), and a plain clone of the Boost repo is
faster than cloning our own repo:

$ time git clone git@github.com:boostorg/boost.git
Cloning into 'boost'...
remote: Enumerating objects: 15, done.
remote: Counting objects: 100% (15/15), done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 254626 (delta 8), reused 11 (delta 4), pack-reused 254611
Receiving objects: 100% (254626/254626), 62.02 MiB | 7.47 MiB/s, done.
Resolving deltas: 100% (163071/163071), done.

real 0m12.221s
user 0m5.959s
sys 0m2.725s


$ time git clone git@github.com:scipy/scipy.git
Cloning into 'scipy'...
remote: Enumerating objects: 178585, done.
remote: Total 178585 (delta 0), reused 0 (delta 0), pack-reused 178585
Receiving objects: 100% (178585/178585), 104.61 MiB | 6.56 MiB/s, done.
Resolving deltas: 100% (137836/137836), done.

real 0m21.492s
user 0m9.620s
sys 0m3.231s


What should I test to reproduce the problem?
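
One possible starting point, as a sketch only: the plain clone above does not
initialize any submodules, so if the slowdown comes from the recursive
submodule initialization mentioned above, timing that step separately may be
closer to what you ran. The helper names `time_step` and
`reproduce_boost_fetch` are made up for this example, and the HTTPS URL is
assumed to be equivalent to the SSH one used above.

```shell
# time_step LABEL CMD [ARGS...]: run CMD, then print "LABEL: <N>s"
# (wall-clock seconds). Hypothetical helper, not part of git or SciPy.
time_step() {
    label=$1
    shift
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "$label: $((end - start))s"
    # Pass the command's exit status through to the caller.
    return "$status"
}

# Time the top-level clone and the recursive submodule step separately.
# Nothing is downloaded until this function is actually called.
reproduce_boost_fetch() {
    dest=${1:-boost}
    time_step "clone" git clone https://github.com/boostorg/boost.git "$dest" || return 1
    time_step "submodules" git -C "$dest" submodule update --init --recursive
}
```

Calling `reproduce_boost_fetch /tmp/boost-test` would then report the two
phases separately, which should show whether the top-level clone or the
submodule fetch dominates.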

Cheers,
Ralf


> Of course we will also need to see how this impacts the CI pipelines.
> This extra overhead may initially cause some timeouts.  Another option that
> will alleviate some of these pains is to create a header-only repo similar
> to this one: https://github.com/povilasb/boost-header-only.  It could
> live in the SciPy GitHub account and would be easy to update -- simply
> download the Boost tarball release and copy over the include directory only
> (or build a specific commit locally and do the same thing).  It is more
> maintenance than simply checking out the latest tagged release of Boost and
> updating the submodules (it adds an extra step of updating the header-only
> repo), but it minimizes space and bandwidth usage.  Thoughts?
>
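
As a rough sketch of the "copy over the include directory only" step
described above, the helper below unpacks just the `boost/` header directory
from a release tarball. `extract_boost_headers` is a hypothetical name, and
both the `boost_X_Y_Z/boost/` archive layout and a `tar` that supports
`--strip-components` are assumptions, not something confirmed in this thread.

```shell
# extract_boost_headers TARBALL DEST: unpack only the top-level boost/
# header directory from a gzip-compressed Boost tarball into DEST/boost.
# Hypothetical helper; assumes the archive has a single top-level directory
# (for example boost_X_Y_Z) that contains boost/.
extract_boost_headers() {
    tarball=$1
    dest=$2
    # Name of the archive's top-level directory, taken from its first entry.
    top=$(tar -tzf "$tarball" | awk -F/ 'NR == 1 { print $1 }')
    [ -n "$top" ] || return 1
    mkdir -p "$dest"
    # Extract only <top>/boost and drop the <top>/ prefix from the paths.
    tar -xzf "$tarball" -C "$dest" --strip-components=1 "$top/boost"
}
```

The contents of `DEST/boost` could then be committed to the header-only repo,
leaving documentation, tests and library sources out of it.
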
> On Thu, Feb 18, 2021 at 3:01 AM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>>
>>
>> On Thu, Feb 18, 2021 at 3:45 AM Nicholas McKibben <nicholas.bgp at gmail.com>
>> wrote:
>>
>>>
>>> > Probably good to make sure that aarch64 build times remain relatively
>>> stable
>>>
>>> Good point!  Can this be checked via a PR to scipy-wheels?
>>>
>>
>> Don't worry about this one: if the compile-time increase on other
>> platforms is minor, it'll be fine for aarch64 too. We have limited TravisCI
>> credits (the actual status of that is a little unclear), so no need to burn
>> them on this. We should be moving CI providers for aarch64 at some point
>> anyway, probably to https://www.drone.io/
>>
>> Cheers,
>> Ralf
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>