From ilhanpolat at gmail.com Tue Feb 2 17:59:46 2021 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Tue, 2 Feb 2021 23:59:46 +0100 Subject: [SciPy-Dev] Yes, we don't want any scipy modules BUT Message-ID: Hi everyone, This is an oddball of a subject, so bear with me for a paragraph. Currently we have lots of control-related functions in scipy.signal of varying production grade: some are there almost just as placeholders, some are pretty good. However, many things don't come out of the box, such as MIMO support, internal delay representations, time and Bode plotting (properly spaced and reasonably dense), and so on. Now of course we have the python-control and (shameless plug) harold packages, which can do some of these and fail at others. Frankly, in my particular case scipy is eating all my OSS time, and python-control has its own roadmap. I provide lots of MIMO stuff but lack the academic catalogue functions like root locus and other academic torture tools, while python-control is mostly lacking MIMO support and a bit short on advanced stuff. In the meantime, there is a very nice Fortran library, SLICOT, which also powers some MATLAB functions in production; however, it was not open source. But they moved to GitHub recently and released version 5.7 under BSD3. Previously 5.0 was released under GPL, and that was the one python-control vendored, but 5.7 is already pretty capable and is actually what caused me to write this up. This library is quite diverse and written by very high-caliber researchers. The reason I always avoided it was obviously the GPL, but apparently they changed their mind, which is personally fantastic news for me. So, coming back to the meat of this discussion: I have looked at the LTI parts very closely, and I don't see any way to overhaul them without extremely painful deprecation cycles and breakage. But I sincerely believe that, just as with PocketFFT, scipy can offer better-quality LTI tools.
In its current state it's a bit academic-ish and not production-ready. So this brings us to three concrete options: 1- Status quo: I don't like touching that many funcs and waking the sleeping dogs. 2- Whatever we do, we do it on the current functions: it doesn't matter if it takes 4 years; we don't want any adventures. 3- Make a new module and lighten up the signal module, which was probably not exactly the right place. Please be as blunt as possible, no hard feelings, but I think this discussion has to happen at least once, and maybe once and for all. A tiny bit of it already happened last year in https://github.com/scipy/scipy/pull/4515 but it barely scratched the surface. Cheers, ilhan Current catalogue https://docs.scipy.org/doc/scipy/reference/signal.html python-control vendored version https://github.com/python-control/Slycot New BSD3 version https://github.com/SLICOT/SLICOT-Reference -------------- next part -------------- An HTML attachment was scrubbed... URL: From bussonniermatthias at gmail.com Thu Feb 4 14:40:41 2021 From: bussonniermatthias at gmail.com (Matthias Bussonnier) Date: Thu, 4 Feb 2021 11:40:41 -0800 Subject: [SciPy-Dev] Messages stuck in moderation ? Message-ID: Hello everyone, I am working with a student (Arthur Volant) on his first contribution to SciPy. It looks like his emails are not reaching the mailing list (they do not currently appear in the archive[0]). Does anybody have access to the moderation panel to see if they are stuck in moderation, or if something else is rejecting them? Or is there a problem on mailman which does not display them, but subscribers to the mailing list still received them? Below is attached a copy of the messages sent. Thanks, -- Matthias from: arthurvolant at gmail.com Hello Scipy-dev, I have been working for a couple of days now on the following PR[1]. The origin of this PR is this issue[2], asking to add the Barnard and Boschloo tests, which are two exact statistical tests.
While working on it, I found that Fisher's exact test was already implemented. Barnard and Boschloo are two tests more powerful than Fisher's. My `barnard_exact` implementation is so far working well. It is a little bit slower than Fisher's exact test, but not by much, with an average execution time of 1.12 ms. I was wondering, though, where to put my code. It seems that there are two possible files: either `scipy/stats/_hypotests.py` or `scipy/stats/contingency.py`, which already contains the `chi2_contingency` function. What would you advise me to do? I thank you for your time and answers, Arthur 0: https://mail.python.org/pipermail/scipy-dev/2021-February/date.html#start 1: https://github.com/scipy/scipy/pull/13441 2: https://github.com/scipy/scipy/issues/11014 From stefanv at berkeley.edu Thu Feb 4 19:23:11 2021 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Thu, 04 Feb 2021 16:23:11 -0800 Subject: [SciPy-Dev] ENH: improve RBF interpolation In-Reply-To: References: Message-ID: <274ab45e-a9cc-4710-a40e-65b6f96c0b4a@www.fastmail.com> Hi Trever, On Fri, Jan 29, 2021, at 05:13, Trever Hines wrote: > I would like to contribute code to scipy to address some issues regarding RBF interpolation. The code can be found on my branch _here_ . My contribution consists of two new classes for scattered N-D interpolation: This is fantastic; thank you so much for sharing your expertise on RBFs! > 1. `RBFInterpolator`: This is intended to be a replacement for `Rbf` that addresses issues mentioned in 9904 and 4790 . Namely, the major differences with `Rbf` are 1) the usage is similar to `NearestNDInterpolator` and `LinearNDInterpolator`, making it easier to swap out different interpolation methods, 2) the sign of the smoothing parameter is correct (see page 10 of these lecture notes ), and 3) the interpolant includes polynomial terms. > For some RBF choices (values of 'linear', 'thin_plate', 'cubic', 'quintic', or 'multiquadric'
for `function` in `Rbf`), the additional polynomial terms are needed to ensure that the interpolation problem is well-posed (see theorem 3.2.7 in this document ). Without the additional polynomial terms for these RBFs, I have noticed that some values for the smoothing parameter (with the corrected sign) result in an obviously erroneous interpolant. Even when the chosen RBF does not require additional polynomial terms, they can still improve the quality of the interpolant. In particular, the polynomial terms are able to accommodate shifts or linear trends in the data, which the RBFs tend to struggle with by themselves. Is there any advantage to keeping the old interface, or should this eventually replace Rbf entirely? > 1. `KNearestRBFInterpolator`: This class performs RBF interpolation using only the k nearest data points to each interpolation point (which was suggested in 5180 ). This class is useful when there are too many observations for `RBFInterpolator` (on the order of tens of thousands) and you want an interpolant that *looks* smoother than what you get with `NearestNDInterpolator` or `LinearNDInterpolator`. > My concern with interpolation using the k nearest neighbors is that it is a bit of an ad hoc strategy to work around computational limitations. That being said, I have seen a similar strategy used in the Kriging world (Kriging is a form of RBF interpolation). Superb! I've been trying to do RBF interpolation with N>1000, and the N^2 memory requirement gets you pretty quickly. This makes the algorithm much more pragmatic to apply to, e.g., images. We can always add other picking strategies later on. > I would appreciate your feedback on whether you think these would be valuable contributions to scipy. If so, I will make the pull request after adding benchmarks, unit tests, and more docs. I'd say 100% yes; thank you again. Best regards, Stéfan -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ralf.gommers at gmail.com Fri Feb 5 07:05:51 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 5 Feb 2021 13:05:51 +0100 Subject: [SciPy-Dev] Messages stuck in moderation ? In-Reply-To: References: Message-ID: On Thu, Feb 4, 2021 at 8:41 PM Matthias Bussonnier < bussonniermatthias at gmail.com> wrote: > Hello everyone, > > I am working with a student (Arthur Volant) on his first contribution to > SciPy, > It looks like his mails are not reaching the mailing list (they do not > currently appear in archive[0]). > > Does anybody have access to the moderation panel to see if they are > stuck in moderation or if something else is rejecting them ? Or is > there a problem on mailman which does not display them, but > subscribers to the mailing list still did receive them ? > Hey Matthias, no one on this list has admin access. Best would be to submit an issue through the email address at the bottom of https://mail.python.org/mailman/listinfo/scipy-dev. > Below is attached a copy of the messages sent. > > Thanks, > -- > Matthias > > > from: arthurvolant at gmail.com > Hello Scipy-dev, > > I have been working for a couple of days now on the following PR[1]. > The origin of this PR is this issue[2], asking to add Barnard and > Boschloo test, which are two exact statistical tests. > While working on it, I found that Fisher's exact test was already > implemented. > Barnard and Boschloo are two tests more powerful than Fisher's one. > My `barnard_exact` implementation is so far working well. It is a > little bit slower than Fisher exact test, but not that much, with an > average execution time of 1.12 ms > Both of these tests seem like a good idea to add. Feedback on the issue was positive, and the papers have ~350 and ~150 citations, respectively. So that should be good enough. > I was wondering though where to put my codes.
It seems that there are > two possible files : > either in `scipy/stats/_hypotests.py` or either in > `scipy/stats/contingency.py` which contains already `chi2_contingency` > function. What would you advise me to do? > The PR as it is, adding it to _hypotests.py, seems fine to me. Thanks for working on this Arthur! Cheers, Ralf > I thank you for your time and answers, > Arthur > > 0: > https://mail.python.org/pipermail/scipy-dev/2021-February/date.html#start > 1: https://github.com/scipy/scipy/pull/13441 > 2: https://github.com/scipy/scipy/issues/11014 > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Feb 6 12:12:53 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 6 Feb 2021 18:12:53 +0100 Subject: [SciPy-Dev] SLICOT, LTI functionality and scipy.signal (WAS: Yes, we don't want any scipy modules BUT) In-Reply-To: References: Message-ID: On Wed, Feb 3, 2021 at 12:00 AM Ilhan Polat wrote: > Hi everyone, > > This is an odd ball of an subject so bear with me for a paragraph. > Currently we have lots of control related functions on scipy.signal with > varying production grade some are there almost just as a placeholder some > are pretty good. However, many things don't come with the box such as MIMO > support, internal delay representations, time and bode plotting (properly > spaced and considerably dense) and so on. Now of course we have > python-control and (shameless plug) harold packages that can do some and > fail to do others. Frankly in my particular case scipy is eating all my OSS > time. And python-control has their own roadmap. 
I provide lots of MIMO > stuff but lacking the academic catalogue functions like root-locus and > other academic torture tools and python-control is mostly lacking MIMO > support and a bit short of advanced stuff. > > In the mean time, there is a very nice Fortran library SLICOT which also > powers some matlab functions in production however it is not open source. > But they moved to GitHub recently and released its earlier version 5.7 > under BSD3. Previously 5.0 was released under GPL and that was the one > python-control vendored but 5.7 is already pretty capable and actually > caused me to write this up. This library is quite diverse and written by > very very high caliber researchers. The reason why I always avoided was > obviously GPL but apparently they changed their mind which is personally > fantastic news for me. > That's very interesting. python-control itself is BSD-3, but the recommended optional dependency slycot is indeed GPL v2. So coming back to the meat of this discussion: I have looked at the LTI > parts and very very closely and I don't see any way to overhaul them > without extremely painful deprecation cycles and breakage. But I sincerely > believe that together with PocketFFT scipy can serve a better quality LTI > tools. In its current state it's a bit academic-ish and not production > ready. So this brings us to three concrete options > I agree. The scipy.signal module is of varying quality, and the LTI parts are indeed not great. > 1- Status quo : I don't like touching that many funcs and waking the > sleeping dogs > 2- Whatever we do we do it on the current functions: It doesn't matter if > it takes 4 years, we don't want any adventures > 3- Make a new module and lighten up the signal module which was probably > not exactly the right place. > > Please make it as blunt as possible, no hard feelings but I think this > discussion has to be done at least once and maybe for all. 
A tiny bit of it > has already happened last year in https://github.com/scipy/scipy/pull/4515 > but it barely grazed. > On reflection, creating a new scipy.control module worries me a little. What Eric Quintero said on gh-4515 is probably true: *"I'm a user and a big fan of python-control, but I don't think I'm quite on board with merging it into scipy. The scope of capabilities that users may expect from a controls package is a little bigger than what I imagine for scipy submodules. I think there is an advantage to be had to being a standalone module that can set its own schedule, deprecation policy, etc."* Control theory is a little specialized/niche compared to most of the topics covered by other SciPy submodules, and the combination of that domain-specific knowledge plus a large amount of Fortran code is not very appealing. I think my ideal outcome here would be that python-control and harold merge, and we recommend that one well-designed/maintained package to users over and above the LTI and filter design functionality in scipy.signal. We can't deprecate scipy.signal, because it's too widely used. But it could have a similar relation to the python-control-harold package as scipy.cluster has to scikit-learn: we offer the basics, and for higher-performance or more state-of-the-art stuff, go elsewhere. Cheers, Ralf > Cheers, > ilhan > > Current catalogue > https://docs.scipy.org/doc/scipy/reference/signal.html > > python-control vendored version > https://github.com/python-control/Slycot > > New BSD3 version > https://github.com/SLICOT/SLICOT-Reference > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ilhanpolat at gmail.com Sat Feb 6 12:42:26 2021 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Sat, 6 Feb 2021 18:42:26 +0100 Subject: [SciPy-Dev] SLICOT, LTI functionality and scipy.signal (WAS: Yes, we don't want any scipy modules BUT) In-Reply-To: References: Message-ID: I think python-control doesn't need harold at all and is in pretty good shape by itself. So we can start pointing to them anyway. However, I don't know if it is just bad luck, but very recent issues like https://github.com/scipy/scipy/issues/13496 https://github.com/scipy/scipy/issues/13498 and many more in the backlog are a bit worrisome to me. But then again, regardless of the issues, I don't mind any of these outcomes from this discussion. On Sat, Feb 6, 2021 at 6:13 PM Ralf Gommers wrote: > > > On Wed, Feb 3, 2021 at 12:00 AM Ilhan Polat wrote: > >> Hi everyone, >> >> This is an odd ball of an subject so bear with me for a paragraph. >> Currently we have lots of control related functions on scipy.signal with >> varying production grade some are there almost just as a placeholder some >> are pretty good. However, many things don't come with the box such as MIMO >> support, internal delay representations, time and bode plotting (properly >> spaced and considerably dense) and so on. Now of course we have >> python-control and (shameless plug) harold packages that can do some and >> fail to do others. Frankly in my particular case scipy is eating all my OSS >> time. And python-control has their own roadmap. I provide lots of MIMO >> stuff but lacking the academic catalogue functions like root-locus and >> other academic torture tools and python-control is mostly lacking MIMO >> support and a bit short of advanced stuff. >> >> In the mean time, there is a very nice Fortran library SLICOT which also >> powers some matlab functions in production however it is not open source. >> But they moved to GitHub recently and released its earlier version 5.7 >> under BSD3.
Previously 5.0 was released under GPL and that was the one >> python-control vendored but 5.7 is already pretty capable and actually >> caused me to write this up. This library is quite diverse and written by >> very very high caliber researchers. The reason why I always avoided was >> obviously GPL but apparently they changed their mind which is personally >> fantastic news for me. >> > > That's very interesting. python-control itself is BSD-3, but the > recommended optional dependency slycot is indeed GPL v2. > > So coming back to the meat of this discussion: I have looked at the LTI >> parts and very very closely and I don't see any way to overhaul them >> without extremely painful deprecation cycles and breakage. But I sincerely >> believe that together with PocketFFT scipy can serve a better quality LTI >> tools. In its current state it's a bit academic-ish and not production >> ready. So this brings us to three concrete options >> > > I agree. The scipy.signal module is of varying quality, and the LTI parts > are indeed not great. > > >> 1- Status quo : I don't like touching that many funcs and waking the >> sleeping dogs >> 2- Whatever we do we do it on the current functions: It doesn't matter if >> it takes 4 years, we don't want any adventures >> 3- Make a new module and lighten up the signal module which was probably >> not exactly the right place. >> >> Please make it as blunt as possible, no hard feelings but I think this >> discussion has to be done at least once and maybe for all. A tiny bit of it >> has already happened last year in >> https://github.com/scipy/scipy/pull/4515 but it barely grazed. >> > > On reflection, creating a new scipy.control module worries me a little. > What Eric Quintero said on gh-4515 is probably true: > > *"I'm a user and a big fan of python-control, but I don't think I'm quite > on board with merging it into scipy. 
The scope of capabilities that users > may expect from a controls package is a little bigger than what I imagine > for scipy submodules. I think there is an advantage to be had to being a > standalone module that can set its own schedule, deprecation policy, etc."* > > Control theory is a little specialized/niche compared to most of the > topics covered by other SciPy submodules, and the combination of that > domain-specific knowledge plus a large amount of Fortran code is not very > appealing. > > I think my ideal outcome here would be that python-control and harold > merge, and we recommend that one well-designed/maintained package to users > over and above the LTI and filter design functionality in scipy.signal. We > can't deprecate scipy.signal, because it's too widely used. But it could > have a similar relation to the python-control-harold package as > scipy.cluster has to scikit-learn: we offer the basics, and for > higher-performance or more state-of-the-art stuff, go elsewhere. > > Cheers, > Ralf > > > >> Cheers, >> ilhan >> >> Current catalogue >> https://docs.scipy.org/doc/scipy/reference/signal.html >> >> python-control vendored version >> https://github.com/python-control/Slycot >> >> New BSD3 version >> https://github.com/SLICOT/SLICOT-Reference >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Feb 7 16:23:04 2021 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 7 Feb 2021 14:23:04 -0700 Subject: [SciPy-Dev] NumPy 1.20.1 released. Message-ID: Hi All, On behalf of the NumPy team I am pleased to announce the release of NumPy 1.20.1. 
NumPy 1.20.1 is a rapid bugfix release fixing several bugs and regressions reported after the 1.20.0 release. The Python versions supported for this release are 3.7-3.9. Wheels can be downloaded from PyPI; source archives, release notes, and wheel hashes are available on GitHub. Linux users will need pip >= 19.3 in order to install manylinux2010 and manylinux2014 wheels. *Highlights* - The distutils bug that caused problems with downstream projects is fixed. - The ``random.shuffle`` regression is fixed. *Contributors* A total of 8 people contributed to this release. People with a "+" by their names contributed a patch for the first time. - Bas van Beek - Charles Harris - Nicholas McKibben + - Pearu Peterson - Ralf Gommers - Sebastian Berg - Tyler Reddy - @Aerysv + *Pull requests merged* A total of 15 pull requests were merged for this release. - gh-18306: MAINT: Add missing placeholder annotations - gh-18310: BUG: Fix typo in ``numpy.__init__.py`` - gh-18326: BUG: don't mutate list of fake libraries while iterating over... - gh-18327: MAINT: gracefully shuffle memoryviews - gh-18328: BUG: Use C linkage for random distributions - gh-18336: CI: fix when GitHub Actions builds trigger, and allow ci skips - gh-18337: BUG: Allow unmodified use of isclose, allclose, etc. with timedelta - gh-18345: BUG: Allow pickling all relevant DType types/classes - gh-18351: BUG: Fix missing signed_char dependency. Closes #18335. - gh-18352: DOC: Change license date 2020 -> 2021 - gh-18353: CI: CircleCI seems to occasionally time out, increase the limit - gh-18354: BUG: Fix f2py bugs when wrapping F90 subroutines. - gh-18356: MAINT: crackfortran regex simplify - gh-18357: BUG: threads.h existence test requires GLIBC > 2.12. - gh-18359: REL: Prepare for the NumPy 1.20.1 release. Cheers, Charles Harris -------------- next part -------------- An HTML attachment was scrubbed...
URL: From nicholas.bgp at gmail.com Fri Feb 12 16:50:22 2021 From: nicholas.bgp at gmail.com (Nicholas McKibben) Date: Fri, 12 Feb 2021 14:50:22 -0700 Subject: [SciPy-Dev] Boost for stats Message-ID: Hi all, Many stats distributions in SciPy have outstanding issues with difficult solutions in legacy code. We've been working on replacing existing statistical distributions with those found in Boost.Math. The initial implementation resolves almost a dozen issues for scipy.stats, with the potential to resolve several more in scipy.stats and scipy.special in future PRs. Initial PR: https://github.com/scipy/scipy/pull/1332 This PR includes the ability to easily add Boost functionality through generated ufuncs. Boost is a large library and would incur the cost of one of the following: - an additional dependency (e.g., boostinator https://github.com/mckib2/boostinator) that outsources the packaging of the Boost libraries - the inclusion of Boost within SciPy, either as a "clone and own" or a submodule The initial PR includes the zipped Boost headers only (~24MB zipped), but adding Boost as a submodule might be a more maintainable approach if changes to Boost need to be made in the future. Inclusion of the entire Boost library is a virtual necessity for the Boost.Math module. Manual attempts to strip away unnecessary files and bcp (Boost's utility to provide stripped-down installations) fail to create smaller sizes. The increase in size would be similar to the following: SciPy master repo: ~177 MB; Boost branch: ~221 MB; built: ~939 MB; built with Boost: ~1090 MB. Wheel size should not be significantly impacted because Boost is used as a header-only library. I have no relationship with the Boost libraries other than as a user and bug reporter. I find them to be impressive and well-maintained, with tremendous support from both industry and open-source developers. SciPy would benefit from the efficient, well-tested and maintained implementations of stats and special algorithms.
Thanks, Nicholas -------------- next part -------------- An HTML attachment was scrubbed... URL: From hans.dembinski at gmail.com Sat Feb 13 08:06:17 2021 From: hans.dembinski at gmail.com (Hans Dembinski) Date: Sat, 13 Feb 2021 14:06:17 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Hi Nicholas, as a Boost developer (I wrote Boost.Histogram and contributed to several other Boost libs), I think it would be great to build SciPy on Boost.Math; it is a win-win. > On 12. Feb 2021, at 22:50, Nicholas McKibben wrote: > > The initial PR includes the zipped Boost headers only (~24MB zipped), but adding Boost as a submodule might be a more maintainable approach if changes to Boost need to be made in the future. Including it as a submodule seems like a good approach. > Inclusion of the entire Boost library is a virtual necessity for the Boost.Math module. Manual attempts to strip away unnecessary files and bcp (Boost's utility to provide stripped down installations) fail to create smaller sizes. I was a bit shocked to hear this, but you are right: https://pdimov.github.io/boostdep-report/master/math.html Math depends on everything. We have a long-term goal to reduce the coupling between Boost libs, but this also incurs costs. Library maintainers then have to copy the relevant bits from other Boost libraries to not depend on them, which is actually a terrible idea: you lose the synergies offered by a rich shared code base. In my view, the coupling is not a bug, it is a feature. It is impressive to see how you use generators to create the binding code in Cython. I had a lot of trouble with Cython, as it does not support all C++ features. The best way to wrap (modern) C++ is pybind11, which is a painless experience. It does the code generation at compile time with TMP.
Best regards, Hans From andrea.cortis at gmail.com Sat Feb 13 11:45:01 2021 From: andrea.cortis at gmail.com (Andrea Cortis) Date: Sat, 13 Feb 2021 10:45:01 -0600 Subject: [SciPy-Dev] Hyper-dual numbers Message-ID: Hello, first time here. I was wondering if there are plans for the definition of a 'hyper-dual' type in numpy. I think that would be most useful for neural nets training, and optimization in general. Andrea Sent from my iPad From evgeny.burovskiy at gmail.com Sat Feb 13 13:18:59 2021 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 13 Feb 2021 21:18:59 +0300 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: Hi Andrea, welcome! Since you're asking about numpy, you likely want the numpy-discussion mailing list. (the overlap is non-zero, but nevertheless). Out of curiosity, what's a hyper-dual type? Cheers, Evgeni On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis wrote: > > Hello, first time here. I was wondering if there are plans for the definition of a 'hyper-dual' type in numpy. I think that would be most useful for neural nets training, and optimization in general. > > Andrea > > Sent from my iPad > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev From jfoxrabinovitz at gmail.com Sat Feb 13 13:46:15 2021 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Sat, 13 Feb 2021 13:46:15 -0500 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: I directed Andrea here from Stack Overflow ( https://stackoverflow.com/q/66179855/2988730). Based on the Wikipedia article (https://en.m.wikipedia.org/wiki/Dual_number), it seems like scipy is a much more likely place to look than numpy. - Joe On Sat, Feb 13, 2021, 13:19 Evgeni Burovski wrote: > Hi Andrea, welcome! > > Since you're asking about numpy, you likely want the numpy-discussion > mailing list.
(the overlap is non-zero, but nevertheless). > > Out of curiosity, what's a hyper-dual type? > > Cheers, > > Evgeni > > On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis > wrote: > > > > Hello, first time here. I was wondering if there are plans for the > definition of a 'hyper-dual' type in numpy. I think that would be most > useful for neural nets training, and optimization in general. > > > > Andrea > > > > Sent from my iPad > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at python.org > > https://mail.python.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Sat Feb 13 14:20:34 2021 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 13 Feb 2021 22:20:34 +0300 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: Ah, these. Suspected it, but wanted to make sure. IMO, these are best implemented as a numpy dtype. I'm biased though --- here's a branch which makes a start, based on Mike Boyle's version of quaternion dtype (Mike and other authors, if you're reading this --- thanks a ton!) https://github.com/ev-br/quaternion/tree/dual Now, I don't think scipy should carry around additional numpy dtypes. Cannot speak for the numpy project, but I strongly suspect this is best implemented as a separate repository/project. Cheers, Evgeni On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz wrote: > > I directed Andrea here from Stack Overflow (https://stackoverflow.com/q/66179855/2988730). Based on the Wikipedia article (https://en.m.wikipedia.org/wiki/Dual_number), it seems like scipy is a much more likely place to look than numpy. > > - Joe > > > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski wrote: >> >> Hi Andrea, welcome!
>> >> Since you're asking about numpy, you likely want the numpy-discussion >> mailing list. (the overlap is non-zero, but nevertheless). >> >> Out of curiosity, what's a hyper-dual type? >> >> Cheers, >> >> Evgeni >> >> On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis wrote: >> > >> > Hello, first time here. I was wondering if there are plans for the definition of a 'hyper-dual' type in numpy. I think that would be most useful for neural nets training, and optimization in general. >> > >> > Andrea >> > >> > Sent from my iPad >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at python.org >> > https://mail.python.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev From evgeny.burovskiy at gmail.com Sat Feb 13 14:32:03 2021 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 13 Feb 2021 22:32:03 +0300 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Hi, Borrowing from Boost.Math sounds great indeed. (Great if the Boost devs find it advantageous, too.) There is really no reason to keep using parts of e.g. cdflib which are superseded by Boost.Math. However, playing devil's advocate somewhat: - does the scipy PR need the whole Boost.Math? If it only needs a select subset (e.g., do we need root-finding etc.?), then maybe the size can be reduced. - do we need the whole thing? e.g. ufunc loops only need a select subset of types. - if we do go this route of taking parts / applying scipy-specific patches, what is easier to do or better maintenance-wise: vendor original code + patches, or do the work once by porting relevant parts to a standalone C or C++ subset?
Obviously, all these should be weighed against the other implications of adding a dependency. The immediate concerns are distribution size and build times. Cheers, Evgeni On Sat, Feb 13, 2021 at 4:06 PM Hans Dembinski wrote: > > Hi Nicholas, > > as a Boost developer (I wrote Boost.Histogram and contributed to several other Boost libs), I think it would be great to build SciPy on Boost.Math, it is a win-win. > > > On 12. Feb 2021, at 22:50, Nicholas McKibben wrote: > > > > The initial PR includes the zipped Boost headers only (~24MB zipped), but adding Boost as a submodule might be a more maintainable approach if changes to Boost need to be made in the future. > > Including it as a submodule seems like a good approach. > > > Inclusion of the entire Boost library is a virtual necessity for the Boost.Math module. Manual attempts to strip away unnecessary files and bcp (Boost's utility to provide stripped down installations) fail to create smaller sizes. > > I was a bit shocked to hear this, but you are right: > https://pdimov.github.io/boostdep-report/master/math.html > Math depends on everything. > > We have a long-term goal to reduce the coupling between Boost libs, but this also incurs costs. Library maintainers then have to copy the relevant bits from other Boost libraries to not depend on them, which is actually a terrible idea: you lose the synergies offered by a rich shared code base. In my view, the coupling is not a bug, it is a feature. > > It is impressive to see how you use generators to create the binding code in Cython. I had a lot of trouble with Cython as it does not support all C++ features. The best way to wrap (modern) C++ is pybind11, which is a painless experience. It does the code generation at compile-time with TMP.
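For a sense of what the Boost.Math/cdflib kernels under discussion actually compute: much of this territory reduces to a few scalar special functions, such as the regularized incomplete beta function, which underlies the beta, binomial, Student's t and F distribution CDFs. Below is an illustrative pure-Python sketch using the classic continued-fraction evaluation (a textbook version, not the Boost or SciPy implementation):

```python
from math import exp, lgamma, log


def _betacf(a, b, x, max_iter=200, tol=3e-14):
    """Continued fraction for the incomplete beta (modified Lentz method)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < 1e-30:
        d = 1e-30
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered and odd-numbered continued-fraction coefficients.
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            if abs(d) < 1e-30:
                d = 1e-30
            c = 1.0 + aa / c
            if abs(c) < 1e-30:
                c = 1e-30
            d = 1.0 / d
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < tol:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    # Prefactor x**a * (1-x)**b / (a * B(a, b)), computed via log-gamma.
    ln_front = (lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    front = exp(ln_front)
    if x < (a + 1.0) / (a + b + 2.0):  # use the fast-converging side
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b
```

For a = b = 1 this reduces to the uniform CDF, so `betainc(1.0, 1.0, 0.3)` is 0.3; production implementations such as Boost.Math add the careful argument-range handling and error policies that make these kernels robust over the whole parameter space, which is exactly what one would be vendoring.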
> > Best regards, > Hans > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev From andrea.cortis at gmail.com Sat Feb 13 14:35:44 2021 From: andrea.cortis at gmail.com (Andrea Cortis) Date: Sat, 13 Feb 2021 13:35:44 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: I am no mathematician and cannot comment on the equivalence of quaternions vs hyper-dual numbers, even though they seem quite different to me at a first glance. For sure, I know that the algebra of dual-numbers is different from the algebra of complex numbers. It seems to me therefore that having a numpy `dtype=dual` would be extremely advantageous when looking to construct *exact* values of first (and n-th) derivatives. Say that one would want to replace automatic differentiation for backpropagation neural networks like in this blogpost https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ then we would have a plethora of different implementations, say for pytorch, tensorflow, etc. and no unifying framework. Best, Andre On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski wrote: > Ah, these. Suspected it, but wanted to make sure. > > IMO, these are best implemented as a numpy dtype. I'm biased though > --- here's a branch which makes a start, based on Mike Boyle's version > of quaternion dtype (Mike and other authors, if you're reading this > --- thanks a ton!) > > https://github.com/ev-br/quaternion/tree/dual > > Now, I don't think scipy should carry around additional numpy dtypes. > Cannot speak for the numpy project, but I strongly suspect this is > best implemented as a separate repository/project. > > Cheers, > > Evgeni > > On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz > wrote: > > > > I directed Andrea here from Stack Overflow ( > https://stackoverflow.com/q/66179855/2988730). 
Based on the Wikipedia > article (https://en.m.wikipedia.org/wiki/Dual_number), it seems like > scipy is a much more likely place to look than numpy. > > > > - Joe > > > > > > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski > wrote: > >> > >> Hi Andrea, welcome! > >> > >> Since you're asking about numpy, you likely want the numpy-discussion > >> mailing list. (the overlap is non-zero, but nevertheless). > >> > >> Out of curiosity, what's a hyper-dual type? > >> > >> Cheers, > >> > >> Evgeni > >> > >> On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis > wrote: > >> > > >> > Hello, first time here. I was wondering if there are plans for the > definition of a ?hyper-dual? type in numpy. I think that would be most > useful for neural nets training, and optimization in general. > >> > > >> > Andrea > >> > > >> > Sent from my iPad > >> > _______________________________________________ > >> > SciPy-Dev mailing list > >> > SciPy-Dev at python.org > >> > https://mail.python.org/mailman/listinfo/scipy-dev > >> _______________________________________________ > >> SciPy-Dev mailing list > >> SciPy-Dev at python.org > >> https://mail.python.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at python.org > > https://mail.python.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.cortis at gmail.com Sat Feb 13 14:46:12 2021 From: andrea.cortis at gmail.com (Andrea Cortis) Date: Sat, 13 Feb 2021 13:46:12 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: Anyways, should I then move this conversation to the numpy-dev mailing list? 
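The exact-derivative property of dual numbers discussed above is easy to demonstrate with a toy, pure-Python sketch (the `Dual` class below is hypothetical and purely illustrative; it is not the numpy dtype under discussion):

```python
class Dual:
    """Toy dual number a + b*eps, with the defining rule eps**2 == 0."""

    def __init__(self, re, eps=0.0):
        self.re, self.eps = float(re), float(eps)

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.re + other.re, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps) * (c + d*eps) = a*c + (a*d + b*c)*eps, since eps**2 == 0
        return Dual(self.re * other.re,
                    self.re * other.eps + self.eps * other.re)

    __rmul__ = __mul__


def derivative(f, x):
    # Evaluating f at x + 1*eps carries f'(x) in the eps coefficient.
    return f(Dual(x, 1.0)).eps


# f(x) = x**3 + 2*x has f'(2) = 3*4 + 2 = 14, recovered exactly:
assert derivative(lambda x: x * x * x + 2 * x, 2.0) == 14.0
```

Unlike finite differences there is no step size and hence no truncation error; the eps coefficient is f'(x) exactly, up to ordinary floating-point rounding.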
On Sat, Feb 13, 2021 at 1:35 PM Andrea Cortis wrote: > I am no mathematician and cannot comment on the equivalence of quaternions > vs hyper-dual numbers, even though they seem quite different to me at a > first glance. > > For sure, I know that the algebra of dual-numbers is different from the > algebra of complex numbers. > > It seems to me therefore that having a numpy `dtype=dual` would be > extremely advantageous when looking to construct *exact* values of first > (and n-th) derivatives. > Say that one would want to replace automatic differentiation for > backpropagation neural networks like in this blogpost > > > https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ > > then we would have a plethora of different implementations, say for > pytorch, tensorflow, etc. and no unifying framework. > > Best, > > Andre > > > > > On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski < > evgeny.burovskiy at gmail.com> wrote: > >> Ah, these. Suspected it, but wanted to make sure. >> >> IMO, these are best implemented as a numpy dtype. I'm biased though >> --- here's a branch which makes a start, based on Mike Boyle's version >> of quaternion dtype (Mike and other authors, if you're reading this >> --- thanks a ton!) >> >> https://github.com/ev-br/quaternion/tree/dual >> >> Now, I don't think scipy should carry around additional numpy dtypes. >> Cannot speak for the numpy project, but I strongly suspect this is >> best implemented as a separate repository/project. >> >> Cheers, >> >> Evgeni >> >> On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz >> wrote: >> > >> > I directed Andrea here from Stack Overflow ( >> https://stackoverflow.com/q/66179855/2988730). Based on the Wikipedia >> article (https://en.m.wikipedia.org/wiki/Dual_number), it seems like >> scipy is a much more likely place to look than numpy. 
>> > >> > - Joe >> > >> > >> > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski >> wrote: >> >> >> >> Hi Andrea, welcome! >> >> >> >> Since you're asking about numpy, you likely want the numpy-discussion >> >> mailing list. (the overlap is non-zero, but nevertheless). >> >> >> >> Out of curiosity, what's a hyper-dual type? >> >> >> >> Cheers, >> >> >> >> Evgeni >> >> >> >> On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis >> wrote: >> >> > >> >> > Hello, first time here. I was wondering if there are plans for the >> definition of a ?hyper-dual? type in numpy. I think that would be most >> useful for neural nets training, and optimization in general. >> >> > >> >> > Andrea >> >> > >> >> > Sent from my iPad >> >> > _______________________________________________ >> >> > SciPy-Dev mailing list >> >> > SciPy-Dev at python.org >> >> > https://mail.python.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> >> SciPy-Dev mailing list >> >> SciPy-Dev at python.org >> >> https://mail.python.org/mailman/listinfo/scipy-dev >> > >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at python.org >> > https://mail.python.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Sat Feb 13 14:47:34 2021 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 13 Feb 2021 22:47:34 +0300 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: Sure, these are very different. The use of the quaternions package here is only that it implements the machinery. All I did was to change the multiplication table from the quaternion one to the dual number one. 
And it sort of works --- the branch above lets you define arrays with `dtype=dual_number` and do basic arithmetic. Making it usable is reasonably straightforward, but I did not manage to find time since last November. Help's welcome --- even checking out the branch and trying corner cases in arithmetic (there certainly are bugs) would be helpful. On Sat, Feb 13, 2021 at 10:36 PM Andrea Cortis wrote: > > I am no mathematician and cannot comment on the equivalence of quaternions vs hyper-dual numbers, even though they seem quite different to me at a first glance. > > For sure, I know that the algebra of dual-numbers is different from the algebra of complex numbers. > > It seems to me therefore that having a numpy `dtype=dual` would be extremely advantageous when looking to construct *exact* values of first (and n-th) derivatives. > Say that one would want to replace automatic differentiation for backpropagation neural networks like in this blogpost > > https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ > > then we would have a plethora of different implementations, say for pytorch, tensorflow, etc. and no unifying framework. > > Best, > > Andre > > > > > On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski wrote: >> >> Ah, these. Suspected it, but wanted to make sure. >> >> IMO, these are best implemented as a numpy dtype. I'm biased though >> --- here's a branch which makes a start, based on Mike Boyle's version >> of quaternion dtype (Mike and other authors, if you're reading this >> --- thanks a ton!) >> >> https://github.com/ev-br/quaternion/tree/dual >> >> Now, I don't think scipy should carry around additional numpy dtypes. >> Cannot speak for the numpy project, but I strongly suspect this is >> best implemented as a separate repository/project.
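To make the "multiplication table" remark concrete: dual numbers multiply as (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps because eps**2 = 0, and hyper-dual numbers add a second nilpotent unit so that exact second derivatives come out as well. A minimal pure-Python sketch (the `HyperDual` class is hypothetical and for illustration only; the branch above implements the dual-number case as a C-level dtype):

```python
class HyperDual:
    """Toy hyper-dual number f + e1*eps1 + e2*eps2 + e12*eps1*eps2,
    with eps1**2 == eps2**2 == 0 but eps1*eps2 != 0."""

    def __init__(self, f, e1=0.0, e2=0.0, e12=0.0):
        self.f, self.e1, self.e2, self.e12 = (float(f), float(e1),
                                              float(e2), float(e12))

    def __add__(self, other):
        other = other if isinstance(other, HyperDual) else HyperDual(other)
        return HyperDual(self.f + other.f, self.e1 + other.e1,
                         self.e2 + other.e2, self.e12 + other.e12)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, HyperDual) else HyperDual(other)
        # The "multiplication table": all terms with eps1**2 or eps2**2 vanish.
        return HyperDual(
            self.f * other.f,
            self.f * other.e1 + self.e1 * other.f,
            self.f * other.e2 + self.e2 * other.f,
            self.f * other.e12 + self.e1 * other.e2
            + self.e2 * other.e1 + self.e12 * other.f)

    __rmul__ = __mul__


def first_and_second_derivative(f, x):
    # f(x + eps1 + eps2) = f + f'*eps1 + f'*eps2 + f''*eps1*eps2, exactly.
    h = f(HyperDual(x, 1.0, 1.0, 0.0))
    return h.e1, h.e12


# f(x) = x**3 has f'(2) = 12 and f''(2) = 12, both recovered exactly:
assert first_and_second_derivative(lambda x: x * x * x, 2.0) == (12.0, 12.0)
```

Swapping in a different multiplication rule is literally all that separates this from the quaternion case, which is why reusing the quaternion dtype's machinery is such a natural starting point.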
>> >> Cheers, >> >> Evgeni >> >> On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz >> wrote: >> > >> > I directed Andrea here from Stack Overflow (https://stackoverflow.com/q/66179855/2988730). Based on the Wikipedia article (https://en.m.wikipedia.org/wiki/Dual_number), it seems like scipy is a much more likely place to look than numpy. >> > >> > - Joe >> > >> > >> > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski wrote: >> >> >> >> Hi Andrea, welcome! >> >> >> >> Since you're asking about numpy, you likely want the numpy-discussion >> >> mailing list. (the overlap is non-zero, but nevertheless). >> >> >> >> Out of curiosity, what's a hyper-dual type? >> >> >> >> Cheers, >> >> >> >> Evgeni >> >> >> >> On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis wrote: >> >> > >> >> > Hello, first time here. I was wondering if there are plans for the definition of a ?hyper-dual? type in numpy. I think that would be most useful for neural nets training, and optimization in general. >> >> > >> >> > Andrea >> >> > >> >> > Sent from my iPad >> >> > _______________________________________________ >> >> > SciPy-Dev mailing list >> >> > SciPy-Dev at python.org >> >> > https://mail.python.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> >> SciPy-Dev mailing list >> >> SciPy-Dev at python.org >> >> https://mail.python.org/mailman/listinfo/scipy-dev >> > >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at python.org >> > https://mail.python.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev From sebastian at sipsolutions.net Sat Feb 13 15:09:13 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 13 
Feb 2021 14:09:13 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> On Sat, 2021-02-13 at 22:47 +0300, Evgeni Burovski wrote: > Sure, these are very different. The use of the quaternions package > here is only that it implements the machinery. All I did was to > change > the multiplication table from the quaternion one to the dual number It looks like someone has already done it: https://github.com/anandpratap/PyHD As to NumPy, I doubt it should be part of NumPy; it seems far from "basic". Should this be in SciPy? My opinion is: maybe, but quite certainly not right now. The long story is that I am doing a pretty big revamp of how NumPy does DTypes [1]. That will remove a lot of quirks and limitations. But those quirks should not be prohibitive for a hyper-dual number dtype (and probably are not, I suspect that project above just works fairly well!). Should SciPy be in the business of providing new dtypes? Honestly, I hope that the answer may be "yes" at least after NumPy has this new API available and it has been proven/smoothed out. If a dtype is useful enough in general SciPy terms of course. At this time, I think it is probably best done in stand-alone packages for a while longer and hope that is not actually a limitation at all. Cheers, Sebastian [1] https://numpy.org/neps/nep-0041-improved-dtype-support.html > one. And it sort of works --- the branch above lets you define arrays > with `dype=dual_number` and do basic arithmetics. Making it usable is > reasonably straightforward, but I did not manage to find time since > last November. > > Help's welcome --- even checking out the branch and trying corner > cases in arithmetics (there certainly are bugs) would be helpful.
> > On Sat, Feb 13, 2021 at 10:36 PM Andrea Cortis < > andrea.cortis at gmail.com> wrote: > > > > I am no mathematician and cannot comment on the equivalence of > > quaternions vs hyper-dual numbers, even though they seem quite > > different to me at a first glance. > > > > For sure, I know that the algebra of dual-numbers is different from > > the algebra of complex numbers. > > > > It seems to me therefore that having a numpy `dtype=dual` would be > > extremely advantageous when looking to construct *exact* values of > > first (and n-th) derivatives. > > Say that one would want to replace automatic differentiation for > > backpropagation neural networks like in this blogpost > > > > https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ > > > > then we would have a plethora of different implementations, say for > > pytorch, tensorflow, etc. and no unifying framework. > > > > Best, > > > > Andre > > > > > > > > > > On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski < > > evgeny.burovskiy at gmail.com> wrote: > > > > > > Ah, these. Suspected it, but wanted to make sure. > > > > > > IMO, these are best implemented as a numpy dtype. I'm biased > > > though > > > --- here's a branch which makes a start, based on Mike Boyle's > > > version > > > of quaternion dtype (Mike and other authors, if you're reading > > > this > > > --- thanks a ton!) > > > > > > https://github.com/ev-br/quaternion/tree/dual > > > > > > Now, I don't think scipy should carry around additional numpy > > > dtypes. > > > Cannot speak for the numpy project, but I strongly suspect this > > > is > > > best implemented as a separate repository/project. > > > > > > Cheers, > > > > > > Evgeni > > > > > > On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz > > > wrote: > > > > > > > > I directed Andrea here from Stack Overflow ( > > > > https://stackoverflow.com/q/66179855/2988730). 
Based on the > > > > Wikipedia article > > > > (https://en.m.wikipedia.org/wiki/Dual_number), it seems like > > > > scipy is a much more likely place to look than numpy. > > > > > > > > - Joe > > > > > > > > > > > > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski < > > > > evgeny.burovskiy at gmail.com> wrote: > > > > > > > > > > Hi Andrea, welcome! > > > > > > > > > > Since you're asking about numpy, you likely want the numpy- > > > > > discussion > > > > > mailing list. (the overlap is non-zero, but nevertheless). > > > > > > > > > > Out of curiosity, what's a hyper-dual type? > > > > > > > > > > Cheers, > > > > > > > > > > Evgeni > > > > > > > > > > On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis < > > > > > andrea.cortis at gmail.com> wrote: > > > > > > > > > > > > Hello, first time here. I was wondering if there are plans > > > > > > for the definition of a ?hyper-dual? type in numpy. I think > > > > > > that would be most useful for neural nets training, and? > > > > > > optimization in general. 
> > > > > > > > > > > > > > Andrea > > > > > > > > > > > > > > Sent from my iPad -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From andrea.cortis at gmail.com Sat Feb 13 17:24:58 2021 From: andrea.cortis at gmail.com (Andrea Cortis) Date: Sat, 13 Feb 2021 16:24:58 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> References: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> Message-ID: I did try to port PyHD to python3 with 2to3. There are 38 warnings upon compiling (macOS Catalina), but when I try to run I get ----> 1 from hyperdual import numpy_hyperdual ValueError: Failed to register dtype for <class 'hyperdual.hyperdual'>: Legacy user dtypes using `NPY_ITEM_IS_POINTER` or `NPY_ITEM_REFCOUNT` are unsupported.
It is possible to create such a dtype only if it is a structured dtype with names and fields hardcoded at registration time. Please contact the NumPy developers if this used to work but now fails. On Sat, Feb 13, 2021 at 2:09 PM Sebastian Berg wrote: > On Sat, 2021-02-13 at 22:47 +0300, Evgeni Burovski wrote: > > Sure, these are very different. The use of the quaternions package > > here is only that it implements the machinery. All I did was to > > change > > the multiplication table from the quaternion one to the dual number > > It looks like someone has already done it: > https://github.com/anandpratap/PyHD > > As to NumPy, I doubt it should be part of NumPy it seems far from > "basic". Should this be in SciPy? My opinion is: maybe, but quite > certainly not right now. > > The long story is, that I am doing a pretty big revamp of how NumPy > does DTypes [1]. That will remove a lot of quirks and limitations. But > those quirks should not be forbidding for a hyper-dual number dtype > (and probably are not, I suspect that project above just works fairly > well!). > > Should SciPy be in the business of providing new dtypes? Honestly, I > hope that the answer may be "yes" at least after NumPy has this new API > available and it has been proven/smoothened out. If a dtype is useful > enough in general SciPy terms of course. > At this time, I think it is probably best to do in stand-alone packages > for a while longer and hope that is not actually a limitation at all. > > Cheers, > > Sebastian > > > > [1] https://numpy.org/neps/nep-0041-improved-dtype-support.html > > > > > one. And it sort of works --- the branch above lets you define arrays > > with `dype=dual_number` and do basic arithmetics. Making it usable is > > reasonably straightforward, but I did not manage to find time since > > last November. > > > > Help's welcome --- even checking out the branch and trying corner > > cases in arithmetics (there certainly are bugs) would be helpful. 
> > > > On Sat, Feb 13, 2021 at 10:36 PM Andrea Cortis < > > andrea.cortis at gmail.com> wrote: > > > > > > I am no mathematician and cannot comment on the equivalence of > > > quaternions vs hyper-dual numbers, even though they seem quite > > > different to me at a first glance. > > > > > > For sure, I know that the algebra of dual-numbers is different from > > > the algebra of complex numbers. > > > > > > It seems to me therefore that having a numpy `dtype=dual` would be > > > extremely advantageous when looking to construct *exact* values of > > > first (and n-th) derivatives. > > > Say that one would want to replace automatic differentiation for > > > backpropagation neural networks like in this blogpost > > > > > > > https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ > > > > > > then we would have a plethora of different implementations, say for > > > pytorch, tensorflow, etc. and no unifying framework. > > > > > > Best, > > > > > > Andre > > > > > > > > > > > > > > > On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski < > > > evgeny.burovskiy at gmail.com> wrote: > > > > > > > > Ah, these. Suspected it, but wanted to make sure. > > > > > > > > IMO, these are best implemented as a numpy dtype. I'm biased > > > > though > > > > --- here's a branch which makes a start, based on Mike Boyle's > > > > version > > > > of quaternion dtype (Mike and other authors, if you're reading > > > > this > > > > --- thanks a ton!) > > > > > > > > https://github.com/ev-br/quaternion/tree/dual > > > > > > > > Now, I don't think scipy should carry around additional numpy > > > > dtypes. > > > > Cannot speak for the numpy project, but I strongly suspect this > > > > is > > > > best implemented as a separate repository/project. 
> > > > > > > > Cheers, > > > > > > > > Evgeni > > > > > > > > On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz > > > > wrote: > > > > > > > > > > I directed Andrea here from Stack Overflow ( > > > > > https://stackoverflow.com/q/66179855/2988730). Based on the > > > > > Wikipedia article > > > > > (https://en.m.wikipedia.org/wiki/Dual_number), it seems like > > > > > scipy is a much more likely place to look than numpy. > > > > > > > > > > - Joe > > > > > > > > > > > > > > > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski < > > > > > evgeny.burovskiy at gmail.com> wrote: > > > > > > > > > > > > Hi Andrea, welcome! > > > > > > > > > > > > Since you're asking about numpy, you likely want the numpy- > > > > > > discussion > > > > > > mailing list. (the overlap is non-zero, but nevertheless). > > > > > > > > > > > > Out of curiosity, what's a hyper-dual type? > > > > > > > > > > > > Cheers, > > > > > > > > > > > > Evgeni > > > > > > > > > > > > On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis < > > > > > > andrea.cortis at gmail.com> wrote: > > > > > > > > > > > > > > Hello, first time here. I was wondering if there are plans > > > > > > > for the definition of a ?hyper-dual? type in numpy. I think > > > > > > > that would be most useful for neural nets training, and > > > > > > > optimization in general. 
> > > > > > > > > > > > > > Andrea > > > > > > > > > > > > > > Sent from my iPad > > > > > > > _______________________________________________ > > > > > > > SciPy-Dev mailing list > > > > > > > SciPy-Dev at python.org > > > > > > > https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > _______________________________________________ > > > > > > SciPy-Dev mailing list > > > > > > SciPy-Dev at python.org > > > > > > https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > > > > > _______________________________________________ > > > > > SciPy-Dev mailing list > > > > > SciPy-Dev at python.org > > > > > https://mail.python.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > > > > SciPy-Dev mailing list > > > > SciPy-Dev at python.org > > > > https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > _______________________________________________ > > > SciPy-Dev mailing list > > > SciPy-Dev at python.org > > > https://mail.python.org/mailman/listinfo/scipy-dev > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at python.org > > https://mail.python.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sebastian at sipsolutions.net Sat Feb 13 17:37:13 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 13 Feb 2021 16:37:13 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> Message-ID: On Sat, 2021-02-13 at 16:24 -0600, Andrea Cortis wrote: > I did try to port PyHD to python3 with 2to3 > There are 38 warnings upon compiling (macos Catalina), but then when > I try > to run I get > > ----> 1 from hyperdual import numpy_hyperdual > > ValueError: Failed to register dtype for <class 'hyperdual.hyperdual'>: > Legacy user dtypes using `NPY_ITEM_IS_POINTER` or `NPY_ITEM_REFCOUNT` > are unsupported. It is possible to create such a dtype only if it is > a > structured dtype with names and fields hardcoded at registration > time. > Please contact the NumPy developers if this used to work but now > fails. > Sounds like a simple bug: the code fails to properly initialize the descriptor flags, which likely just means zeroing them out. (NumPy was really designed around being passed a static version of the struct, although nothing wrong with the design here; it just will never be cleaned up). - Sebastian > > On Sat, Feb 13, 2021 at 2:09 PM Sebastian Berg < > sebastian at sipsolutions.net> > wrote: > > > On Sat, 2021-02-13 at 22:47 +0300, Evgeni Burovski wrote: > > > Sure, these are very different. The use of the quaternions > > > package > > > here is only that it implements the machinery. All I did was to > > > change > > > the multiplication table from the quaternion one to the dual > > > number > > > > It looks like someone has already done it: > > https://github.com/anandpratap/PyHD > > > > As to NumPy, I doubt it should be part of NumPy it seems far from > > "basic". Should this be in SciPy? My opinion is: maybe, but quite > > certainly not right now. > > > > The long story is, that I am doing a pretty big revamp of how NumPy > > does DTypes [1].
That will remove a lot of quirks and limitations. > > But > > those quirks should not be forbidding for a hyper-dual number dtype > > (and probably are not, I suspect that project above just works > > fairly > > well!). > > > > Should SciPy be in the business of providing new dtypes? Honestly, > > I > > hope that the answer may be "yes" at least after NumPy has this new > > API > > available and it has been proven/smoothened out.? If a dtype is > > useful > > enough in general SciPy terms of course. > > At this time, I think it is probably best to do in stand-alone > > packages > > for a while longer and hope that is not actually a limitation at > > all. > > > > Cheers, > > > > Sebastian > > > > > > > > [1] https://numpy.org/neps/nep-0041-improved-dtype-support.html > > > > > > > > > one. And it sort of works --- the branch above lets you define > > > arrays > > > with `dype=dual_number` and do basic arithmetics. Making it > > > usable is > > > reasonably straightforward, but I did not manage to find time > > > since > > > last November. > > > > > > Help's welcome --- even checking out the branch and trying corner > > > cases in arithmetics (there certainly are bugs) would be helpful. > > > > > > On Sat, Feb 13, 2021 at 10:36 PM Andrea Cortis < > > > andrea.cortis at gmail.com> wrote: > > > > > > > > I am no mathematician and cannot comment on the equivalence of > > > > quaternions vs hyper-dual numbers, even though they seem quite > > > > different to me at a first glance. > > > > > > > > For sure, I know that the algebra of dual-numbers is different > > > > from > > > > the algebra of complex numbers. > > > > > > > > It seems to me therefore that having a numpy `dtype=dual` would > > > > be > > > > extremely advantageous when looking to construct *exact* values > > > > of > > > > first (and n-th) derivatives. 
> > > > Say that one would want to replace automatic differentiation > > > > for > > > > backpropagation neural networks like in this blogpost > > > > > > > > > > https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ > > > > > > > > then we would have a plethora of different implementations, say > > > > for > > > > pytorch, tensorflow, etc. and no unifying framework. > > > > > > > > Best, > > > > > > > > Andre > > > > > > > > > > > > > > > > > > > > On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski < > > > > evgeny.burovskiy at gmail.com> wrote: > > > > > > > > > > Ah, these. Suspected it, but wanted to make sure. > > > > > > > > > > IMO, these are best implemented as a numpy dtype. I'm biased > > > > > though > > > > > --- here's a branch which makes a start, based on Mike > > > > > Boyle's > > > > > version > > > > > of quaternion dtype (Mike and other authors, if you're > > > > > reading > > > > > this > > > > > --- thanks a ton!) > > > > > > > > > > https://github.com/ev-br/quaternion/tree/dual > > > > > > > > > > Now, I don't think scipy should carry around additional numpy > > > > > dtypes. > > > > > Cannot speak for the numpy project, but I strongly suspect > > > > > this > > > > > is > > > > > best implemented as a separate repository/project. > > > > > > > > > > Cheers, > > > > > > > > > > Evgeni > > > > > > > > > > On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz > > > > > wrote: > > > > > > > > > > > > I directed Andrea here from Stack Overflow ( > > > > > > https://stackoverflow.com/q/66179855/2988730). Based on the > > > > > > Wikipedia article > > > > > > (https://en.m.wikipedia.org/wiki/Dual_number), it seems > > > > > > like > > > > > > scipy is a much more likely place to look than numpy. 
> > > > > > > > > > > > - Joe > > > > > > > > > > > > > > > > > > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski < > > > > > > evgeny.burovskiy at gmail.com> wrote: > > > > > > > > > > > > > > Hi Andrea, welcome! > > > > > > > > > > > > > > Since you're asking about numpy, you likely want the > > > > > > > numpy- > > > > > > > discussion > > > > > > > mailing list. (the overlap is non-zero, but > > > > > > > nevertheless). > > > > > > > > > > > > > > Out of curiosity, what's a hyper-dual type? > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > Evgeni > > > > > > > > > > > > > > On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis < > > > > > > > andrea.cortis at gmail.com> wrote: > > > > > > > > > > > > > > > > Hello, first time here. I was wondering if there are > > > > > > > > plans > > > > > > > > for the definition of a "hyper-dual" type in numpy. I > > > > > > > > think > > > > > > > > that would be most useful for neural nets training, and > > > > > > > > optimization in general. 
> > > > > > > > > Andrea > > > > > > > > > Sent from my iPad > > > > > > > > > _______________________________________________ > > > > > > > > > SciPy-Dev mailing list > > > > > > > > > SciPy-Dev at python.org > > > > > > > > > https://mail.python.org/mailman/listinfo/scipy-dev -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From andrea.cortis at gmail.com Sat Feb 13 17:52:07 2021 From: andrea.cortis at gmail.com (Andrea Cortis) Date: Sat, 13 Feb 2021 16:52:07 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> Message-ID: I will report it as such to the author On Sat, Feb 13, 2021 at 4:37 PM Sebastian Berg wrote: > On Sat, 2021-02-13 at 16:24 -0600, Andrea Cortis wrote: > > I did try to port PyHD to python3 with 2to3 > > There are 38 warnings upon compiling (macos Catalina), but then when > > I try > > to run I get > > > > ----> 1 from hyperdual import numpy_hyperdual > > > > ValueError: Failed to register dtype for <class 'hyperdual.hyperdual'>: > > Legacy user dtypes using `NPY_ITEM_IS_POINTER` or `NPY_ITEM_REFCOUNT` > > are unsupported. It is possible to create such a dtype only if it is > > a > > structured dtype with names and fields hardcoded at registration > > time. > > Please contact the NumPy developers if this used to work but now > > fails. > > > > Sounds like a simple bug, the code fails to properly initialize the > descriptor flags, which likely just means zero'ing them out. > (NumPy was really designed with being passed a static version of the > struct, although nothing wrong with the design here -- it just will > never be cleaned up). > > - Sebastian > > > > > > On Sat, Feb 13, 2021 at 2:09 PM Sebastian Berg < > > sebastian at sipsolutions.net> > > wrote: > > > > > On Sat, 2021-02-13 at 22:47 +0300, Evgeni Burovski wrote: > > > > Sure, these are very different. The use of the quaternions > > > > package > > > > here is only that it implements the machinery. 
All I did was to change the multiplication table from the quaternion one to the dual number one. > > > > [...] > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From andyfaff at gmail.com Sat Feb 13 20:17:05 2021 From: andyfaff at gmail.com (Andrew Nelson) Date: Sun, 14 Feb 2021 12:17:05 +1100 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> Message-ID: IIUC The Jax package uses dual numbers to a great extent, particularly for autodifferentiation. Autodifferentiation is great for providing gradients and Jacobians in optimisation. On Sun, 14 Feb 2021, 09:53 Andrea Cortis, wrote: > I will report it as such to the author > > [...] > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > 
https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tyler.je.reddy at gmail.com Sun Feb 14 09:17:19 2021 From: tyler.je.reddy at gmail.com (Tyler Reddy) Date: Sun, 14 Feb 2021 07:17:19 -0700 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Probably good to make sure that aarch64 build times remain relatively stable since I think we're already doing quite a few gymnastics on that platform for wheel builds (and we're quite limited on iterations with Travis CI resources there). On Sat, 13 Feb 2021 at 12:32, Evgeni Burovski wrote: > Hi, > > Borrowing from Boost.Math sounds great indeed. (Great if it seems > advantageous by boost devs, too). > There is really no reason to keep using parts of e.g. cdflib which are > superseded by Boost.Math. > > However, playing devil's advocate somewhat: > > - does the scipy PR need the whole Boost.Math? If it only needs a > select subset (e.g., do we need root-finding etc?), then maybe the > size can be reduced. > - do we need the whole thing? e.g. ufunc loops only need a select > subset of types. > - if we do go this route of taking parts / applying scipy specific > patches, what is easier to do or better maintenance-wise: vendor > original code + patches, or do the work once by porting relevant parts > to standalone C or C++ subset? > > Obviously, all these should be weighted with other implications of > adding a dependency. The immediate concerns are distribution size and > build times. > > Cheers, > > Evgeni > > On Sat, Feb 13, 2021 at 4:06 PM Hans Dembinski > wrote: > > > > Hi Nicholas, > > > > as a Boost developer (I wrote Boost.Histogram and contributed to several > other Boost libs), I think it would be great to build SciPy on Boost.Math, > it is a win-win. > > > > > On 12. 
Feb 2021, at 22:50, Nicholas McKibben > wrote: > > > > The initial PR includes the zipped Boost headers only (~24MB zipped), > but adding Boost as a submodule might be a more maintainable approach if > changes to Boost need to be made in the future. > > Including it as a submodule seems like a good approach. > > > Inclusion of the entire Boost library is a virtual necessity for the > Boost.Math module. Manual attempts to strip away unnecessary files and bcp > (Boost's utility to provide stripped down installations) fail to create > smaller sizes. > > I was a bit shocked to hear this, but you are right: > > https://pdimov.github.io/boostdep-report/master/math.html > > Math depends on everything. > > We have a long-term goal to reduce the coupling between Boost libs, but > this also incurs costs. Library maintainers then have to copy the relevant > bits from other Boost libraries to not depend on them, which is actually a > terrible idea: you lose the synergies offered by a rich shared code base. > In my view, the coupling is not a bug, it is a feature. > > It is impressive to see how you use generators to create the binding > code in Cython. I had a lot of trouble with Cython as it does not support > all C++ features. The best way to wrap (modern) C++ is pybind11, which is a > painless experience. It does the code generation at compile-time with TMP. > > Best regards, > > Hans > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at python.org > > https://mail.python.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Sun Feb 14 18:01:27 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 15 Feb 2021 00:01:27 +0100 Subject: [SciPy-Dev] GSoC'21 participation SciPy Message-ID: Hi all, It's GSoC time soon! I'd like to participate again under the PSF umbrella this year. The GSoC project durations have been halved, which means they're also more manageable to mentor. They're 175 hours now. For more details, see https://summerofcode.withgoogle.com/. Also worth mentioning is that NumPy decided not to participate, but a few people on the NumPy team are interested in helping mentor on a SciPy project. We've also been discussing an initiative around mentoring of people new to open source from under-represented backgrounds with the NumPy team, and came to the conclusion that it'd be better to use SciPy project ideas and help mentor them. So if you have ideas that don't exactly fit the size or timeline for GSoC, please still propose them since they may be suitable for that initiative. We have a few more weeks till the deadline to sign up, so it'd be great to get some ideas and potential mentors onto this ideas page: https://github.com/scipy/scipy/wiki/GSoC-2021-project-ideas. Any thoughts, good project ideas, or volunteers to mentor? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Sun Feb 14 19:47:14 2021 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Sun, 14 Feb 2021 19:47:14 -0500 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: On Fri, Feb 12, 2021 at 4:50 PM Nicholas McKibben wrote: > Hi all, > > Many stats distributions in SciPy have outstanding issues with difficult > solutions in legacy code. We've been working on replacing existing > statistical distributions with those found in Boost.Math. 
The initial > implementation resolves almost a dozen issues for scipy.stats with > potential for resolving several more in scipy.stats and scipy.special in > future PRs. > > Initial PR: https://github.com/scipy/scipy/pull/1332 > > This PR includes the ability to easily add Boost functionality through > generated ufuncs. > > Boost is a large library and would incur the cost of one of the following: > - an additional dependency (e.g. boostinator > https://github.com/mckib2/boostinator) that outsources the packaging of > the Boost libraries > - the inclusion of Boost within SciPy either as a "clone and own" or > submodule > > The initial PR includes the zipped Boost headers only (~24MB zipped), but > adding Boost as a submodule might be a more maintainable approach if > changes to Boost need to be made in the future. > > Inclusion of the entire Boost library is a virtual necessity for the > Boost.Math module. Manual attempts to strip away unnecessary files and bcp > (Boost's utility to provide stripped down installations) fail to create > smaller sizes. The increase in size would be similar to the following: > > SciPy master repo ~177 MB > Boost branch: ~221 MB > > Built: ~939 MB > Built With Boost: ~1090 MB > > Wheel size should not be significantly impacted because Boost is used as a > header-only library. > > I have no relationship with the Boost libraries other than as a user and > bug reporter. I find them to be impressive and well-maintained with > tremendous support from both industry and open source developers. SciPy > would benefit from the efficient, well-tested and maintained > implementations of stats and special algorithms. > > Thanks, > Nicholas > Thanks Nick, this is a long overdue enhancement for SciPy. For years, we've been fixing bugs in `stats` and `special` for functions that have high quality, thoroughly tested and license-compatible implementations in Boost. Indeed, we often check our results against Boost. 
I think it will be worth the effort of working out the interface issues and the packaging issues to allow SciPy to take advantage of the excellent code in Boost. The pull request focuses on using Boost functions to improve the implementations of some of the probability distributions in `stats`, but in the long term, we should take advantage of Boost as much as we can. The next obvious module is `special`, where it looks like we can close a bunch of issues by replacing the existing implementation of a function with the Boost version (e.g. `erfinv` [gh-12758], `betaincinv` [gh-12796], `jn_zeros` [gh-4690]). There are many other parts of Boost that would be great to have available in SciPy. Here are a few (based on a browse through the Boost docs; I don't know if wrapping any of these would have insurmountable technical obstacles): * ODE solvers; in particular, Boost has symplectic solvers, which are not currently available in SciPy (gh-12690). * Interpolators: it looks like Boost has a few interpolators that are not currently in SciPy. * The Boost histogram library might provide some benefits over the existing NumPy and SciPy options. (Hans Dembinski, the author of the histogram library, has already commented in this email thread.) Warren _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Mon Feb 15 02:26:16 2021 From: andyfaff at gmail.com (Andrew Nelson) Date: Mon, 15 Feb 2021 18:26:16 +1100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: On Mon, 15 Feb 2021 at 11:48, Warren Weckesser wrote: > > Thanks Nick, this is a long overdue enhancement for SciPy. > It seems that these libraries make maintenance a lot simpler and close a lot of issues. My questions would be: - how portable is the boost code in general? 
- how easy is it to install the library. In concept it doesn't seem all that dissimilar to the effort that needs to be done with installing a BLAS library when building from source. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hans.dembinski at gmail.com Mon Feb 15 07:14:45 2021 From: hans.dembinski at gmail.com (Hans Dembinski) Date: Mon, 15 Feb 2021 13:14:45 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: > On 15. Feb 2021, at 08:26, Andrew Nelson wrote: > > My questions would be: > > - how portable is the boost code in general? It is very portable. The core goal of Boost is to offer implementations with quality and portability on par with the C++ standard library implementations. Non-portable extensions are sometimes used to speed up things, but there is always a standard compliant vanilla version. In practice, maintainers test portability with CI on Windows, OSX, Linux, using various versions of gcc, clang, msvc, intel, see e.g. https://github.com/boostorg/math/blob/develop/.github/workflows/ci.yml and the Boost build farm from the days before free CI for OSS was easily available, https://www.boost.org/development/tests/master/developer/move.html Not all compilers/platforms are fully compliant, of course. Boost uses workarounds to combat that and submits bug reports on the compiler bug trackers. > - how easy is it to install the library. As Nicholas mentioned, Boost.Math (and Boost.Histogram) is header-only, so it is sufficient to include the headers. Best regards, Hans From ndbecker2 at gmail.com Mon Feb 15 07:23:38 2021 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 15 Feb 2021 07:23:38 -0500 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: I have been using pybind11 (and its predecessor, boost::python) to package c++ code for python use for many years, including some of boost libraries. 
pybind11 is easy to use and is much better than e.g., cython for packaging c++ code. pybind11 is also header-only. I would also like to call attention for anyone interested in scientific software and c++ to a wonderful library (header-only), xtensor https://xtensor.readthedocs.io/en/latest/ On Mon, Feb 15, 2021 at 7:15 AM Hans Dembinski wrote: > > > > On 15. Feb 2021, at 08:26, Andrew Nelson wrote: > > > > My questions would be: > > > > - how portable is the boost code in general? > > It is very portable. The core goal of Boost is to offer implementations with quality and portability on par with the C++ standard library implementations. Non-portable extensions are sometimes used to speed up things, but there is always a standard compliant vanilla version. In practice, maintainers test portability with CI on Windows, OSX, Linux, using various versions of gcc, clang, msvc, intel, see e.g. > > https://github.com/boostorg/math/blob/develop/.github/workflows/ci.yml > > and the Boost build farm from the days before free CI for OSS was easily available, > > https://www.boost.org/development/tests/master/developer/move.html > > Not all compilers/platforms are fully compliant, of course. Boost uses workarounds to combat that and submits bug reports on the compiler bug trackers. > > > - how easy is it to install the library. > > As Nicholas mentioned, Boost.Math (and Boost.Histogram) is header-only, so it is sufficient to include the headers. > > Best regards, > Hans > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -- Those who don't understand recursion are doomed to repeat it From hans.dembinski at gmail.com Mon Feb 15 07:35:25 2021 From: hans.dembinski at gmail.com (Hans Dembinski) Date: Mon, 15 Feb 2021 13:35:25 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1@gmail.com> > On 15. 
Feb 2021, at 01:47, Warren Weckesser wrote: > > * The Boost histogram library might provide some benefits over the > existing NumPy and SciPy options. (Hans Dembinski, the author > of the histrogram library, has already commented in this email > thread.) I would happily support this. We currently offer a Python front-end to Boost.Histogram https://github.com/scikit-hep/boost-histogram which includes a numpy.histogram compatible interface. Switching to Boost.Histogram may offer performance benefits, see https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html Compared to np.histogram we saw a 1.7 times increase - single threaded, more if multiple threads are used. Compared to np.histogram2d we saw a 11 times increase. These numbers should probably be checked more carefully before decisions are made. Boost.Histogram offers generalised histograms with arbitrary accumulators per cell, so it could also replace the implementations of https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html and friends. Best regards, Hans From ralf.gommers at gmail.com Mon Feb 15 07:41:51 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 15 Feb 2021 13:41:51 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1@gmail.com> References: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1@gmail.com> Message-ID: On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski wrote: > > > On 15. Feb 2021, at 01:47, Warren Weckesser > wrote: > > > > * The Boost histogram library might provide some benefits over the > > existing NumPy and SciPy options. (Hans Dembinski, the author > > of the histrogram library, has already commented in this email > > thread.) > > I would happily support this. We currently offer a Python front-end to > Boost.Histogram > https://github.com/scikit-hep/boost-histogram > which includes a numpy.histogram compatible interface. 
> > Switching to Boost.Histogram may offer performance benefits, see > > https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html > > Compared to np.histogram we saw a 1.7 times increase - single threaded, > more if multiple threads are used. Compared to np.histogram2d we saw a 11 > times increase. These numbers should probably be checked more carefully > before decisions are made. > > Boost.Histogram offers generalised histograms with arbitrary accumulators > per cell, so it could also replace the implementations of > https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html > and friends. > That would be really nice. binned_statistic is currently pure Python, and can be a performance hotspot (I've seen multiple cases of that in dealing with image and geospatial data). Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Mon Feb 15 07:47:48 2021 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 15 Feb 2021 07:47:48 -0500 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1@gmail.com> Message-ID: One thing I've missed with the current scipy histogram is the ability to do 'online' or 'incremental' collection of the histogram data. For this reason I have written my own histogram code. I am often collecting data from monte-carlo simulations and want to accumulate stats from data that arrives in batches. I don't know if boost-histogram supports this but if so I would find this very welcome. On Mon, Feb 15, 2021 at 7:42 AM Ralf Gommers wrote: > > > > On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski wrote: >> >> >> > On 15. Feb 2021, at 01:47, Warren Weckesser wrote: >> > >> > * The Boost histogram library might provide some benefits over the >> > existing NumPy and SciPy options. (Hans Dembinski, the author >> > of the histrogram library, has already commented in this email >> > thread.) 
>> >> I would happily support this. We currently offer a Python front-end to Boost.Histogram >> https://github.com/scikit-hep/boost-histogram >> which includes a numpy.histogram compatible interface. >> >> Switching to Boost.Histogram may offer performance benefits, see >> https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html >> >> Compared to np.histogram we saw a 1.7 times increase - single threaded, more if multiple threads are used. Compared to np.histogram2d we saw a 11 times increase. These numbers should probably be checked more carefully before decisions are made. >> >> Boost.Histogram offers generalised histograms with arbitrary accumulators per cell, so it could also replace the implementations of https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html and friends. > > > That would be really nice. binned_statistic is currently pure Python, and can be a performance hotspot (I've seen multiple cases of that in dealing with image and geospatial data). > > Cheers, > Ralf > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -- Those who don't understand recursion are doomed to repeat it From hans.dembinski at gmail.com Mon Feb 15 08:03:54 2021 From: hans.dembinski at gmail.com (Hans Dembinski) Date: Mon, 15 Feb 2021 14:03:54 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1@gmail.com> Message-ID: <2290861A-F5A6-46A1-8397-213EA0AD5F4E@gmail.com> > On 15. Feb 2021, at 13:47, Neal Becker wrote: > > One thing I've missed with the current scipy histogram is the ability > to do 'online' or 'incremental' collection of the histogram data. For > this reason I have written my own histogram code. I am often > collecting data from monte-carlo simulations and want to accumulate > stats from data that arrives in batches. 
> I don't know if boost-histogram supports this but if so I would find > this very welcome. I think the answer is yes, if I understood you correctly. Boost.Histogram has an object-oriented design: the histogram is an object that one can fill incrementally with input arrays. I personally like the functional paradigm behind np.histogram and friends, but it is not as efficient for incremental collection. When numpy.histogram is used, one has to generate a temporary array with the intermediate results, which are then added to the main array. The object-oriented approach avoids this. In my field (high energy and astroparticle physics), incremental filling is also the default. We typically have large amounts of data that we want to convert into histograms, so the codes typically fill some histograms incrementally. The performance issue of numpy.histogram could also be fixed by adding an "out" keyword to numpy.histogram, to allow the user to pass the array which is filled. Best regards, Hans From roy.pamphile at gmail.com Mon Feb 15 12:24:24 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Mon, 15 Feb 2021 18:24:24 +0100 Subject: [SciPy-Dev] GSoC'21 participation SciPy Message-ID: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Hi, Thank you for putting this together! I would have some ideas for the idea pool :) scipy.optimize: Would it be wanted to have a possibility to have workers to evaluate the function during an optimization? In most industrial context, the function is not trivial and might require minutes if not hours or even days to compute. Having a simple way to first parallelise the runs would help. We have machines with easily ten cores now and it would be great to leverage this here. Going that direction, having a more general infrastructure to handle external workers would be great. Sure there are external packages to do this, but then it's not so trivial if you want to use SciPy's optimizers.
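The `workers` idea raised here can be sketched in a few lines: the keyword accepts either a worker count or any map-like callable, so the caller stays in charge of how evaluations are dispatched. This is a hypothetical stdlib-only illustration — the name `evaluate_population` and the thread-pool backend are assumptions, not SciPy's actual implementation:

```python
from multiprocessing.pool import ThreadPool

def evaluate_population(func, population, workers=1):
    """Evaluate ``func`` over a batch of candidate points.

    ``workers`` may be 1 (serial), an int (size of a built-in thread
    pool, with -1 meaning "use all cores"), or any map-like callable.
    """
    if callable(workers):
        # User-supplied map-like, e.g. a cluster scheduler's map.
        return list(workers(func, population))
    if workers == 1:
        # Serial fallback.
        return [func(x) for x in population]
    # Simple built-in parallelism; None lets the pool pick cpu_count().
    with ThreadPool(None if workers == -1 else workers) as pool:
        return pool.map(func, population)

def expensive(x):
    # Stand-in for a costly model run.
    return (x - 3.0) ** 2

print(evaluate_population(expensive, [1.0, 2.0, 3.0], workers=2))
# prints [4.0, 1.0, 0.0]
```

A `multiprocessing.Pool(...).map` or a cluster scheduler's map function plugs in through the same `workers` argument, with no change on the caller's side.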
scipy.optimize: What about another optimization method such as EGO? This would require a Gaussian process regressor. scipy.stats: there is an ANOVA section in the roadmap. But is sensitivity analysis in general something that would be of interest? I am thinking about Sobol' indices (not related to the Sobol' sequence but from the same author), moment-based indices, Shapley values, cusunoro, etc. scipy.metamodel: last but not least, a metamodel/response surface module. This is linked to the optimization or sensitivity analysis of expensive functions. It would be sufficient to have Gaussian process and polynomial chaos expansion. Could also include more general things like linear regression or other things in scipy.interpolate. Cheers, Pamphile @tupui -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Feb 15 14:01:44 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 15 Feb 2021 20:01:44 +0100 Subject: [SciPy-Dev] GSoC'21 participation SciPy In-Reply-To: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> References: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Message-ID: On Mon, Feb 15, 2021 at 6:24 PM Pamphile Roy wrote: > Hi, > > Thank you for putting this together! > > I would have some ideas for the idea pool :) > Thanks Pamphile! > *scipy.optimize:* Would it be wanted to have a possibility to have > workers to evaluate the function during an optimization? > In most industrial context, the function is not trivial and might require > minutes if not hours or even days to compute. > Having a simple way to first parallelise the runs would help. We have > machines with easily ten cores now and it would be great to leverage this > here. > Definitely - see the mention of workers under http://scipy.github.io/devdocs/roadmap.html#performance-improvements. Going that direction, having a more general infrastructure to handle > external workers would be great.
> I'm assuming you mean something like standard multiprocessing, or using a custom Pool object, for code that's trivially parallelizable. Both are covered by the `workers` pattern. If you're thinking about something else, can you elaborate? Sure there are external packages to do this, but then it's not so trivial > if you want to use SciPy's optimizers. > > *scipy.optimize:* What about another optimization method such as EGO? > This would require a Gaussian process regressor. > In general we'd like to continue adding high-quality optimization methods if they bring something extra - see https://mail.python.org/pipermail/scipy-dev/2021-January/024489.html. Not sure about EGO in particular (I'm not familiar with it), Gaussian processes sound a little out of scope - that's scikit-learn territory probably. > *scipy.stats:* there is an ANOVA section in the roadmap. But is > sensitivity analysis in general something that would be of interest? > I am thinking about Sobol' indices (not related to the Sobol' sequence but > from the same author), moment-based indices, Shapley values, cusunoro, etc. > I'm not 100% sure, let's see if someone more familiar with this topic has an opinion. In general for new stats functionality we try to figure out if it fits better in scipy.stats or in statsmodels. The latter doesn't have much either right now, only: https://www.statsmodels.org/stable/generated/statsmodels.genmod.generalized_estimating_equations.GEEResults.sensitivity_params.html > *scipy.metamodel:* last but not least, a metamodel/response surface > module. This is linked to the optimization or sensitivity analysis of > expensive > functions. It would be sufficient to have Gaussian process and polynomial > chaos expansion. Could also include more general things like linear > regression or > other things in scipy.interpolate. > That is out of scope I'd say, too specific for a new submodule - at the very least it should start as a separate package first.
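For the "something else" case — evaluations handed off to external workers rather than an in-process pool — a plain stdlib queue is often enough, with any scheduler sitting on the other side. A hypothetical stdlib-only sketch, not an actual SciPy interface (threads stand in for remote workers, and the quadratic stands in for an expensive model):

```python
import queue
import threading

def worker(jobs, results):
    # Drain jobs until a None sentinel tells this worker to shut down.
    while True:
        item = jobs.get()
        if item is None:
            break
        i, x = item
        # Stand-in for an expensive model evaluation.
        results.put((i, (x - 3.0) ** 2))

jobs, results = queue.Queue(), queue.Queue()
threads = [threading.Thread(target=worker, args=(jobs, results))
           for _ in range(2)]
for t in threads:
    t.start()
for i, x in enumerate([1.0, 2.0, 3.0]):
    jobs.put((i, x))        # an optimizer would enqueue candidate points
for _ in threads:
    jobs.put(None)          # one sentinel per worker
for t in threads:
    t.join()

values = [None] * 3
while not results.empty():  # reassemble results in submission order
    i, y = results.get()
    values[i] = y
print(values)               # prints [4.0, 1.0, 0.0]
```

The same queue could be drained by processes on another machine instead of threads, which is why nothing heavier than the standard library is needed on the SciPy side.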
Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From roy.pamphile at gmail.com Mon Feb 15 15:51:37 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Mon, 15 Feb 2021 21:51:37 +0100 Subject: [SciPy-Dev] GSoC'21 participation SciPy Message-ID: <91E68BFC-5DEA-4BBC-9BE3-A9CC44D00907@gmail.com> > I'm assuming you mean something like standard multiprocessing, or using a > custom Pool object, for code that's trivially parallelizable. Both are > covered by the `workers` pattern. If you're thinking about something else, > can you elaborate? Yes the first step would be to do some simple multiprocessing. But IMO we should still try to have something flexible enough so that we (or the user) could plug in something else, like a job scheduler. I would not see any scheduling per se in SciPy (totally out of scope and too many options: bash, slurm, kubernetes world, etc.), but being able to access a queue would be nice. This way you could wrap anything to go take a job from a queue and return a result. This would need some experiments, but I think that we could achieve something like this just with some simple queue from the std lib. Because I am not sure we would want to introduce, even optionally, a dependency on something like RabbitMQ. (Or maybe?). > Not sure about EGO in particular (I'm not familiar with it), Gaussian > processes sound a little out of scope - that's scikit-learn territory > probably. True. I am using scikit-optimize for that currently. It just seemed EGO was getting more and more traction, and so I thought that maybe having this in SciPy could be justified. But definitely, other famous algorithms from the list you posted would be good candidates. > I'm not 100% sure, let's see if someone more familiar with this topic has > an opinion. In general for new stats functionality we try to figure out if > it fits better in scipy.stats or in statsmodels.
The latter doesn't have > much either right now, only: > https://www.statsmodels.org/stable/generated/statsmodels.genmod.generalized_estimating_equations.GEEResults.sensitivity_params.html I posted the suggestion to the statsmodels mailing list several times but so far did not get a clear answer. Hopefully we will have more luck here :) Agreed for the response surface part. It would be a big undertaking. Cheers, Pamphile @tupui -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Feb 15 16:19:08 2021 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 15 Feb 2021 16:19:08 -0500 Subject: [SciPy-Dev] GSoC'21 participation SciPy In-Reply-To: References: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Message-ID: On Mon, Feb 15, 2021 at 2:02 PM Ralf Gommers wrote: > > On Mon, Feb 15, 2021 at 6:24 PM Pamphile Roy > wrote: > >> *scipy.optimize:* Would it be wanted to have a possibility to have >> workers to evaluate the function during an optimization? >> In most industrial context, the function is not trivial and might require >> minutes if not hours or even days to compute. >> Having a simple way to first parallelise the runs would help. We have >> machines with easily ten cores now and it would be great to leverage this >> here. >> > > Definitely - see the mention of workers under > http://scipy.github.io/devdocs/roadmap.html#performance-improvements. > > Going that direction, having a more general infrastructure to handle >> external workers would be great. >> > > I'm assuming you mean something like standard multiprocessing, or using a > custom Pool object, for code that's trivially parallelizable. Both are > covered by the `workers` pattern. If you're thinking about something else, > can you elaborate? > A standard approach for this is to organize the implementation of the optimization algorithms in what's usually called an "ask-tell" interface.
The minimize()-style interface is easy to implement from an ask-tell interface, but not vice-versa. Basically, you have the optimizer object expose two methods, ask(), which returns a next point to evaluate, and tell(), where you feed back the point and its evaluated function value. You're in charge of evaluating that function. This gives you a lot of flexibility in how to dispatch that function evaluation, and importantly, we don't have to commit to any dependencies! That's the user's job! scikit-optimize implements their optimizers in this style, for example. It's pretty common for optimizers that are geared towards expensive evaluations. https://scikit-optimize.github.io/stable/auto_examples/ask-and-tell.html I think it might be a well-scoped GSoC project to start re-implementing a chosen handful of the algorithms in scipy.optimize in such an interface. It could even be a trial run as an external package (even in scikit-optimize, if they're amenable). Then we can evaluate whether we want to adopt that framework inside scipy.optimize and make a roadmap for re-implementing all of the algorithms in that style. It will be a technical challenge to adapt the FORTRAN-implemented algorithms to such an interface. I will not be available to mentor such a project, but that's the general approach that I would recommend. I think it would be a valuable addition. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcasero at gmail.com Mon Feb 15 17:01:53 2021 From: rcasero at gmail.com (=?UTF-8?B?UmFtw7NuIENhc2VybyBDYcOxYXM=?=) Date: Mon, 15 Feb 2021 22:01:53 +0000 Subject: [SciPy-Dev] Pull request test cancelled Message-ID: Hi, i submitted a pull request to speed up hdquantile_sd ( https://github.com/scipy/scipy/pull/13566). I think all tests passed, except one that was cancelled, but I don't think that has to do with my changes? azure-pipelines/ scipy.scipy Build log #L650 The operation was canceled. Kind regards, Ramon. 
-- Dr Ram?n Casero, CSci, Post-Doctoral Researcher MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit Harwell Campus, Oxfordshire, OX11 0RD, UK. Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilhanpolat at gmail.com Mon Feb 15 17:20:51 2021 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Mon, 15 Feb 2021 23:20:51 +0100 Subject: [SciPy-Dev] Pull request test cancelled In-Reply-To: References: Message-ID: After 60 minutes, the job times out. I've restarted it. On Mon, Feb 15, 2021 at 11:02 PM Ram?n Casero Ca?as wrote: > Hi, > > i submitted a pull request to speed up hdquantile_sd ( > https://github.com/scipy/scipy/pull/13566). > > I think all tests passed, except one that was cancelled, but I don't think > that has to do with my changes? > > azure-pipelines/ scipy.scipy > > Build log #L650 > > The operation was canceled. > > > Kind regards, > > Ramon. > > -- > Dr Ram?n Casero, CSci, Post-Doctoral Researcher > > > MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit > Harwell Campus, Oxfordshire, OX11 0RD, UK. > Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From grlee77 at gmail.com Mon Feb 15 17:24:24 2021 From: grlee77 at gmail.com (Gregory Lee) Date: Mon, 15 Feb 2021 17:24:24 -0500 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: On Sun, Feb 14, 2021 at 7:47 PM Warren Weckesser wrote: > > > On Fri, Feb 12, 2021 at 4:50 PM Nicholas McKibben > wrote: > >> Hi all, >> >> Many stats distributions in SciPy have outstanding issues with difficult >> solutions in legacy code. We've been working on replacing existing >> statistical distributions with those found in Boost.Math. 
The initial >> implementation resolves almost a dozen issues for scipy.stats with >> potential for resolving several more in scipy.stats and scipy.special in >> future PRs. >> >> Initial PR: https://github.com/scipy/scipy/pull/1332 >> >> This PR includes the ability to easily add Boost functionality through >> generated ufuncs. >> >> Boost is a large library and would incur the cost of one of the following: >> - an additional dependency (e.g. boostinator >> https://github.com/mckib2/boostinator) that outsources the packaging of >> the Boost libraries >> - the inclusion of Boost within SciPy either as a "clone and own" or >> submodule >> >> The initial PR includes the zipped Boost headers only (~24MB zipped), but >> adding Boost as a submodule might be a more maintainable approach if >> changes to Boost need to be made in the future. >> >> Inclusion of the entire Boost library is a virtual necessity for the >> Boost.Math module. Manual attempts to strip away unnecessary files and bcp >> (Boost's utility to provide stripped-down installations) fail to create >> smaller sizes. The increase in size would be similar to the following: >> >> SciPy master repo ~177 MB >> Boost branch: ~221 MB >> >> Built: ~939 MB >> Built With Boost: ~1090 MB >> >> Wheel size should not be significantly impacted because Boost is used as >> a header-only library. >> >> I have no relationship with the Boost libraries other than as a user and >> bug reporter. I find them to be impressive and well-maintained with >> tremendous support from both industry and open source developers. SciPy >> would benefit from the efficient, well-tested and maintained >> implementations of stats and special algorithms. >> >> Thanks, >> Nicholas >> > > > Thanks Nick, this is a long overdue enhancement for SciPy. > > For years, we've been fixing bugs in `stats` and `special` for > functions that have high quality, thoroughly tested and license-compatible implementations in Boost.
Indeed, we often check our > results against Boost. I think it will be worth the effort of > working out the interface issues and the packaging issues to allow > SciPy to take advantage of the excellent code in Boost. > > The pull request focuses on using Boost functions to improve the > implementations of some of probability distributions in `stats`, > but in the long term, we should take advantage of Boost as much as > we can. The next obvious module is `special`, where it looks like we > can close a bunch of issues by replacing the existing implementation > of a function with the Boost version (e.g. `erfinv` [gh-12758], > `betaincinv` [gh-12796], `jn_zeros` [gh-4690]). > > There are many other parts of Boost that would be great to have > available in SciPy. Here are few (based on a browse through the > Boost docs; I don't know if wrapping any of these would have > insurmountable technical obstacles): > > * ODE solvers; in particular, Boost has symplectic solvers, which > are not currently available in SciPy (gh-12690). > * Interpolators: it looks like Boost has a few interpolators > that are not currently in SciPy. > * The Boost histogram library might provide some benefits over the > existing NumPy and SciPy options. (Hans Dembinski, the author > of the histrogram library, has already commented in this email > thread.) > > Warren > > I think the Boost Graph Library is also of substantial interest and could be used in SciPy. We would likely use max-flow algorithms downstream in scikit-image for things like image segmentation and phase unwrapping if they were made available. Discussion in https://github.com/scikit-image/scikit-image/issues/4832 seems to indicate that the scipy.sparse.csgraph.maximum_flow algorithm is fairly slow, and a faster replacement would be welcome. 
- Greg > > _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcasero at gmail.com Mon Feb 15 18:38:19 2021 From: rcasero at gmail.com (=?UTF-8?B?UmFtw7NuIENhc2VybyBDYcOxYXM=?=) Date: Mon, 15 Feb 2021 23:38:19 +0000 Subject: [SciPy-Dev] Pull request test cancelled In-Reply-To: References: Message-ID: Thanks, checks have passed now. Ramon. On Mon, 15 Feb 2021 at 22:21, Ilhan Polat wrote: > After 60 minutes, the job times out. I've restarted it. > > On Mon, Feb 15, 2021 at 11:02 PM Ram?n Casero Ca?as > wrote: > >> Hi, >> >> i submitted a pull request to speed up hdquantile_sd ( >> https://github.com/scipy/scipy/pull/13566). >> >> I think all tests passed, except one that was cancelled, but I don't >> think that has to do with my changes? >> >> azure-pipelines/ scipy.scipy >> >> Build log #L650 >> >> The operation was canceled. >> >> >> Kind regards, >> >> Ramon. >> >> -- >> Dr Ram?n Casero, CSci, Post-Doctoral Researcher >> >> >> MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit >> Harwell Campus, Oxfordshire, OX11 0RD, UK. >> Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -- Dr Ram?n Casero, CSci, Post-Doctoral Researcher MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit Harwell Campus, Oxfordshire, OX11 0RD, UK. 
Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Feb 16 09:02:06 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 16 Feb 2021 15:02:06 +0100 Subject: [SciPy-Dev] GSoC'21 participation SciPy In-Reply-To: References: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Message-ID: On Mon, Feb 15, 2021 at 10:19 PM Robert Kern wrote: > On Mon, Feb 15, 2021 at 2:02 PM Ralf Gommers > wrote: > >> >> On Mon, Feb 15, 2021 at 6:24 PM Pamphile Roy >> wrote: >> >>> *scipy.optimize:* Would it be wanted to have a possibility to have >>> workers to evaluate the function during an optimization? >>> In most industrial context, the function is not trivial and might >>> require minutes if not hours or even days to compute. >>> Having a simple way to first parallelise the runs would help. We have >>> machines with easily ten cores now and it would be great to leverage this >>> here. >>> >> >> Definitely - see the mention of workers under >> http://scipy.github.io/devdocs/roadmap.html#performance-improvements. >> >> Going that direction, having a more general infrastructure to handle >>> external workers would be great. >>> >> >> I'm assuming you mean something like standard multiprocessing, or using a >> custom Pool object, for code that's trivially parallelizable. Both are >> covered by the `workers` pattern. If you're thinking about something else, >> can you elaborate? >> > > A standard approach for this is to organize the implementation of the > optimization algorithms in what's usually called an "ask-tell" interface. > The minimize()-style interface is easy to implement from an ask-tell > interface, but not vice-versa. Basically, you have the optimizer object > expose two methods, ask(), which returns a next point to evaluate, and > tell(), where you feed back the point and its evaluated function value. > You're in charge of evaluating that function. 
This gives you a lot of > flexibility in how to dispatch that function evaluation, and importantly, > we don't have to commit to any dependencies! That's the user's job! > > scikit-optimize implements their optimizers in this style, for example. > It's pretty common for optimizers that are geared towards expensive > evaluations. > > https://scikit-optimize.github.io/stable/auto_examples/ask-and-tell.html > > I think it might be a well-scoped GSoC project to start re-implementing a > chosen handful of the algorithms in scipy.optimize in such an interface. It > could even be a trial run as an external package (even in scikit-optimize, > if they're amenable). Then we can evaluate whether we want to adopt that > framework inside scipy.optimize and make a roadmap for re-implementing all > of the algorithms in that style. It will be a technical challenge to adapt > the FORTRAN-implemented algorithms to such an interface. > > I will not be available to mentor such a project, but that's the general > approach that I would recommend. I think it would be a valuable addition. > Thanks Robert, that seems like an interesting exercise. This reminds me of the "class based optimizers" proposal. That didn't mention ask-tell, but the "reverse-communication" may be the same idea: https://github.com/scipy/scipy/pull/8552 https://mail.python.org/pipermail/scipy-dev/2018-February/022449.html Your comments and the scikit-optimize link are imho a better justification for doing this exercise than we had before. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From warren.weckesser at gmail.com Tue Feb 16 12:01:57 2021 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Tue, 16 Feb 2021 12:01:57 -0500 Subject: [SciPy-Dev] New probability distributions in stats: the noncentral hypergeometric distributions Message-ID: Hey all, Matt Haberland's pull request with implementations of the Fisher and Wallenius noncentral hypergeometric distributions (built on Nicholas McKibben's wrappers of Agner Fog's biasedurn code) looks ready to merge: https://github.com/scipy/scipy/pull/13330. It would be nice to get some more eyes on it. The last issue that we've been discussing is the names of the distributions. The current names, `fnch` and `wnch`, are quite concise, and some longer alternatives are being considered in the PR. Comments on that and on the PR in general would be helpful. Thanks! Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcasero at gmail.com Tue Feb 16 12:48:51 2021 From: rcasero at gmail.com (=?UTF-8?B?UmFtw7NuIENhc2VybyBDYcOxYXM=?=) Date: Tue, 16 Feb 2021 17:48:51 +0000 Subject: [SciPy-Dev] Pull request test cancelled In-Reply-To: References: Message-ID: Good evening, The scipy.scipy test timed out again after 60 min for https://github.com/scipy/scipy/pull/13566. Could somebody restart it, please? Kind regards, Ramon. On Mon, 15 Feb 2021 at 22:00, Ram?n Casero Ca?as wrote: > Hi, > > i submitted a pull request to speed up hdquantile_sd ( > https://github.com/scipy/scipy/pull/13566). > > I think all tests passed, except one that was cancelled, but I don't think > that has to do with my changes? > > azure-pipelines/ scipy.scipy > > Build log #L650 > > The operation was canceled. > > > Kind regards, > > Ramon. > > -- > Dr Ram?n Casero, CSci, Post-Doctoral Researcher > > > MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit > Harwell Campus, Oxfordshire, OX11 0RD, UK. 
> Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk > -- Dr Ramón Casero, CSci, Post-Doctoral Researcher MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit Harwell Campus, Oxfordshire, OX11 0RD, UK. Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcasero at gmail.com Tue Feb 16 13:55:20 2021 From: rcasero at gmail.com (=?UTF-8?B?UmFtw7NuIENhc2VybyBDYcOxYXM=?=) Date: Tue, 16 Feb 2021 18:55:20 +0000 Subject: [SciPy-Dev] Pull request test cancelled In-Reply-To: References: Message-ID: Thanks to whoever did it. Now, all tests have passed. Kind regards, Ramon. On Tue, 16 Feb 2021 at 17:48, Ramón Casero Cañas wrote: > Good evening, > > The scipy.scipy test timed out again after 60 min for > https://github.com/scipy/scipy/pull/13566. Could somebody restart it, > please? > > Kind regards, > > Ramon. > > > On Mon, 15 Feb 2021 at 22:00, Ramón Casero Cañas > wrote: > >> Hi, >> >> I submitted a pull request to speed up hdquantile_sd ( >> https://github.com/scipy/scipy/pull/13566). >> >> I think all tests passed, except one that was cancelled, but I don't >> think that has to do with my changes. >> >> azure-pipelines/ scipy.scipy >> >> Build log #L650 >> >> The operation was canceled. >> >> >> Kind regards, >> >> Ramon. >> >> -- >> Dr Ramón Casero, CSci, Post-Doctoral Researcher >> >> >> MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit >> Harwell Campus, Oxfordshire, OX11 0RD, UK. >> Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk >> > > > -- > Dr Ramón Casero, CSci, Post-Doctoral Researcher > > > MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit > Harwell Campus, Oxfordshire, OX11 0RD, UK.
Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Wed Feb 17 13:58:50 2021 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Wed, 17 Feb 2021 13:58:50 -0500 Subject: [SciPy-Dev] Implementation of the Alexander-Govern test Message-ID: Hey all, The pull request https://github.com/scipy/scipy/pull/12873 to add the Alexander-Govern test (https://www.jstor.org/stable/1165140), a test currently in the detailed roadmap for the stats module, looks ready to merge. I'd like to merge on Friday, so be sure to comment in the PR if you have any issues with the implementation. Thanks! Warren From samwallan at icloud.com Wed Feb 17 21:09:22 2021 From: samwallan at icloud.com (Sam Wallan) Date: Wed, 17 Feb 2021 18:09:22 -0800 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Hello, I've been working on a spreadsheet that compares Boost and SciPy. I looked at statistical distributions, special functions, and ODE solvers. Here's the google sheets link: https://docs.google.com/spreadsheets/d/1zVaau6k1_0yQNW107D81RVCirWEN8sXwcYaWj2g8UNY/edit?usp=sharing I've left it on suggestion mode with that sharing link, so if anyone has any thoughts please feel free to leave a comment. It looks like Boost may have a lot to add! Regards, Sam > On Feb 15, 2021, at 4:48 AM, scipy-dev-request at python.org wrote: > > Send SciPy-Dev mailing list submissions to > scipy-dev at python.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://mail.python.org/mailman/listinfo/scipy-dev > or, via email, send a message with subject or body 'help' to > scipy-dev-request at python.org > > You can reach the person managing the list at > scipy-dev-owner at python.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of SciPy-Dev digest..." > > > Today's Topics: > > 1. Re: Boost for stats (Neal Becker) > 2.
Re: Boost for stats (Hans Dembinski) > 3. Re: Boost for stats (Ralf Gommers) > 4. Re: Boost for stats (Neal Becker) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Mon, 15 Feb 2021 07:23:38 -0500 > From: Neal Becker > To: SciPy Developers List > Subject: Re: [SciPy-Dev] Boost for stats > Message-ID: > > Content-Type: text/plain; charset="UTF-8" > > I have been using (and its predecessor before it, > boost::python) to package c++ code for python use for many years, > including some of boost libraries. > pybind11 is easy to use and is much better than e.g., cython for > packaging c++ code. pybind11 is also header-only. > > I would also like to call attention for anyone interested in > scientific software and c++ to a wonderful library (header-only), > xtensor > https://xtensor.readthedocs.io/en/latest/ > > On Mon, Feb 15, 2021 at 7:15 AM Hans Dembinski wrote: >> >> >>> On 15. Feb 2021, at 08:26, Andrew Nelson wrote: >>> >>> My questions would be: >>> >>> - how portable is the boost code in general? >> >> It is very portable. The core goal of Boost is to offer implementations with quality and portability on par with the C++ standard library implementations. Non-portable extensions are sometimes used to speed up things, but there is always a standard compliant vanilla version. In practice, maintainers test portability with CI on Windows, OSX, Linux, using various versions of gcc, clang, msvc, intel, see e.g. >> >> https://github.com/boostorg/math/blob/develop/.github/workflows/ci.yml >> >> and the Boost build farm from the days before free CI for OSS was easily available, >> >> https://www.boost.org/development/tests/master/developer/move.html >> >> Not all compilers/platforms are fully compliant, of course. Boost uses workarounds to combat that and submits bug reports on the compiler bug trackers. >> >>> - how easy is it to install the library.
>> >> As Nicholas mentioned, Boost.Math (and Boost.Histogram) is header-only, so it is sufficient to include the headers. >> >> Best regards, >> Hans >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev > > > > -- > Those who don't understand recursion are doomed to repeat it > > > ------------------------------ > > Message: 2 > Date: Mon, 15 Feb 2021 13:35:25 +0100 > From: Hans Dembinski > To: SciPy Developers List > Subject: Re: [SciPy-Dev] Boost for stats > Message-ID: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1 at gmail.com> > Content-Type: text/plain; charset=us-ascii > > >> On 15. Feb 2021, at 01:47, Warren Weckesser wrote: >> >> * The Boost histogram library might provide some benefits over the >> existing NumPy and SciPy options. (Hans Dembinski, the author >> of the histogram library, has already commented in this email >> thread.) > > I would happily support this. We currently offer a Python front-end to Boost.Histogram > https://github.com/scikit-hep/boost-histogram > which includes a numpy.histogram compatible interface. > > Switching to Boost.Histogram may offer performance benefits, see > https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html > > Compared to np.histogram we saw a 1.7 times increase - single threaded, more if multiple threads are used. Compared to np.histogram2d we saw an 11 times increase. These numbers should probably be checked more carefully before decisions are made. > > Boost.Histogram offers generalised histograms with arbitrary accumulators per cell, so it could also replace the implementations of https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html and friends.
> > Best regards, > Hans > > ------------------------------ > > Message: 3 > Date: Mon, 15 Feb 2021 13:41:51 +0100 > From: Ralf Gommers > To: SciPy Developers List > Subject: Re: [SciPy-Dev] Boost for stats > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski > wrote: > >> >>> On 15. Feb 2021, at 01:47, Warren Weckesser >> wrote: >>> >>> * The Boost histogram library might provide some benefits over the >>> existing NumPy and SciPy options. (Hans Dembinski, the author >>> of the histrogram library, has already commented in this email >>> thread.) >> >> I would happily support this. We currently offer a Python front-end to >> Boost.Histogram >> https://github.com/scikit-hep/boost-histogram >> which includes a numpy.histogram compatible interface. >> >> Switching to Boost.Histogram may offer performance benefits, see >> >> https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html >> >> Compared to np.histogram we saw a 1.7 times increase - single threaded, >> more if multiple threads are used. Compared to np.histogram2d we saw a 11 >> times increase. These numbers should probably be checked more carefully >> before decisions are made. >> >> Boost.Histogram offers generalised histograms with arbitrary accumulators >> per cell, so it could also replace the implementations of >> https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html >> and friends. >> > > That would be really nice. binned_statistic is currently pure Python, and > can be a performance hotspot (I've seen multiple cases of that in dealing > with image and geospatial data). > > Cheers, > Ralf > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: > > ------------------------------ > > Message: 4 > Date: Mon, 15 Feb 2021 07:47:48 -0500 > From: Neal Becker > To: SciPy Developers List > Subject: Re: [SciPy-Dev] Boost for stats > Message-ID: > > Content-Type: text/plain; charset="UTF-8" > > One thing I've missed with the current scipy histogram is the ability > to do 'online' or 'incremental' collection of the histogram data. For > this reason I have written my own histogram code. I am often > collecting data from monte-carlo simulations and want to accumulate > stats from data that arrives in batches. > I don't know if boost-histogram supports this but if so I would find > this very welcome. > > On Mon, Feb 15, 2021 at 7:42 AM Ralf Gommers wrote: >> >> >> >> On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski wrote: >>> >>> >>>> On 15. Feb 2021, at 01:47, Warren Weckesser wrote: >>>> >>>> * The Boost histogram library might provide some benefits over the >>>> existing NumPy and SciPy options. (Hans Dembinski, the author >>>> of the histrogram library, has already commented in this email >>>> thread.) >>> >>> I would happily support this. We currently offer a Python front-end to Boost.Histogram >>> https://github.com/scikit-hep/boost-histogram >>> which includes a numpy.histogram compatible interface. >>> >>> Switching to Boost.Histogram may offer performance benefits, see >>> https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html >>> >>> Compared to np.histogram we saw a 1.7 times increase - single threaded, more if multiple threads are used. Compared to np.histogram2d we saw a 11 times increase. These numbers should probably be checked more carefully before decisions are made. >>> >>> Boost.Histogram offers generalised histograms with arbitrary accumulators per cell, so it could also replace the implementations of https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html and friends. >> >> >> That would be really nice. 
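On Neal's "online"/"incremental" accumulation point above: with fixed bin edges this can already be emulated in plain NumPy by summing per-batch counts (and boost-histogram's fill(), to my understanding, can be called repeatedly to the same effect). A minimal sketch:

```python
import numpy as np

# Incremental ("online") histogramming: fix the bin edges up front,
# then accumulate counts batch by batch as data arrives.
edges = np.linspace(0.0, 1.0, 11)               # 10 bins on [0, 1]
counts = np.zeros(len(edges) - 1, dtype=np.int64)

rng = np.random.default_rng(42)
batches = [rng.random(1000) for _ in range(5)]  # e.g. Monte Carlo batches
for batch in batches:
    c, _ = np.histogram(batch, bins=edges)
    counts += c

# The accumulated result matches histogramming all the data in one shot.
one_shot, _ = np.histogram(np.concatenate(batches), bins=edges)
assert np.array_equal(counts, one_shot)
```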
binned_statistic is currently pure Python, and can be a performance hotspot (I've seen multiple cases of that in dealing with image and geospatial data). >> >> Cheers, >> Ralf >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev > > > > -- > Those who don't understand recursion are doomed to repeat it > > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > > > ------------------------------ > > End of SciPy-Dev Digest, Vol 208, Issue 13 > ****************************************** From robert.kern at gmail.com Wed Feb 17 21:39:00 2021 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Feb 2021 21:39:00 -0500 Subject: [SciPy-Dev] GSoC'21 participation SciPy In-Reply-To: References: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Message-ID: On Tue, Feb 16, 2021 at 9:03 AM Ralf Gommers wrote: > > On Mon, Feb 15, 2021 at 10:19 PM Robert Kern > wrote: > >> On Mon, Feb 15, 2021 at 2:02 PM Ralf Gommers >> wrote: >> >>> >>> On Mon, Feb 15, 2021 at 6:24 PM Pamphile Roy >>> wrote: >>> >>>> *scipy.optimize:* Would it be wanted to have a possibility to have >>>> workers to evaluate the function during an optimization? >>>> In most industrial context, the function is not trivial and might >>>> require minutes if not hours or even days to compute. >>>> Having a simple way to first parallelise the runs would help. We have >>>> machines with easily ten cores now and it would be great to leverage this >>>> here. >>>> >>> >>> Definitely - see the mention of workers under >>> http://scipy.github.io/devdocs/roadmap.html#performance-improvements. >>> >>> Going that direction, having a more general infrastructure to handle >>>> external workers would be great. 
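For the parallel-evaluation request above, a rough sketch of a `workers`-style hook (the helper name here is made up for illustration): the caller passes either a map-like callable or -1, and the optimizer only ever calls that hook to evaluate batches of points. Some scipy.optimize functions, such as differential_evolution, already accept a `workers` keyword along these lines.

```python
from multiprocessing.dummy import Pool  # stdlib thread pool

def evaluate_population(func, points, workers=map):
    # `workers` is either a map-like callable (e.g. some Pool's .map)
    # or -1, meaning "use a default pool".
    if workers == -1:
        with Pool() as pool:
            return list(pool.map(func, points))
    return list(workers(func, points))

def cost(x):
    return (x - 3) ** 2

print(evaluate_population(cost, [1, 2, 3, 4], workers=-1))  # -> [4, 1, 0, 1]
```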
>>>> >>> >>> I'm assuming you mean something like standard multiprocessing, or using >>> a custom Pool object, for code that's trivially parallelizable. Both are >>> covered by the `workers` pattern. If you're thinking about something else, >>> can you elaborate? >>> >> >> A standard approach for this is to organize the implementation of the >> optimization algorithms in what's usually called an "ask-tell" interface. >> The minimize()-style interface is easy to implement from an ask-tell >> interface, but not vice-versa. Basically, you have the optimizer object >> expose two methods, ask(), which returns a next point to evaluate, and >> tell(), where you feed back the point and its evaluated function value. >> You're in charge of evaluating that function. This gives you a lot of >> flexibility in how to dispatch that function evaluation, and importantly, >> we don't have to commit to any dependencies! That's the user's job! >> >> scikit-optimize implements their optimizers in this style, for example. >> It's pretty common for optimizers that are geared towards expensive >> evaluations. >> >> >> https://scikit-optimize.github.io/stable/auto_examples/ask-and-tell.html >> >> I think it might be a well-scoped GSoC project to start re-implementing a >> chosen handful of the algorithms in scipy.optimize in such an interface. It >> could even be a trial run as an external package (even in scikit-optimize, >> if they're amenable). Then we can evaluate whether we want to adopt that >> framework inside scipy.optimize and make a roadmap for re-implementing all >> of the algorithms in that style. It will be a technical challenge to adapt >> the FORTRAN-implemented algorithms to such an interface. >> >> I will not be available to mentor such a project, but that's the general >> approach that I would recommend. I think it would be a valuable addition. >> > > Thanks Robert, that seems like an interesting exercise. This reminds me of > the "class based optimizers" proposal. 
That didn't mention ask-tell, but > the "reverse-communication" may be the same idea: > https://github.com/scipy/scipy/pull/8552 > https://mail.python.org/pipermail/scipy-dev/2018-February/022449.html > > Your comments and the scikit-optimize link are imho a better justification > for doing this exercise than we had before. > Yes, "reverse communication" is a FORTRAN-era term for the same general idea. In FORTRAN reverse communication APIs, you would generally call the one optimizer subroutine over and over again, passing in the current state and function evaluation and reading the next point to evaluate (and other state) from "intent out" variables. "ask-tell" is a somewhat more specific instance of that idea, just in an OO context for which it's just a fairly obvious design pattern once you have chosen to go OO and free yourself from the constraints of a FORTRAN subroutine. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholas.bgp at gmail.com Wed Feb 17 21:44:53 2021 From: nicholas.bgp at gmail.com (Nicholas McKibben) Date: Wed, 17 Feb 2021 19:44:53 -0700 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Hi all, Responding to some comments I've seen fly by on this thread in no particular order: > However, playing devil's advocate somewhat: > - does the scipy PR need the whole Boost.Math? If it only needs a select subset (e.g., do we need root-finding etc?), then maybe the size can be reduced. As Hans mentioned, the Boost.Math depends on the whole of Boost, so not without a lot of pain of detangling code and losing the ability to easily bring in upstream updates. > - do we need the whole thing? e.g. ufunc loops only need a select subset of types. Virtually all Boost functions (and certainly the ones we're dealing with in the stats distributions) are templated. 
The ufunc generators I've written specialize the templates to create all the types we need for the ufuncs (single, double, and long double precision, specifically). float16 could be done in principle by unpacking to floats in the ufunc loop function, but no other distribution considers float16 so I didn't either. > - if we do go this route of taking parts / applying scipy specific patches, what is easier to do or better maintenance-wise: vendor original code + patches, or do the work once by porting relevant parts to standalone C or C++ subset? It sounds like the preferred option (taking the discussion here and in the PR) is to include Boost as a submodule (which precludes SciPy specific patches, incidentally) and track specific tagged commits or commits with bug fixes as necessary. The problem with porting to C is that we lose the typing extensibility and the easy pulling of upstream bug fixes. Existing C ports of Boost functions could (should?) be moved to use Boost-proper to reduce maintenance burden. > pybind11 The only reason I didn't consider pybind11 is that I've never used it before and could get it done with Cython. The only troublesome C++ features I ran into were non-type template parameters, but there are easy workarounds for this. If anyone would like to patch my PR to use pybind11, please do! > Probably good to make sure that aarch64 build times remain relatively stable Good point! Can this be checked via a PR to scipy-wheels? Best, Nicholas On Wed, Feb 17, 2021 at 7:09 PM Sam Wallan wrote: > Hello, > > I've been working on a spreadsheet that compares Boost and SciPy. I looked > at statistical distributions, special functions, and ODE solvers. Here's > the google sheets link: > > > https://docs.google.com/spreadsheets/d/1zVaau6k1_0yQNW107D81RVCirWEN8sXwcYaWj2g8UNY/edit?usp=sharing > > I've left it on suggestion mode with that sharing link, so if anyone has > any thoughts please feel free to leave a comment.
It looks like Boost may > have a lot to add! > > Regards, > > Sam > > > > > On Feb 15, 2021, at 4:48 AM, scipy-dev-request at python.org wrote: > > > > Send SciPy-Dev mailing list submissions to > > scipy-dev at python.org > > > > To subscribe or unsubscribe via the World Wide Web, visit > > https://mail.python.org/mailman/listinfo/scipy-dev > > or, via email, send a message with subject or body 'help' to > > scipy-dev-request at python.org > > > > You can reach the person managing the list at > > scipy-dev-owner at python.org > > > > When replying, please edit your Subject line so it is more specific > > than "Re: Contents of SciPy-Dev digest..." > > > > > > Today's Topics: > > > > 1. Re: Boost for stats (Neal Becker) > > 2. Re: Boost for stats (Hans Dembinski) > > 3. Re: Boost for stats (Ralf Gommers) > > 4. Re: Boost for stats (Neal Becker) > > > > > > ---------------------------------------------------------------------- > > > > Message: 1 > > Date: Mon, 15 Feb 2021 07:23:38 -0500 > > From: Neal Becker > > To: SciPy Developers List > > Subject: Re: [SciPy-Dev] Boost for stats > > Message-ID: > > Esvi1unvYO0+DNnv8RxGKryuTS+jBUg at mail.gmail.com> > > Content-Type: text/plain; charset="UTF-8" > > > > I have been using (and it's predecessor before it, > > boost::python) to package c++ code for python use for many years, > > including some of boost libraries. > > pybind11 is easy to use and is much better than e.g., cython for > > packaging c++ code. pybind11 is also header-only. > > > > I would also like to call attention for anyone interested in > > scientific software and c++ to a wonderful library (header-only), > > xtensor > > https://xtensor.readthedocs.io/en/latest/ > > > > On Mon, Feb 15, 2021 at 7:15 AM Hans Dembinski > wrote: > >> > >> > >>> On 15. Feb 2021, at 08:26, Andrew Nelson wrote: > >>> > >>> My questions would be: > >>> > >>> - how portable is the boost code in general? > >> > >> It is very portable. 
The core goal of Boost is to offer implementations > with quality and portability on par with the C++ standard library > implementations. Non-portable extensions are sometimes used to speed up > things, but there is always a standard compliant vanilla version. In > practice, maintainers test portability with CI on Windows, OSX, Linux, > using various versions of gcc, clang, msvc, intel, see e.g. > >> > >> https://github.com/boostorg/math/blob/develop/.github/workflows/ci.yml > >> > >> and the Boost build farm from the days before free CI for OSS was > easily available, > >> > >> https://www.boost.org/development/tests/master/developer/move.html > >> > >> Not all compilers/platforms are fully compliant, of course. Boost uses > workarounds to combat that and submits bug reports on the compiler bug > trackers. > >> > >>> - how easy is it to install the library. > >> > >> As Nicholas mentioned, Boost.Math (and Boost.Histogram) is header-only, > so it is sufficient to include the headers. > >> > >> Best regards, > >> Hans > >> _______________________________________________ > >> SciPy-Dev mailing list > >> SciPy-Dev at python.org > >> https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > > > -- > > Those who don't understand recursion are doomed to repeat it > > > > > > ------------------------------ > > > > Message: 2 > > Date: Mon, 15 Feb 2021 13:35:25 +0100 > > From: Hans Dembinski > > To: SciPy Developers List > > Subject: Re: [SciPy-Dev] Boost for stats > > Message-ID: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1 at gmail.com> > > Content-Type: text/plain; charset=us-ascii > > > > > >> On 15. Feb 2021, at 01:47, Warren Weckesser > wrote: > >> > >> * The Boost histogram library might provide some benefits over the > >> existing NumPy and SciPy options. (Hans Dembinski, the author > >> of the histrogram library, has already commented in this email > >> thread.) > > > > I would happily support this. 
We currently offer a Python front-end to > Boost.Histogram > > https://github.com/scikit-hep/boost-histogram > > which includes a numpy.histogram compatible interface. > > > > Switching to Boost.Histogram may offer performance benefits, see > > > https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html > > > > Compared to np.histogram we saw a 1.7 times increase - single threaded, > more if multiple threads are used. Compared to np.histogram2d we saw a 11 > times increase. These numbers should probably be checked more carefully > before decisions are made. > > > > Boost.Histogram offers generalised histograms with arbitrary > accumulators per cell, so it could also replace the implementations of > https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html > and friends. > > > > Best regards, > > Hans > > > > ------------------------------ > > > > Message: 3 > > Date: Mon, 15 Feb 2021 13:41:51 +0100 > > From: Ralf Gommers > > To: SciPy Developers List > > Subject: Re: [SciPy-Dev] Boost for stats > > Message-ID: > > < > CABL7CQjYZh0CyA6Kx5FULw2KaYMmdrLbm0Jecztc5+4z+r8OJg at mail.gmail.com> > > Content-Type: text/plain; charset="utf-8" > > > > On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski > > > wrote: > > > >> > >>> On 15. Feb 2021, at 01:47, Warren Weckesser < > warren.weckesser at gmail.com> > >> wrote: > >>> > >>> * The Boost histogram library might provide some benefits over the > >>> existing NumPy and SciPy options. (Hans Dembinski, the author > >>> of the histrogram library, has already commented in this email > >>> thread.) > >> > >> I would happily support this. We currently offer a Python front-end to > >> Boost.Histogram > >> https://github.com/scikit-hep/boost-histogram > >> which includes a numpy.histogram compatible interface. 
> >> > >> Switching to Boost.Histogram may offer performance benefits, see > >> > >> > https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html > >> > >> Compared to np.histogram we saw a 1.7 times increase - single threaded, > >> more if multiple threads are used. Compared to np.histogram2d we saw a > 11 > >> times increase. These numbers should probably be checked more carefully > >> before decisions are made. > >> > >> Boost.Histogram offers generalised histograms with arbitrary > accumulators > >> per cell, so it could also replace the implementations of > >> > https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html > >> and friends. > >> > > > > That would be really nice. binned_statistic is currently pure Python, and > > can be a performance hotspot (I've seen multiple cases of that in dealing > > with image and geospatial data). > > > > Cheers, > > Ralf > > -------------- next part -------------- > > An HTML attachment was scrubbed... > > URL: < > https://mail.python.org/pipermail/scipy-dev/attachments/20210215/5a972b62/attachment-0001.html > > > > > > ------------------------------ > > > > Message: 4 > > Date: Mon, 15 Feb 2021 07:47:48 -0500 > > From: Neal Becker > > To: SciPy Developers List > > Subject: Re: [SciPy-Dev] Boost for stats > > Message-ID: > > C-Tvo6bnPKWNH_w at mail.gmail.com> > > Content-Type: text/plain; charset="UTF-8" > > > > One thing I've missed with the current scipy histogram is the ability > > to do 'online' or 'incremental' collection of the histogram data. For > > this reason I have written my own histogram code. I am often > > collecting data from monte-carlo simulations and want to accumulate > > stats from data that arrives in batches. > > I don't know if boost-histogram supports this but if so I would find > > this very welcome. 
> > > > On Mon, Feb 15, 2021 at 7:42 AM Ralf Gommers > wrote: > >> > >> > >> > >> On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski < > hans.dembinski at gmail.com> wrote: > >>> > >>> > >>>> On 15. Feb 2021, at 01:47, Warren Weckesser < > warren.weckesser at gmail.com> wrote: > >>>> > >>>> * The Boost histogram library might provide some benefits over the > >>>> existing NumPy and SciPy options. (Hans Dembinski, the author > >>>> of the histrogram library, has already commented in this email > >>>> thread.) > >>> > >>> I would happily support this. We currently offer a Python front-end to > Boost.Histogram > >>> https://github.com/scikit-hep/boost-histogram > >>> which includes a numpy.histogram compatible interface. > >>> > >>> Switching to Boost.Histogram may offer performance benefits, see > >>> > https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html > >>> > >>> Compared to np.histogram we saw a 1.7 times increase - single > threaded, more if multiple threads are used. Compared to np.histogram2d we > saw a 11 times increase. These numbers should probably be checked more > carefully before decisions are made. > >>> > >>> Boost.Histogram offers generalised histograms with arbitrary > accumulators per cell, so it could also replace the implementations of > https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html > and friends. > >> > >> > >> That would be really nice. binned_statistic is currently pure Python, > and can be a performance hotspot (I've seen multiple cases of that in > dealing with image and geospatial data). 
> >> > >> Cheers, > >> Ralf > >> > >> _______________________________________________ > >> SciPy-Dev mailing list > >> SciPy-Dev at python.org > >> https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > -- > > Those who don't understand recursion are doomed to repeat it > > > > > > ------------------------------ > > > > Subject: Digest Footer > > > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at python.org > > https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > ------------------------------ > > > > End of SciPy-Dev Digest, Vol 208, Issue 13 > > ****************************************** > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tyler.je.reddy at gmail.com Wed Feb 17 22:27:40 2021 From: tyler.je.reddy at gmail.com (Tyler Reddy) Date: Wed, 17 Feb 2021 20:27:40 -0700 Subject: [SciPy-Dev] ANN: SciPy 1.6.1 Message-ID: Hi all, On behalf of the SciPy development team I'm pleased to announce the release of SciPy 1.6.1, which is a bug fix release. Sources and binary wheels can be found at: https://pypi.org/project/scipy/ and at: https://github.com/scipy/scipy/releases/tag/v1.6.1 One of a few ways to install this release with pip: pip install scipy==1.6.1 ===================== SciPy 1.6.1 Release Notes ===================== SciPy 1.6.1 is a bug-fix release with no new features compared to 1.6.0. Please note that for SciPy wheels to correctly install with Pip on macOS 11, Pip >= 20.3.3 is needed. Authors ====== * Peter Bell * Evgeni Burovski * CJ Carey * Ralf Gommers * Peter Mahler Larsen * Cheng H. Lee + * Cong Ma * Nicholas McKibben * Nikola Forró * Tyler Reddy * Warren Weckesser A total of 11 people contributed to this release. People with a "+" by their names contributed a patch for the first time.
This list of names is automatically generated, and may not be fully complete. Issues closed for 1.6.1 ------------------------------- * `#13072 `__: BLD: Quadpack undefined references * `#13241 `__: Not enough values to unpack when passing tuple to \`blocksize\`... * `#13329 `__: Large sparse matrices of big integers lose information * `#13342 `__: fftn crashes if shape arguments are supplied as ndarrays * `#13356 `__: LSQBivariateSpline segmentation fault when quitting the Python... * `#13358 `__: scipy.spatial.transform.Rotation object can not be deepcopied... * `#13408 `__: Type of \`has_sorted_indices\` property * `#13412 `__: Sorting spherical Voronoi vertices leads to crash in area calculation * `#13421 `__: linear_sum_assignment - support for matrices with more than 2^31... * `#13428 `__: \`stats.exponnorm.cdf\` returns \`nan\` for small values of \`K\`... * `#13465 `__: KDTree.count_neighbors : 0xC0000005 error for tuple of different... * `#13468 `__: directed_hausdorff issue with shuffle * `#13472 `__: Failures on FutureWarnings with numpy 1.20.0 for lfilter, sosfilt... * `#13565 `__: BUG: 32-bit wheels repo test failure in optimize Pull requests for 1.6.1 ----------------------------- * `#13318 `__: REL: prepare for SciPy 1.6.1 * `#13344 `__: BUG: fftpack doesn't work with ndarray shape argument * `#13345 `__: MAINT: Replace scipy.take with numpy.take in FFT function docstrings. 
* `#13354 `__: BUG: optimize: rename private functions to include leading underscore * `#13387 `__: BUG: Support big-endian platforms and big-endian WAVs * `#13394 `__: BUG: Fix Python crash by allocating larger array in LSQBivariateSpline * `#13400 `__: BUG: sparse: Better validation for BSR ctor * `#13403 `__: BUG: sparse: Propagate dtype through CSR/CSC constructors * `#13414 `__: BUG: maintain dtype of SphericalVoronoi regions * `#13422 `__: FIX: optimize: use npy_intp to store array dims for lsap * `#13425 `__: BUG: spatial: make Rotation picklable * `#13426 `__: BUG: \`has_sorted_indices\` and \`has_canonical_format\` should... * `#13430 `__: BUG: stats: Fix exponnorm.cdf and exponnorm.sf for small K * `#13470 `__: MAINT: silence warning generated by \`spatial.directed_hausdorff\` * `#13473 `__: TST: fix failures due to new FutureWarnings in NumPy 1.21.dev0 * `#13479 `__: MAINT: update directed_hausdorff Cython code * `#13485 `__: BUG: KDTree weighted count_neighbors doesn't work between two... * `#13503 `__: TST: fix \`test_fortranfile_read_mixed_record\` on big-endian... 
* `#13518 `__: DOC: document that pip >= 20.3.3 is needed for macOS 11 * `#13520 `__: BLD: update reqs based on oldest-supported-numpy in pyproject.toml * `#13567 `__: TST, BUG: adjust tol on test_equivalence Checksums ========= MD5 ~~~ 6312f6644420a0ad11f9dfb80aaa0560 scipy-1.6.1-cp37-cp37m-macosx_10_9_x86_64.whl 0018622e5d32ca0cc690db152a371889 scipy-1.6.1-cp37-cp37m-manylinux1_i686.whl 7612dc5ebc5928d606b6f486e0edabad scipy-1.6.1-cp37-cp37m-manylinux1_x86_64.whl bcbc57efab027e9c74fe4be8ac1b6470 scipy-1.6.1-cp37-cp37m-manylinux2014_aarch64.whl 49d9b5824b22c87d184214497fec1079 scipy-1.6.1-cp37-cp37m-win32.whl 929834c270b3056997717bbcee58809c scipy-1.6.1-cp37-cp37m-win_amd64.whl 4a862104bb2add633ead9a28496356ae scipy-1.6.1-cp38-cp38-macosx_10_9_x86_64.whl c0dc4f798d0acc015c5fb36d3d97f4ed scipy-1.6.1-cp38-cp38-manylinux1_i686.whl 8f0dce3503871db857f44a3ffb5800f6 scipy-1.6.1-cp38-cp38-manylinux1_x86_64.whl e4ee2176f25684d1cd3d21f0db5906ed scipy-1.6.1-cp38-cp38-manylinux2014_aarch64.whl 8589661ea9a320746aef8299cd16f32f scipy-1.6.1-cp38-cp38-win32.whl 819424a909991eec489441880709a97c scipy-1.6.1-cp38-cp38-win_amd64.whl e7ea30f4dc26b79a3a2b9446afd4c572 scipy-1.6.1-cp39-cp39-macosx_10_9_x86_64.whl d8f7678b426174aba4a6184803d90c5a scipy-1.6.1-cp39-cp39-manylinux1_i686.whl d8f5ec24b15fef9786a6233c28003753 scipy-1.6.1-cp39-cp39-manylinux1_x86_64.whl 4a832944f71c5f7b019f6539475647a2 scipy-1.6.1-cp39-cp39-manylinux2014_aarch64.whl 5fff9d3f673e4ae73e76f02ea8544aa3 scipy-1.6.1-cp39-cp39-win32.whl b03f9713b7b9be7fa019ab3c94c72254 scipy-1.6.1-cp39-cp39-win_amd64.whl 98a860ce2d6296cace333d0a07501f13 scipy-1.6.1.tar.gz 5cd15c4b4abf2e24ed05dbde9e7b90c8 scipy-1.6.1.tar.xz a3c4bf7491ea0ab49bc8b149334f50f7 scipy-1.6.1.zip SHA256 ~~~~~~ a15a1f3fc0abff33e792d6049161b7795909b40b97c6cc2934ed54384017ab76 scipy-1.6.1-cp37-cp37m-macosx_10_9_x86_64.whl e79570979ccdc3d165456dd62041d9556fb9733b86b4b6d818af7a0afc15f092 scipy-1.6.1-cp37-cp37m-manylinux1_i686.whl 
a423533c55fec61456dedee7b6ee7dce0bb6bfa395424ea374d25afa262be261 scipy-1.6.1-cp37-cp37m-manylinux1_x86_64.whl 33d6b7df40d197bdd3049d64e8e680227151673465e5d85723b3b8f6b15a6ced scipy-1.6.1-cp37-cp37m-manylinux2014_aarch64.whl 6725e3fbb47da428794f243864f2297462e9ee448297c93ed1dcbc44335feb78 scipy-1.6.1-cp37-cp37m-win32.whl 5fa9c6530b1661f1370bcd332a1e62ca7881785cc0f80c0d559b636567fab63c scipy-1.6.1-cp37-cp37m-win_amd64.whl bd50daf727f7c195e26f27467c85ce653d41df4358a25b32434a50d8870fc519 scipy-1.6.1-cp38-cp38-macosx_10_9_x86_64.whl f46dd15335e8a320b0fb4685f58b7471702234cba8bb3442b69a3e1dc329c345 scipy-1.6.1-cp38-cp38-manylinux1_i686.whl 0e5b0ccf63155d90da576edd2768b66fb276446c371b73841e3503be1d63fb5d scipy-1.6.1-cp38-cp38-manylinux1_x86_64.whl 2481efbb3740977e3c831edfd0bd9867be26387cacf24eb5e366a6a374d3d00d scipy-1.6.1-cp38-cp38-manylinux2014_aarch64.whl 68cb4c424112cd4be886b4d979c5497fba190714085f46b8ae67a5e4416c32b4 scipy-1.6.1-cp38-cp38-win32.whl 5f331eeed0297232d2e6eea51b54e8278ed8bb10b099f69c44e2558c090d06bf scipy-1.6.1-cp38-cp38-win_amd64.whl 0c8a51d33556bf70367452d4d601d1742c0e806cd0194785914daf19775f0e67 scipy-1.6.1-cp39-cp39-macosx_10_9_x86_64.whl 83bf7c16245c15bc58ee76c5418e46ea1811edcc2e2b03041b804e46084ab627 scipy-1.6.1-cp39-cp39-manylinux1_i686.whl 794e768cc5f779736593046c9714e0f3a5940bc6dcc1dba885ad64cbfb28e9f0 scipy-1.6.1-cp39-cp39-manylinux1_x86_64.whl 5da5471aed911fe7e52b86bf9ea32fb55ae93e2f0fac66c32e58897cfb02fa07 scipy-1.6.1-cp39-cp39-manylinux2014_aarch64.whl 8e403a337749ed40af60e537cc4d4c03febddcc56cd26e774c9b1b600a70d3e4 scipy-1.6.1-cp39-cp39-win32.whl a5193a098ae9f29af283dcf0041f762601faf2e595c0db1da929875b7570353f scipy-1.6.1-cp39-cp39-win_amd64.whl c4fceb864890b6168e79b0e714c585dbe2fd4222768ee90bc1aa0f8218691b11 scipy-1.6.1.tar.gz 2800f47a5040cbab05b3ce58f1dfb670c70232b0f56d30380c6fd4ef4e787df5 scipy-1.6.1.tar.xz 18601effa06aba0e9f34475b6d34b3d7454feabe8b0f96bcc483b3fd38b0afc2 scipy-1.6.1.zip -------------- next part -------------- An HTML 
attachment was scrubbed... URL: From andyfaff at gmail.com Wed Feb 17 22:34:16 2021 From: andyfaff at gmail.com (Andrew Nelson) Date: Thu, 18 Feb 2021 14:34:16 +1100 Subject: [SciPy-Dev] GSoC'21 participation SciPy In-Reply-To: References: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Message-ID: It's good to hear about the ask-tell interface; it's not something I'd heard about before. The class-based Optimizer that was proposed wasn't going to work in quite that way. The main concept was to create an (e.g.) LBFGSB class (inheriting a Minimizer superclass). All Minimizer objects would be iterators, having a __next__ method that would perform one step of a minimisation loop. An iterator-based design syncs quite well with the loop-based design of most of the existing minimisation algorithms. The __next__ method would be responsible for calling the user-supplied functions. If the user-supplied functions could be marked as vectorisable, the __next__ method could despatch a whole series of `x` locations for the user function (one or all of func/jac/hess) to evaluate; the user function could do whatever parallelisation it wanted. Vectorisable function evaluations also offer benefits for numerical differentiation. The return value of __next__ would be something along the lines of an intermediate OptimizeResult. I don't know how the ask-tell approach works in finer detail. For example, each minimisation step typically requires multiple function evaluations to proceed, e.g. at least once for func evaluation, and many times more for grad/jac and hess evaluation (not to mention constraint function evaluations). Therefore there wouldn't be a 1:1 correspondence between a single ask-tell exchange and a complete step of the minimizer. I reckon the development of this would be way more than a single GSoC could provide, at least to get a mature design into scipy. It's vital to get the architecture correct (esp.
the base class), when considering all the minimizers that scipy offers, and their different vagaries. Implementing for one or two minimizers wouldn't be sufficient; otherwise one forgets that they e.g. all have different approaches to halting, and you find yourself bolting other things on to make things work. In addition, it's not just the minimizer methods that are involved, it's considering how this all ties in with how constraints/numerical differentiation/`LowLevelCallable`/etc could be improved/used in such a design. At least for the methods involved in `minimize`, such an opportunity is the time to consider a total redesign of how things work. Smart/vectorisable numerical differentiation would be more than a whole GSoC in itself. As Robert says, implementation in a separate package would probably be the best way to work; once the bugs have been ironed out it could be merged into scipy-proper. Any redesign could take into account the existing APIs/functionality to make things a less jarring change. It'd be great to get the original class-based Optimizer off the ground, or something similar. However, it's worth noting that the original proposal only received lukewarm support. A. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu Feb 18 05:01:12 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 18 Feb 2021 11:01:12 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: On Thu, Feb 18, 2021 at 3:45 AM Nicholas McKibben wrote: > > > Probably good to make sure that aarch64 build times remain relatively > stable > > Good point! Can this be checked via a PR to scipy-wheels? > Don't worry about this one; if the compile-time increase on other platforms is minor, it'll be fine for aarch64 too. We have limited TravisCI credits (actual status of that is a little unclear), so no need to burn them for this.
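For readers unfamiliar with the iterator idea in Andrew's message above, here is a toy sketch. Every name is invented for illustration; nothing like this class exists in scipy.optimize, and a real design would return intermediate OptimizeResult objects and handle jac/hess, constraints, and per-method halting logic.

```python
# Toy sketch of the iterator-based Minimizer idea -- all names are
# hypothetical; this is not a proposal for the actual scipy API.
class GradientDescent:
    """Each call to __next__ performs one step of the minimisation loop."""

    def __init__(self, func, grad, x0, step=0.1, gtol=1e-8):
        self.func, self.grad = func, grad
        self.x = x0
        self.step, self.gtol = step, gtol
        self.nit = 0

    def __iter__(self):
        return self

    def __next__(self):
        g = self.grad(self.x)       # user callbacks are invoked inside the step
        if abs(g) < self.gtol:      # a real class would have richer halting logic
            raise StopIteration     # converged
        self.x -= self.step * g
        self.nit += 1
        # a full design would return an intermediate OptimizeResult here
        return self.x, self.func(self.x)


# The caller drives the loop -- something `minimize` does not offer today.
opt = GradientDescent(lambda x: (x - 3.0) ** 2, lambda x: 2.0 * (x - 3.0), x0=0.0)
for x, fx in opt:
    pass                            # could log, checkpoint, or parallelise here
print(round(opt.x, 6))              # -> 3.0 (the minimum)
```

Because the caller owns the loop, logging, checkpointing, or dispatching vectorised evaluations between steps becomes trivial.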
We anyway should be moving CI providers for aarch64 at some point, probably to https://www.drone.io/ Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Thu Feb 18 06:35:10 2021 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 18 Feb 2021 13:35:10 +0200 Subject: [SciPy-Dev] Speed up large NumPy arrays with PNumPy Message-ID: <16bdbae7-3607-8ec4-93be-5a110a3acef9@gmail.com> An HTML attachment was scrubbed... URL: From touqir at ualberta.ca Thu Feb 18 08:12:00 2021 From: touqir at ualberta.ca (Touqir Sajed) Date: Thu, 18 Feb 2021 19:12:00 +0600 Subject: [SciPy-Dev] Faster maximum flow algorithm for scipy.sparse.csgraph Message-ID: Dear Scipy developers, This is a continuation of https://github.com/scipy/scipy/issues/13402. The current implementation in scipy.sparse.csgraph uses the Edmonds-Karp algorithm, whose theoretical time complexity is not great; despite this, the implementation is optimized enough to outperform several other algorithms with superior theoretical complexity, as shown here: https://github.com/scipy/scipy/pull/10566#issuecomment-552615594. Later I carried out benchmarks (https://github.com/scipy/scipy/issues/13402#issuecomment-767909167) showing that scipy's Edmonds-Karp implementation can indeed be significantly beaten by optimized implementations. My original concern was Edmonds-Karp's theoretical complexity, which limits its performance in some cases (highly dense graphs). So, having another algorithm in scipy with better theoretical complexity along with proven superior empirical performance makes sense. Only the algorithms here, https://github.com/touqir14/MaxFlow, have been shown to significantly outperform scipy's Edmonds-Karp. I think it would be good to port one or several of these implementations into scipy. Cython-only ports will probably be easier to maintain.
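For context, the Edmonds-Karp algorithm under discussion repeatedly augments flow along the shortest source-to-sink path found by BFS in the residual graph. A compact pure-Python sketch of that idea (scipy's actual implementation is optimized Cython operating on CSR data, not this):

```python
# Minimal Edmonds-Karp max flow: BFS for shortest augmenting paths.
from collections import deque


def edmonds_karp(capacity, source, sink):
    """Max flow on a dense capacity matrix (list of lists of ints)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]  # copy so input is untouched
    max_flow = 0
    while True:
        # BFS for the shortest augmenting path in the residual graph
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:        # no augmenting path left: done
            return max_flow
        # find the bottleneck capacity along the path, then push flow
        bottleneck = float("inf")
        v = sink
        while v != source:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck  # reverse edge allows cancellation
            v = u
        max_flow += bottleneck


# 0 -> 3 with the minimum cut {1->3, 2->3}, giving value 5
cap = [[0, 3, 2, 0],
       [0, 0, 1, 2],
       [0, 0, 0, 3],
       [0, 0, 0, 0]]
print(edmonds_karp(cap, 0, 3))  # -> 5
```

The BFS (shortest-path) choice is what bounds the number of augmentations at O(VE), for O(VE^2) overall; the algorithms Touqir benchmarks improve on that bound.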
One thing to ponder here is how complex an implementation we should allow if we decide to add new max flow algorithms to scipy. Let me know your thoughts. Cheers, Touqir -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhaberla at calpoly.edu Sat Feb 20 00:43:13 2021 From: mhaberla at calpoly.edu (Matt Haberland) Date: Fri, 19 Feb 2021 21:43:13 -0800 Subject: [SciPy-Dev] SciPy 2021 Conference Seeking Submissions and Reviewers Message-ID: Dear SciPy Developers, Besides an incredible library, "SciPy" is also the name of a conference about scientific computing with Python and the scientific Python ecosystem as a whole. This year's conference, SciPy 2021, will be held virtually from July 14 - July 16, with two days before for tutorials and two days after for developer sprints. Registration is now open! There's nothing more motivating than a deadline, right? Well, there are just three days left to submit your talk, poster, or tutorial: submissions are due at 11:59 p.m. on Monday, February 22. We also seek volunteers to review submissions over the next few weeks. Please indicate your interest here. Feel free to contact me if you have any questions about the conference, and I hope to "see" you there! Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholas.bgp at gmail.com Sun Feb 21 23:36:24 2021 From: nicholas.bgp at gmail.com (Nicholas McKibben) Date: Sun, 21 Feb 2021 21:36:24 -0700 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Local testing of inclusion of Boost as a submodule has revealed some undesirable side effects: - all sources, documentation, etc., regardless of relevance to SciPy, must
This extra overhead may initially cause some timeouts. Another option that will alleviate some of these pains is to create a header only repo similar to this one: https://github.com/povilasb/boost-header-only. It could live in the SciPy github account and would be easy to update -- simply download the Boost tarball release and copy over the include directory only (or build a specific commit locally and do the same thing). It is more maintenance than simply checking out the latest tagged release of Boost and updating the submodules (adds an extra step of updating the header only repo), but it minimizes space and bandwidth usage. Thoughts? On Thu, Feb 18, 2021 at 3:01 AM Ralf Gommers wrote: > > > On Thu, Feb 18, 2021 at 3:45 AM Nicholas McKibben > wrote: > >> >> > Probably good to make sure that aarch64 build times remain relatively >> stable >> >> Good point! Can this be checked via a PR to scipy-wheels? >> > > Don't worry about this one, if compile time increase on other platforms is > minor, it'll be fine for aarch64 too. We have limited TravisCI credits > (actual status of that is a little unclear), so no need to burn them for > this. We anyway should be moving CI providers for aarch64 at some point, > probably to https://www.drone.io/ > > Cheers, > Ralf > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Feb 22 02:56:10 2021 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 22 Feb 2021 07:56:10 +0000 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: The header-only repository sounds like the better option to me. That level of git churn for the submodule would be a noticeable burden. 
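As an aside for readers: the header-only mirror described above boils down to copying just Boost's `boost/` include tree into a slim repository, leaving sources and docs behind. A toy Python sketch of that refresh step (all paths are invented, and a stand-in tree is created so the snippet runs anywhere; in reality `src` would be an unpacked Boost release tarball):

```python
# Hypothetical refresh step for a header-only Boost mirror.
import pathlib
import shutil
import tempfile

tmp = pathlib.Path(tempfile.mkdtemp())
src = tmp / "boost_1_75_0"            # stand-in for an unpacked release
(src / "boost" / "math").mkdir(parents=True)
(src / "libs").mkdir()                # sources/docs we do NOT want to mirror
(src / "boost" / "math" / "special_functions.hpp").touch()

mirror = tmp / "boost-headers"        # the slim repo to publish
mirror.mkdir()
# the actual mirroring step: copy only the headers, skip everything else
shutil.copytree(src / "boost", mirror / "boost")

print(sorted(p.name for p in mirror.iterdir()))  # -> ['boost']
```

The trade-off is exactly as described: one extra copy step per Boost release, in exchange for no submodule fetches or git churn.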
From ralf.gommers at gmail.com Mon Feb 22 04:35:36 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 22 Feb 2021 10:35:36 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: On Mon, Feb 22, 2021 at 5:36 AM Nicholas McKibben wrote: > Local testing of inclusion of Boost as a submodule has revealed some > undesirable side effects: > - all sources, documentation, etc. regardless of relevance to SciPy must > be fetched > - recursive submodule initialization can take quite a while (~10 minutes > on my machine and internet connection) > - lots of churn when running commands like `git status` > Ouch, that's a lot slower than I expected. I'm not sure I understand it though, there should be no `git status` churn at all (unless the build process messes with files in-place?) and it's faster than cloning our own repo: $ time git clone git at github.com:boostorg/boost.git Cloning into 'boost'... remote: Enumerating objects: 15, done. remote: Counting objects: 100% (15/15), done. remote: Compressing objects: 100% (11/11), done. remote: Total 254626 (delta 8), reused 11 (delta 4), pack-reused 254611 Receiving objects: 100% (254626/254626), 62.02 MiB | 7.47 MiB/s, done. Resolving deltas: 100% (163071/163071), done. real 0m12.221s user 0m5.959s sys 0m2.725s $ time git clone git at github.com:scipy/scipy.git Cloning into 'scipy'... remote: Enumerating objects: 178585, done. remote: Total 178585 (delta 0), reused 0 (delta 0), pack-reused 178585 Receiving objects: 100% (178585/178585), 104.61 MiB | 6.56 MiB/s, done. Resolving deltas: 100% (137836/137836), done. real 0m21.492s user 0m9.620s sys 0m3.231s What should I test to reproduce the problem? Cheers, Ralf > Of course we will also need to see how this impacts the CI pipelines. > This extra overhead may initially cause some timeouts. Another option that > will alleviate some of these pains is to create a header only repo similar > to this one: https://github.com/povilasb/boost-header-only. 
It could > live in the SciPy github account and would be easy to update -- simply > download the Boost tarball release and copy over the include directory only > (or build a specific commit locally and do the same thing). It is more > maintenance than simply checking out the latest tagged release of Boost and > updating the submodules (adds an extra step of updating the header only > repo), but it minimizes space and bandwidth usage. Thoughts? > > On Thu, Feb 18, 2021 at 3:01 AM Ralf Gommers > wrote: > >> >> >> On Thu, Feb 18, 2021 at 3:45 AM Nicholas McKibben >> wrote: >> >>> >>> > Probably good to make sure that aarch64 build times remain relatively >>> stable >>> >>> Good point! Can this be checked via a PR to scipy-wheels? >>> >> >> Don't worry about this one, if compile time increase on other platforms >> is minor, it'll be fine for aarch64 too. We have limited TravisCI credits >> (actual status of that is a little unclear), so no need to burn them for >> this. We anyway should be moving CI providers for aarch64 at some point, >> probably to https://www.drone.io/ >> >> Cheers, >> Ralf >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hans.dembinski at gmail.com Mon Feb 22 06:22:53 2021 From: hans.dembinski at gmail.com (Hans Dembinski) Date: Mon, 22 Feb 2021 12:22:53 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: <8F1CE745-50F2-48C7-81E2-01E6B762E3F9@gmail.com> Hi, > On 22. Feb 2021, at 10:35, Ralf Gommers wrote: > > Ouch, that's a lot slower than I expected. 
I'm not sure I understand it though, there should be no `git status` churn at all (unless the build process messes with files in-place?) and it's faster than cloning our own repo: > > $ time git clone git at github.com:boostorg/boost.git > Cloning into 'boost'... > remote: Enumerating objects: 15, done. > remote: Counting objects: 100% (15/15), done. > remote: Compressing objects: 100% (11/11), done. > remote: Total 254626 (delta 8), reused 11 (delta 4), pack-reused 254611 > Receiving objects: 100% (254626/254626), 62.02 MiB | 7.47 MiB/s, done. > Resolving deltas: 100% (163071/163071), done. > > real 0m12.221s > user 0m5.959s > sys 0m2.725s Because of a long-term goal to make Boost more modular, cloning boostorg/boost like this only clones the so-called superproject of Boost, which is indeed very small. It itself consists of many submodules with the individual Boost libraries, like Boost.Math etc., which live in separate repositories. If you do git clone --recurse-submodules git at github.com:boostorg/boost.git instead, you will see the long delay. Fetching all the submodules indeed takes a lot of time, unfortunately. The main Boost repo includes 157 submodules. Best regards, Hans From treverhines at gmail.com Mon Feb 22 09:43:33 2021 From: treverhines at gmail.com (Trever Hines) Date: Mon, 22 Feb 2021 09:43:33 -0500 Subject: [SciPy-Dev] ENH: improve RBF interpolation In-Reply-To: <274ab45e-a9cc-4710-a40e-65b6f96c0b4a@www.fastmail.com> References: <274ab45e-a9cc-4710-a40e-65b6f96c0b4a@www.fastmail.com> Message-ID: Hello all, I have made a pull request here https://github.com/scipy/scipy/pull/13595, and I would appreciate any feedback. On Thu, Feb 4, 2021 at 7:23 PM Stefan van der Walt wrote: > > Is there any advantage to keeping the old interface, or should this > eventually replace Rbf entirely? > > My intention is for `RBFInterpolator` to replace `Rbf` entirely.
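To make concrete what any RBF interpolant does under the hood -- whether the old `Rbf` or the `RBFInterpolator` proposed in gh-13595 -- here is a minimal NumPy sketch of the linear-algebra core (illustrative only; it is not the API or kernel set of either class):

```python
# Bare-bones RBF interpolation: 1-D data, Gaussian kernel.
import numpy as np


def rbf_weights(x, y, eps=1.0):
    """Solve for weights w such that sum_j w_j * phi(|x_i - x_j|) == y_i."""
    r = np.abs(x[:, None] - x[None, :])   # pairwise distances between nodes
    A = np.exp(-(eps * r) ** 2)           # Gaussian kernel matrix
    return np.linalg.solve(A, y)


def rbf_eval(xq, x, w, eps=1.0):
    r = np.abs(np.asarray(xq)[:, None] - x[None, :])
    return np.exp(-(eps * r) ** 2) @ w


x = np.linspace(0.0, 5.0, 8)
y = np.sin(x)
w = rbf_weights(x, y)
# an interpolant reproduces the data exactly at the nodes
print(np.allclose(rbf_eval(x, x, w), y))  # -> True
```

The well-posedness warnings Trever mentions concern exactly this step: for some kernel/degree combinations the system above can be singular or ill-conditioned.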
It should be possible to replicate the functionality of `Rbf` with `RBFInterpolator` (albeit with warnings when the interpolant may not be well-posed). `Rbf` is not currently deprecated in my PR, but I can make that change if you think it is appropriate. -Trever -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholas.bgp at gmail.com Mon Feb 22 09:54:44 2021 From: nicholas.bgp at gmail.com (Nicholas McKibben) Date: Mon, 22 Feb 2021 07:54:44 -0700 Subject: [SciPy-Dev] Boost for stats In-Reply-To: <8F1CE745-50F2-48C7-81E2-01E6B762E3F9@gmail.com> References: <8F1CE745-50F2-48C7-81E2-01E6B762E3F9@gmail.com> Message-ID: Adding the following options to the .gitmodules file was also useful for speeding up routine git commands: active = false ignore = true shallow = true I am not sure if all of them are necessary - I don't profess to be a git wizard. I still had commands such as 'git add -u' hang. Indeed --recurse-submodules is necessary (and might be why the CI is currently failing for the PR). Thanks, Nicholas On Mon, Feb 22, 2021, 04:23 Hans Dembinski wrote: > Hi, > > > On 22. Feb 2021, at 10:35, Ralf Gommers wrote: > > > > Ouch, that's a lot slower than I expected. I'm not sure I understand it > though, there should be no `git status` churn at all (unless the build > process messes with files in-place?) and it's faster than cloning our own > repo: > > > > $ time git clone git at github.com:boostorg/boost.git > > Cloning into 'boost'... > > remote: Enumerating objects: 15, done. > > remote: Counting objects: 100% (15/15), done. > > remote: Compressing objects: 100% (11/11), done. > > remote: Total 254626 (delta 8), reused 11 (delta 4), pack-reused 254611 > > Receiving objects: 100% (254626/254626), 62.02 MiB | 7.47 MiB/s, done. > > Resolving deltas: 100% (163071/163071), done.
> > > real 0m12.221s > user 0m5.959s > sys 0m2.725s > > because of the a long-term goal to make boost more modular, cloning > boostorg/boost like this only clones the so called superproject of Boost, > which indeed very small. It consists itself of many submodules with the > individual Boost libraries, like Boost.Math etc, which live in separate > repositories. If you do > > git clone --recurse-submodules git at github.com:boostorg/boost.git > > instead, you will see the long delay. Fetching all the submodules indeed > takes a lot of time, unfortunately. The main Boost repo includes 157 > submodules. > > Best regards, > Hans > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From roy.pamphile at gmail.com Tue Feb 23 04:17:53 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Tue, 23 Feb 2021 10:17:53 +0100 Subject: [SciPy-Dev] merged scipy.stats.qmc with quasi-Monte Carlo functionality Message-ID: <7FBC7377-4A05-4FBC-83B8-F31B5C4C752B@gmail.com> Hi everyone, First of all, thank you to everyone who helped with adding QMC (scipy.stats.qmc) :). I wanted to give an overview of the current work in progress and some perspectives. Waiting for reviews: * Using QMC in scipy.optimize: https://github.com/scipy/scipy/pull/13469 (has had a few reviews, could be shipped fast, and is blocking Stefan Endres from working on improvements to shgo). * New QMC method (LHS optimized): https://github.com/scipy/scipy/pull/13471 (code was originally in the QMC PR and received initial reviews thanks to Matt Haberland, but since then nothing more). * General tutorial: https://github.com/scipy/scipy/pull/13487 (finished, but Art Owen might drop in and add things; this could also wait for another PR).
* Small Cython refactoring of an internal function of Sobol': https://github.com/scipy/scipy/pull/13514 (waiting for a maintainer to merge it). Work in progress: * Port to Cython (Pythran is being looked at too) of discrepancy functions (thanks Arthur Volant): https://github.com/scipy/scipy/pull/13576. Other Cythonizations have not been looked at yet; see this issue: https://github.com/scipy/scipy/issues/13474. Perspectives: * Other scrambling methods (for Halton and Sobol', but could also have more generic things like randomization of any QMC). * Use QMC with any distribution from scipy.stats. There is an issue with some initial discussions; I would need more opinions to continue and propose a PR: https://github.com/scipy/scipy/issues/13368. * Add other QMC methods: lattice rules, other low-discrepancy sequences, adaptive LHS, etc. * Add other uniformity criteria: L-inf, minimum spanning tree, discrepancy over sub-spaces, etc. * General method to construct an optimal design based on a metric. (Similar to what I propose with LHS optimized). * Your ideas. Thanks in advance for your help. Cheers, Pamphile -------------- next part -------------- An HTML attachment was scrubbed... URL: From roy.pamphile at gmail.com Thu Feb 25 06:55:12 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Thu, 25 Feb 2021 12:55:12 +0100 Subject: [SciPy-Dev] Using more functionalities from GitHub? Message-ID: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> Hi everyone, I would like to propose to use more functionalities of GitHub. I could not find any reference showing this was discussed before, so I apologize if the following is not relevant. Teams Currently there is just the triage team, but we could create other teams like one per submodule or for given skill sets. This could allow easier pinging of people on PRs. Project We have the roadmap and some issues are acting as meta-issues. But we could use, instead/on top, the integrated project management dashboard.
It would be more convenient to keep track of what can be done, who is working on what area, etc. Also, this would clearly be in favor of openness and transparency with the management of the library. Cheers, Pamphile -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhaberla at calpoly.edu Thu Feb 25 14:21:42 2021 From: mhaberla at calpoly.edu (Matt Haberland) Date: Thu, 25 Feb 2021 11:21:42 -0800 Subject: [SciPy-Dev] Using more functionalities from GitHub? In-Reply-To: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> References: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> Message-ID: I haven't used these features before, but they sound useful to me. On Thu, Feb 25, 2021 at 3:55 AM Pamphile Roy wrote: > Hi everyone, > > I would like to propose to use more functionalities of GitHub. I could not > find any reference showing this was discussed before, > so I apologize if the following would not be relevant. > > *Teams* > > Currently there is just the *triage* team, but we could create other > teams like one per submodule or for given skill sets. > This could allow easier pinging of people on PR. > > *Project* > > We have the roadmap and some issues are acting as meta-issues. But we > could use, instead/on top, the integrated project management dashboard. > It would be more convenient to keep track of what can be done, who is > working on what area, etc. Also, this would clearly be in favor of openness > and > transparency with the management of the library. > > > Cheers, > Pamphile > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -- Matt Haberland Assistant Professor BioResource and Agricultural Engineering 08A-3K, Cal Poly -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tirthasheshpatel at gmail.com Fri Feb 26 04:13:10 2021 From: tirthasheshpatel at gmail.com (Tirth Patel) Date: Fri, 26 Feb 2021 14:43:10 +0530 Subject: [SciPy-Dev] mypy 0.770 is broken on Python 3.9.0a5 Message-ID: I recently proposed gh-13613 (https://github.com/scipy/scipy/pull/13613) which adds CI for type checking and noticed this failure: ``` scipy/_lib/_uarray/_backend.py:96: error: syntax error in type comment [syntax] Found 1 error in 1 file (checked 662 source files) ``` I don't see any errors with Python 3.8.0 and mypy 0.770, mypy 0.780, and mypy 0.812 (latest release). So, this probably has something to do with Python 3.9.0a5 (which is currently being installed on ubuntu-latest). I checked with mypy 0.780, mypy 0.800, and mypy 0.812 on Python 3.9.0a5. This error disappears in all those versions but dozens of others arise. I think this error is related to python/mypy#8614 (https://github.com/python/mypy/issues/8614) which got fixed in mypy>=0.780. So, if I am not wrong, we have two options here: - Stick with Python 3.8 and mypy 0.770. - Use mypy latest (0.812 currently) which works for both Python 3.8 and Python 3.9. I think the latter would be a better choice as NumPy does it and mypy is changing fast so it will help keep up with new changes. I have a fix for all the errors occurring with mypy latest (mypy 0.812) and would add the changes to the PR if there is a consensus to go that way! 
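For anyone wondering what a "type comment" is: it is the PEP 484 comment-based spelling of annotations, which some mypy/Python combinations parsed with stricter rules. A hypothetical illustration (this is NOT the actual code at `scipy/_lib/_uarray/_backend.py:96`):

```python
# Two equivalent spellings of the same signature, both understood by mypy.

def scale_old(values, factor):
    # type: (list, float) -> list
    # ^ PEP 484 "type comment": ignored at runtime, parsed by type checkers.
    return [v * factor for v in values]


def scale_new(values: list, factor: float) -> list:
    # Inline annotations: the modern spelling, unaffected by type-comment parsing.
    return [v * factor for v in values]


print(scale_old([1, 2], 2.0) == scale_new([1, 2], 2.0))  # -> True
```

Since type comments are plain comments at runtime, only the type checker ever notices the difference between the two styles.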
-- Kind Regards, Tirth Patel From ralf.gommers at gmail.com Fri Feb 26 04:21:18 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 26 Feb 2021 10:21:18 +0100 Subject: [SciPy-Dev] mypy 0.770 is broken on Python 3.9.0a5 In-Reply-To: References: Message-ID: On Fri, Feb 26, 2021 at 10:13 AM Tirth Patel wrote: > I recently proposed gh-13613 > (https://github.com/scipy/scipy/pull/13613) which adds CI for type > checking and noticed this failure: > > ``` > scipy/_lib/_uarray/_backend.py:96: error: syntax error in type comment > [syntax] > Found 1 error in 1 file (checked 662 source files) > ``` > > I don't see any errors with Python 3.8.0 and mypy 0.770, mypy 0.780, > and mypy 0.812 (latest release). So, this probably has something to do > with Python 3.9.0a5 (which is currently being installed on > ubuntu-latest). I checked with mypy 0.780, mypy 0.800, and mypy 0.812 > on Python 3.9.0a5. This error disappears in all those versions but > dozens of others arise. > > I think this error is related to python/mypy#8614 > (https://github.com/python/mypy/issues/8614) which got fixed in > mypy>=0.780. > > So, if I am not wrong, we have two options here: > - Stick with Python 3.8 and mypy 0.770. > - Use mypy latest (0.812 currently) which works for both Python 3.8 > and Python 3.9. > > I think the latter would be a better choice as NumPy does it and mypy > is changing fast so it will help keep up with new changes. I have a > fix for all the errors occurring with mypy latest (mypy 0.812) and > would add the changes to the PR if there is a consensus to go that > way! > Yes, we can bump mypy versions, no problem to update to the most recent version if needed. It's not a runtime dependency, so there's not much of a downside to updating other than a few maintainers and contributors needing to update their local version. Thanks for working on this Tirth!
Cheers, Ralf > -- > Kind Regards, > Tirth Patel > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Feb 27 11:25:13 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 27 Feb 2021 17:25:13 +0100 Subject: [SciPy-Dev] Using more functionalities from GitHub? In-Reply-To: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> References: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> Message-ID: On Thu, Feb 25, 2021 at 12:55 PM Pamphile Roy wrote: > Hi everyone, > > I would like to propose to use more functionalities of GitHub. I could not > find any reference showing this was discussed before, > so I apologize if the following would not be relevant. > Thanks for asking Pamphile. We indeed haven't discussed this before, at least not in the last few years. > *Teams* > > Currently there is just the *triage* team, but we could create other > teams like one per submodule or for given skill sets. > This could allow easier pinging of people on PR. > I agree it would be good to make some improvements here. The team has grown a lot, and many people don't know who are the experts/maintainers for some module. A long time ago we tried to keep a list manually in the repo (a MAINTAINERS.rst file), but that just went out of date and most people didn't know to look there anyway. A related problem: I have also stopped watching the repo, because the amount of notifications I was getting was starting to get a little overwhelming. Instead, I check in and browse new issues/PRs every few days to a week - but that means I may miss relevant stuff. There's currently not a good middle ground here. PyTorch has a useful system where you can subscribe to a label, and then once the label gets added a bot comes along and Cc's you on the issue. 
It does require running a bot, which would be another piece of machinery to maintain. The most related GitHub feature is CODEOWNERS: https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/about-code-owners. It can be used to automatically request PR reviews from individuals or from a team. So there are at least three approaches:
1. Teams per submodule and other areas
2. A bot to subscribe to labels
3. Using CODEOWNERS
The trouble with (1) is that it's a lot of overhead managing teams in the GitHub UI, and only people with owner/maintainer status can do it. My preference is (3) I think: it solves both problems to some extent, it's the most granular (you can get notifications for individual files as well as glob patterns), and it's a plain file in the repo that everyone can propose changes to via a PR. For pinging people outside of PRs, we can use the same file as documentation (just look at it, find the submodule/file of interest, and see who is subscribed to it to @-mention them). Should we try that? > *Project* > > We have the roadmap and some issues are acting as meta-issues. But we > could use, instead/on top, the integrated project management dashboard. > It would be more convenient to keep track of what can be done, who is > working on what area, etc. > My experience with GitHub Projects isn't great, it's more work than tracking issues to keep up to date, and is less integrated with the rest of the GitHub workflow. I'd be happy to give interested people the permissions to create new project boards for specific topics if that's how they like to work, but I'd like to keep it completely optional. > Also, this would clearly be in favor of openness and transparency with the > management of the library. > There's actually very little management going on.
There's zero hidden repositories or other content; the only thing is a very low-traffic private maintainer mailing list that is meant only for topics that aren't always good to discuss in public (mostly just deciding on giving someone more permissions). Maybe it's time to give regular community calls a go again - it's working quite well for NumPy. We tried it briefly a couple of years ago and it was useful, but I dropped the ball on organizing at some point because I was too busy. Should we try that again? Maybe regular once-a-month Zoom calls open to anyone who wants to attend? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From christoph.baumgarten at gmail.com Sat Feb 27 15:34:04 2021 From: christoph.baumgarten at gmail.com (Christoph Baumgarten) Date: Sat, 27 Feb 2021 21:34:04 +0100 Subject: [SciPy-Dev] (no subject) Message-ID: Hi, I implemented the Cramér-von Mises test for two samples in PR 13263. The proposed name is cramervonmises_2samp. The one-sample version (cramervonmises) was already released in version 1.6.0. Since a few names of tests in scipy.stats were recently discussed on the mailing list (though cramervonmises was not), I just wanted to mention the PR here in case there are concerns about the name. One additional remark: for the KS test, there are three functions: kstest, ks_1samp and ks_2samp; kstest can be used both for the one- and two-sample tests. In my view, this makes the definition of kstest quite complicated since the meaning of the parameters depends on the version of the test and one needs a helper function _parse_kstest_args(data1, data2, args, N) in stats/stats.py. So maybe cramervonmises_1samp and cramervonmises_2samp would have been a good choice, though I hope the names cramervonmises and cramervonmises_2samp also guide the user when to use which function.
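For reference, the statistic behind the one-sample function can be sketched in a few lines of plain Python; this is the textbook definition, not SciPy's actual implementation (which also computes a p-value):

```python
import math

def cvm_1samp_statistic(sample, cdf):
    # Textbook one-sample Cramer-von Mises statistic over the sorted sample:
    # T = 1/(12 n) + sum_{i=1..n} ((2 i - 1) / (2 n) - F(x_(i)))**2
    x = sorted(sample)
    n = len(x)
    return 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - cdf(xi)) ** 2
        for i, xi in enumerate(x, start=1)
    )

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# A sample sitting exactly on the uniform quantiles attains the minimum 1/(12 n):
print(cvm_1samp_statistic([0.1, 0.3, 0.5, 0.7, 0.9], lambda u: u))  # 1/60
```

Small values of T indicate a good fit to the hypothesized distribution; the two-sample version replaces the reference CDF with the second sample's empirical CDF.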
(While writing this message, I noted that the documentation of cramervonmises should state more clearly that it is about the one-sample test, e.g. "Perform the Cramér-von Mises test for goodness of fit." --> "Perform the one-sample ...") Any views? Thanks for your feedback. Christoph -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Feb 28 11:14:58 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 28 Feb 2021 17:14:58 +0100 Subject: [SciPy-Dev] (no subject) In-Reply-To: References: Message-ID: Hi Christoph, On Sat, Feb 27, 2021 at 9:34 PM Christoph Baumgarten < christoph.baumgarten at gmail.com> wrote: > Hi, > > I implemented the Cramer-von-Mises test for two samples in PR 13263 > , The proposed name is > cramervonmises_2samp. The one-sample version (cramervonmises) was already > released in version 1.6.0. Since a few names of tests in scipy.stats were > recently discussed on the mailing list (though cramervonmises was not), I > just wanted to mention the PR here in case there are concerns about the > name. > This seems like a nice function to add. > One additional remark: for the KS test, there are three functions: kstest, > ks_1samp, ks_2samp and kstest can be used both for the one- and two-sample > tests. In my view, this makes the definition of kstest quite complicated > since the meaning of the parameters depends on the version of the test and > one needs a helper function _parse_kstest_args(data1, data2, args, N) in > stats/stats.py > > So maybe cramervonmises_1samp and cramervonmises_2samp would have been a > good choice, though I hope the names cramervonmises and > cramervonmises_2samp also guide the user when to use which function. (While > writing this message, I noted that the documentation of cramervonmises > should state more clearly that it is about the one-sample test, e.g. "Perform > the Cramér-von Mises test for goodness of fit.'
--> 'Perform the one-sample > ...') > I agree with your assessment - `kstest` doing both is not great; keeping things separate like for cramervonmises(_2samp) is nicer. Cheers, Ralf > Any views? Thanks for your feedback > > Christoph > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From roy.pamphile at gmail.com Sun Feb 28 14:48:24 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Sun, 28 Feb 2021 20:48:24 +0100 Subject: [SciPy-Dev] Using more functionalities from GitHub? In-Reply-To: References: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> Message-ID: <65EA20D3-70D3-4B9E-A010-3C9FA78B6A5A@gmail.com> > On 27.02.2021, at 17:25, Ralf Gommers wrote: > > > The most related GitHub feature is CODEOWNERS: https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/about-code-owners . It can be used to automatically request PR reviews from individuals or from a team. > > So there's at least three approaches: > 1. Teams per submodule and other area > 2. A bot to subscribe to labels > 3. Using CODEOWNERS > > The trouble with (1) is that it's a lot of overhead managing teams in the GitHub UI, and only people with owner/maintainer status can do it. > > My preference is (3) I think: it solves both problems to some extent, it's the most granular (you can get notifications for individual files as well as glob patterns), and it's a plain file in the repo that everyone can propose changes to via a PR. For pinging people outside of PRs, we can use the same file as documentation (just look at it, find the submodule/file of interest, and see who is subscribed to it to @-mention them). Should we try that? Thanks for pointing out CODEOWNERS, I didn't know about this! I agree with you, this looks like a good idea and would bring more value than just grouping members by tags.
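For anyone else who had not seen the format: CODEOWNERS lives at the repository root or under `.github/`, and maps gitignore-style patterns to default reviewers, with later matches taking precedence. A minimal sketch, with placeholder handles and teams rather than real SciPy maintainers:

```
# A trailing slash covers everything under the directory.
scipy/signal/              @some-signal-maintainer
scipy/stats/               @scipy/hypothetical-stats-team
# Globs can target individual files too.
scipy/interpolate/*rbf*    @some-interpolate-maintainer
```

GitHub then automatically requests a review from the matching owners whenever a pull request touches those paths.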
This would also not prevent us from also having, if needed, some grouping for convenience, like a maintainer team, review team, build team, etc. So, I vote in favor of this solution. > > > Project > > We have the roadmap and some issues are acting as meta-issues. But we could use, instead/on top, the integrated project management dashboard. > It would be more convenient to keep track of what can be done, who is working on what area, etc. > > My experience with GitHub Projects isn't great, it's more work than tracking issues to keep up to date, and is less integrated with the rest of the GitHub workflow. I'd be happy to give interested people the permissions to create new project boards for specific topics if that's how they like to work, but I'd like to keep it completely optional. > > Also, this would clearly be in favor of openness and transparency with the management of the library. > > There's actually very little management going on. There's zero hidden repositories or other content; the only thing is a very low-traffic private maintainer mailing list that is meant only for topics that aren't always good to discuss in public (mostly just deciding on giving someone more permissions). Maybe it's time to give regular community calls a go again - it's working quite well for NumPy. We tried it briefly a couple of years ago and it was useful but I dropped the ball on organizing at some point because I was too busy. > > Should we try that again? Maybe regular once a month Zoom calls open to anyone who wants to attend? Thanks for the clarification. Currently, what I am personally missing is a way to easily understand the project directions. Why we do some things, where the project is going, why the roadmap is like this, etc. So something a bit more detailed than the roadmap. Meeting minutes of such Zoom calls could be relevant. But that's some work to do. That's why I suggested something like Projects, as it's fairly quick to update and follow.
But I am a new player here, so I am just suggesting, as you (the maintainers) will have to do most of the work here. Cheers, Pamphile -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Feb 28 17:13:13 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 28 Feb 2021 23:13:13 +0100 Subject: [SciPy-Dev] ENH: improve RBF interpolation In-Reply-To: References: <274ab45e-a9cc-4710-a40e-65b6f96c0b4a@www.fastmail.com> Message-ID: On Mon, Feb 22, 2021 at 3:43 PM Trever Hines wrote: > Hello all, > > I have made a pull request here https://github.com/scipy/scipy/pull/13595, > and I would appreciate any feedback. > Your PR looks really good, thanks for working on this Trever! > On Thu, Feb 4, 2021 at 7:23 PM Stefan van der Walt > wrote: > >> >> Is there any advantage to keeping the old interface, or should this >> eventually replace Rbf entirely? >> >> > My intention is for `RBFInterpolator` to replace `Rbf` entirely. It > should be possible to replicate the functionality of `Rbf` with > `RBFInterpolator` (albeit with warnings when the interpolant may not be > well-posed). `Rbf` is not currently deprecated in my PR, but I can make > that change if you think it is appropriate. > It looks to me like deprecating Rbf is a good idea. However, it's best not to do it too quickly after introducing the replacement, because then you force users that want to be compatible with multiple versions of scipy to do:

    if scipy.__version__ >= '1.7.0':
        RBFInterpolator(...)
    else:
        Rbf(...)

So I would suggest that in your open PR you add a note to the `Rbf` docstring saying something like "`Rbf` is legacy code, for new usage please use `RBFInterpolator` instead". Cheers, Ralf > > -Trever > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL:
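A side note on the version check Ralf sketches: comparing `scipy.__version__` as a string is unreliable ("1.10.0" sorts before "1.7.0" lexically), so a robust gate should compare numeric tuples. A minimal sketch, using a stand-in version string instead of importing scipy (in real code, `packaging.version.parse` is the usual tool):

```python
def version_tuple(version):
    # Parse the leading numeric part of each dotted component, so that
    # e.g. "1.10.0" -> (1, 10, 0) and "1.7.0rc1" -> (1, 7, 0).
    parts = []
    for piece in version.split(".")[:3]:
        num = ""
        for ch in piece:
            if not ch.isdigit():
                break
            num += ch
        parts.append(int(num) if num else 0)
    return tuple(parts)

scipy_version = "1.7.0"  # stand-in for scipy.__version__

# Numeric comparison orders versions correctly, unlike string comparison:
if version_tuple(scipy_version) >= (1, 7, 0):
    chosen = "RBFInterpolator"  # the new class proposed in gh-13595
else:
    chosen = "Rbf"              # the legacy interface
print(chosen)  # RBFInterpolator
```

This keeps user code working across the deprecation window without tying it to a particular release's string format.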