From samogot at gmail.com  Thu Apr  1 18:05:10 2021
From: samogot at gmail.com (=?UTF-8?B?0IbQstCw0L0g0J3QsNC50LTRjNC+0L3QvtCy?=)
Date: Fri, 2 Apr 2021 00:05:10 +0200
Subject: [SciPy-Dev] Multivariate non-central hypergeometric distributions
 (Wallenius' and Fisher's)
In-Reply-To: <CAMJZOa2yw+NTFCMfUZ625kZ8NyL_Q8JWZyoXtQ5YdomwPxsPsw@mail.gmail.com>
References: <CAMJZOa2yw+NTFCMfUZ625kZ8NyL_Q8JWZyoXtQ5YdomwPxsPsw@mail.gmail.com>
Message-ID: <CAMJZOa0xemJR7WCjcMcEyGUKvqMqPF6YDtLxbb+r2e4d1k-HDA@mail.gmail.com>

Hi everyone.

Univariate versions of non-central hypergeometric distributions based
on Agner Fog's BiasedUrn C++ code were added recently (in
https://github.com/scipy/scipy/pull/13330). C++ code added in that PR
already contains the implementation of multivariate versions of the same
distributions. As far as I understand, the only things needed for
multivariate distributions to work are Python wrapper and probably some
tests.

Is anyone interested in adding them? If not, I might get to it myself later
this month, but as I haven't made any scipy contributions yet and am not
familiar with the codebase, I will need much more time to rump up than an
experienced contributor :)

--
Regards,
Ivan Naydonov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210402/455ec86f/attachment.html>

From tirthasheshpatel at gmail.com  Fri Apr  2 09:49:26 2021
From: tirthasheshpatel at gmail.com (Tirth Patel)
Date: Fri, 2 Apr 2021 19:19:26 +0530
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
Message-ID: <CABpuv38XtcJWOT6kskF_Rv3T=_0iSoNCVr7gtnupL0kGQixfWg@mail.gmail.com>

Hi all,

I would like to participate in GSoC this year and found this project
very interesting!

TL; DR: I have a few questions regarding the project:
  - Is the user interface desired as a separate python submodule
(inside `scipy.stats`) or does it serve as an extension of the `rvs`
method?
  - Should UNU.RAN C library be included as a submodule within SciPy
(e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
gh-13328)?

About Me
********
I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
science undergrad student. I am quite familiar with Cython and a lot
of my college courses make use of C. I have a good knowledge of
probability theory and statistics.

Open Source work: I have participated in GSoC with the PyMC team last
year. I am a contributor to SciPy since May 2020 and recently a
maintainer.

About Project
*************
I had a question about the project. Is the user interface desired as a
separate python submodule inside `scipy.stats`? like:

    import scipy.stats as stats

    # sample a 1000 variates from a normal distribution
    # with mean 0 and std 1.5. Let UNU.RAN choose the method
    rvs = stats.random.normal(0., 1.5, size=1000, method='auto')

    # sample 100 samples from the beta distribution using TDR method
    beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')

    # the `rvs` methods remains unaffected.
    norm_rvs = stats.norm.rvs(0, 1.5, size=1000)

Or does it serve as an extension of the `rvs` method:

    from scipy.stats import norm, beta

    # something like this:
    # method = None => same behaviour as previous versions
    # method = 'auto' => use UNU.RAN and let it choose the method
    rvs = norm.rvs(0, 1.5, size=1000, method='auto')

    beta_rvs = beta.rvs(1, 2, size=100, method='tdr')

Also, should UNU.RAN C library be included as a submodule within SciPy
(e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
gh-13328)?


--
Kind Regards,
Tirth

From christoph.baumgarten at gmail.com  Fri Apr  2 15:44:24 2021
From: christoph.baumgarten at gmail.com (Christoph Baumgarten)
Date: Fri, 2 Apr 2021 21:44:24 +0200
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <mailman.9.1617379201.2885.scipy-dev@python.org>
References: <mailman.9.1617379201.2885.scipy-dev@python.org>
Message-ID: <CABXY2qAJt23pQASSMmLi8Ys=+q4SQmpD2hdxjTNZKaM4m=mjfw@mail.gmail.com>

Hi Tirth,

great to hear that you are interested in the project! My main goal would be
to add the "universal" rv generation methods to SciPy, e.g. PINV, TDR (UNU.RAN
User Manual (wu.ac.at)
<http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Methods_005ffor_005fCONT>).
At the moment, we just have one such function in SciPy (Statistical
functions (scipy.stats) ? SciPy v1.6.2 Reference Guide
<https://docs.scipy.org/doc/scipy/reference/stats.html#random-variate-generation>)
and it is very basic (I implemented it a while ago). Such functionality is
very useful in many situations, see e.g. OverflowError when sampling from
some handmade stats distributions ? Issue #13051 ? scipy/scipy (github.com)
<https://github.com/scipy/scipy/issues/13051> So the API would rather be
name_of_sampling_method(pdf / cdf, parameters of the sampling methods).

Whether one should add a keyword to distribution.rvs(...) that allows the
user to choose the sampling method might be a question for a follow-up
project. This would also be quite time-consuming since you need to verify
which method is appropriate for a given distribution. A simpler task could
be to check if the rvs methods of a specific distribution could be
overwritten with the corresponding method in UNU.RAN (UNU.RAN User Manual
(wu.ac.at)
<http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Stddist>).
For example, geninvgauss in SciPy relies on a Python implementation of a
rejection method / RoU and the implementation in UNU.RAN (gig / gig2) might
be faster. Also distributions with slow ppf methods relying on special
functions would be natural candidates. But that would also be of lower
priority for me.

I hope it helps. Feel free to reach out if you have more questions.

Christoph

On Fri, Apr 2, 2021 at 6:00 PM <scipy-dev-request at python.org> wrote:

> Send SciPy-Dev mailing list submissions to
>         scipy-dev at python.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://mail.python.org/mailman/listinfo/scipy-dev
> or, via email, send a message with subject or body 'help' to
>         scipy-dev-request at python.org
>
> You can reach the person managing the list at
>         scipy-dev-owner at python.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of SciPy-Dev digest..."
>
>
> Today's Topics:
>
>    1. Multivariate non-central hypergeometric distributions
>       (Wallenius' and Fisher's) (???? ?????????)
>    2. GSoC: Integrate library UNU.RAN into scipy.stats (Tirth Patel)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 2 Apr 2021 00:05:10 +0200
> From: ???? ????????? <samogot at gmail.com>
> To: scipy-dev at python.org
> Subject: [SciPy-Dev] Multivariate non-central hypergeometric
>         distributions (Wallenius' and Fisher's)
> Message-ID:
>         <
> CAMJZOa0xemJR7WCjcMcEyGUKvqMqPF6YDtLxbb+r2e4d1k-HDA at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi everyone.
>
> Univariate versions of non-central hypergeometric distributions based
> on Agner Fog's BiasedUrn C++ code were added recently (in
> https://github.com/scipy/scipy/pull/13330). C++ code added in that PR
> already contains the implementation of multivariate versions of the same
> distributions. As far as I understand, the only things needed for
> multivariate distributions to work are Python wrapper and probably some
> tests.
>
> Is anyone interested in adding them? If not, I might get to it myself later
> this month, but as I haven't made any scipy contributions yet and am not
> familiar with the codebase, I will need much more time to rump up than an
> experienced contributor :)
>
> --
> Regards,
> Ivan Naydonov
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <
> https://mail.python.org/pipermail/scipy-dev/attachments/20210402/455ec86f/attachment-0001.html
> >
>
> ------------------------------
>
> Message: 2
> Date: Fri, 2 Apr 2021 19:19:26 +0530
> From: Tirth Patel <tirthasheshpatel at gmail.com>
> To: scipy-dev <scipy-dev at python.org>
> Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
> Message-ID:
>         <CABpuv38XtcJWOT6kskF_Rv3T=_
> 0iSoNCVr7gtnupL0kGQixfWg at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Hi all,
>
> I would like to participate in GSoC this year and found this project
> very interesting!
>
> TL; DR: I have a few questions regarding the project:
>   - Is the user interface desired as a separate python submodule
> (inside `scipy.stats`) or does it serve as an extension of the `rvs`
> method?
>   - Should UNU.RAN C library be included as a submodule within SciPy
> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
> gh-13328)?
>
> About Me
> ********
> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
> science undergrad student. I am quite familiar with Cython and a lot
> of my college courses make use of C. I have a good knowledge of
> probability theory and statistics.
>
> Open Source work: I have participated in GSoC with the PyMC team last
> year. I am a contributor to SciPy since May 2020 and recently a
> maintainer.
>
> About Project
> *************
> I had a question about the project. Is the user interface desired as a
> separate python submodule inside `scipy.stats`? like:
>
>     import scipy.stats as stats
>
>     # sample a 1000 variates from a normal distribution
>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
>
>     # sample 100 samples from the beta distribution using TDR method
>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
>
>     # the `rvs` methods remains unaffected.
>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
>
> Or does it serve as an extension of the `rvs` method:
>
>     from scipy.stats import norm, beta
>
>     # something like this:
>     # method = None => same behaviour as previous versions
>     # method = 'auto' => use UNU.RAN and let it choose the method
>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
>
>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
>
> Also, should UNU.RAN C library be included as a submodule within SciPy
> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
> gh-13328)?
>
>
> --
> Kind Regards,
> Tirth
>
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
>
> ------------------------------
>
> End of SciPy-Dev Digest, Vol 210, Issue 2
> *****************************************
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210402/5411758e/attachment.html>

From gabrielfranksimonetto at gmail.com  Sat Apr  3 12:31:50 2021
From: gabrielfranksimonetto at gmail.com (Gabriel Simonetto)
Date: Sat, 3 Apr 2021 13:31:50 -0300
Subject: [SciPy-Dev] GSoC proposal draft: Add type hints on scipy.special
In-Reply-To: <CAKFGQGxuY3OdVvdd0TWUBE091H9j0BV7DUKaxba91YOYT88Z2A@mail.gmail.com>
References: <CA+rKq-qcoFXyYkRa7WLPNCRWvwdy6HfTLb_XZS-rVj7vKvoqtQ@mail.gmail.com>
 <CAKFGQGxuY3OdVvdd0TWUBE091H9j0BV7DUKaxba91YOYT88Z2A@mail.gmail.com>
Message-ID: <CA+rKq-oZ7zhSgUfP7mbfnEH0MuM+6Q0Spy3=EL6mN6K1qzK3nQ@mail.gmail.com>

Hey Joshua, thanks for the answer!

I was looking into your points and I would like to ask for your opinion on
them:

1) For the Numpy versions, I couldn't understand why that is a problem for
the development phase, since I could work on my fork and change the numpy
dependency to point directly at numpy's master, rather than to an already
released version, would that be acceptable?

2) I was looking for how that would work and found the _generate_pyx.py
responsible for generating some automated stubs related to C code, but
haven't gotten very far with the f2py file. My problem with it is that
since the fortran functions aren't exposed, I don't understand how I should
declare that my stubs are referencing the functions instantiated on the
.pyf, is there such a thing as a .pyfi ?

I am sorry if my question implies a lack of understanding of both typing
and the f2py mechanisms, I am fairly new to these topics. It's a shame I
couldn't find your branch to better explore my shortcomings! Is there a
chance you know where it is?

So... I will try my best to enrich my proposal, but chances are that I
won't be able to make it so much better than it already is, given my lack
of experience, and that understanding type interactions would be something
that I would need to pick upon before the coding period starts.

I need to be honest about the fact that I don't already have the expertise
to do the job, and I would like to know if it's expected to pick things up
as the project happens, or if for this job in particular it wouldn't be
possible to learn so much stuff on so little time (or something like that)

Looking forward to any inputs, I appreciate the advice very much!


Em qui., 1 de abr. de 2021 ?s 00:15, Joshua Wilson <
josh.craig.wilson at gmail.com> escreveu:

> Hey Gabriel,
>
> One thing to consider is that a large chunk of special is ufuncs, and
> better ufunc typing is going to need PRs like
>
> https://github.com/numpy/numpy/pull/18417
>
> which aren't yet in a released version of NumPy. So you'll want to
> make sure the timing works out there.
>
> You mention .pyf files in the doc, they are an interesting case
> because ideally we'd be able to auto-generate stubs for them. I even
> have a languishing branch somewhere that has a start on doing that...
> there are a few complications because the objects exported by the pyf
> extension module are actually instances of one class, so you'd need to
> either fudge the typing a bit or use Generic a lot.
>
> I'd recommend thinking through how to handle the above complications
> and discussing that in the doc.
>
> - Josh (also the person142 mentioned in the doc)
>
> On Wed, Mar 31, 2021 at 5:42 PM Gabriel Simonetto
> <gabrielfranksimonetto at gmail.com> wrote:
> >
> > Hi everyone!
> >
> > I finished making a rough draft of what I would like to do for this
> year's gsoc, could someone give me some tips on how to improve it?
> >
> > In particular I would like to know if this is a good module for the
> project or if it would be better allocated elsewhere, and how could I make
> my timeline a bit sharper.
> >
> > Here is the link for adding comments:
> https://docs.google.com/document/d/1d3NkbQC9rBcoKkuOmsx95wYDO_OMA5m1PBSu_oiwQVw/edit?usp=sharing
> >
> > Thanks!
> > Gabriel Simonetto
> > _______________________________________________
> > SciPy-Dev mailing list
> > SciPy-Dev at python.org
> > https://mail.python.org/mailman/listinfo/scipy-dev
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210403/b8d36cc7/attachment.html>

From tirthasheshpatel at gmail.com  Sat Apr  3 14:15:53 2021
From: tirthasheshpatel at gmail.com (Tirth Patel)
Date: Sat, 3 Apr 2021 23:45:53 +0530
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <CABXY2qAJt23pQASSMmLi8Ys=+q4SQmpD2hdxjTNZKaM4m=mjfw@mail.gmail.com>
References: <mailman.9.1617379201.2885.scipy-dev@python.org>
 <CABXY2qAJt23pQASSMmLi8Ys=+q4SQmpD2hdxjTNZKaM4m=mjfw@mail.gmail.com>
Message-ID: <CABpuv3-_WUUsWNn_mArqPMiNWr6n2QXGL=H_TXSVDmZaHYRB7w@mail.gmail.com>

Hi Christoph,

Thanks for the reply!

On 4/3/21, Christoph Baumgarten <christoph.baumgarten at gmail.com> wrote:
> Hi Tirth,
>
> great to hear that you are interested in the project! My main goal would be
> to add the "universal" rv generation methods to SciPy, e.g. PINV, TDR
> (UNU.RAN
> User Manual (wu.ac.at)
> <http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Methods_005ffor_005fCONT>).
> At the moment, we just have one such function in SciPy (Statistical
> functions (scipy.stats) ? SciPy v1.6.2 Reference Guide
> <https://docs.scipy.org/doc/scipy/reference/stats.html#random-variate-generation>)
> and it is very basic (I implemented it a while ago). Such functionality is
> very useful in many situations, see e.g. OverflowError when sampling from
> some handmade stats distributions ? Issue #13051 ? scipy/scipy (github.com)
> <https://github.com/scipy/scipy/issues/13051> So the API would rather be
> name_of_sampling_method(pdf / cdf, parameters of the sampling methods).
>

I think I get a general idea now. Thanks!

> Whether one should add a keyword to distribution.rvs(...) that allows the
> user to choose the sampling method might be a question for a follow-up
> project. This would also be quite time-consuming since you need to verify
> which method is appropriate for a given distribution. A simpler task could
> be to check if the rvs methods of a specific distribution could be
> overwritten with the corresponding method in UNU.RAN (UNU.RAN User Manual
> (wu.ac.at)
> <http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Stddist>).
> For example, geninvgauss in SciPy relies on a Python implementation of a
> rejection method / RoU and the implementation in UNU.RAN (gig / gig2) might
> be faster. Also distributions with slow ppf methods relying on special
> functions would be natural candidates. But that would also be of lower
> priority for me.
>
> I hope it helps. Feel free to reach out if you have more questions.
>
> Christoph
>
> On Fri, Apr 2, 2021 at 6:00 PM <scipy-dev-request at python.org> wrote:
>
>> Send SciPy-Dev mailing list submissions to
>>         scipy-dev at python.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>         https://mail.python.org/mailman/listinfo/scipy-dev
>> or, via email, send a message with subject or body 'help' to
>>         scipy-dev-request at python.org
>>
>> You can reach the person managing the list at
>>         scipy-dev-owner at python.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of SciPy-Dev digest..."
>>
>>
>> Today's Topics:
>>
>>    1. Multivariate non-central hypergeometric distributions
>>       (Wallenius' and Fisher's) (???? ?????????)
>>    2. GSoC: Integrate library UNU.RAN into scipy.stats (Tirth Patel)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Fri, 2 Apr 2021 00:05:10 +0200
>> From: ???? ????????? <samogot at gmail.com>
>> To: scipy-dev at python.org
>> Subject: [SciPy-Dev] Multivariate non-central hypergeometric
>>         distributions (Wallenius' and Fisher's)
>> Message-ID:
>>         <
>> CAMJZOa0xemJR7WCjcMcEyGUKvqMqPF6YDtLxbb+r2e4d1k-HDA at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hi everyone.
>>
>> Univariate versions of non-central hypergeometric distributions based
>> on Agner Fog's BiasedUrn C++ code were added recently (in
>> https://github.com/scipy/scipy/pull/13330). C++ code added in that PR
>> already contains the implementation of multivariate versions of the same
>> distributions. As far as I understand, the only things needed for
>> multivariate distributions to work are Python wrapper and probably some
>> tests.
>>
>> Is anyone interested in adding them? If not, I might get to it myself
>> later
>> this month, but as I haven't made any scipy contributions yet and am not
>> familiar with the codebase, I will need much more time to rump up than an
>> experienced contributor :)
>>
>> --
>> Regards,
>> Ivan Naydonov
>> -------------- next part --------------
>> An HTML attachment was scrubbed...
>> URL: <
>> https://mail.python.org/pipermail/scipy-dev/attachments/20210402/455ec86f/attachment-0001.html
>> >
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Fri, 2 Apr 2021 19:19:26 +0530
>> From: Tirth Patel <tirthasheshpatel at gmail.com>
>> To: scipy-dev <scipy-dev at python.org>
>> Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
>> Message-ID:
>>         <CABpuv38XtcJWOT6kskF_Rv3T=_
>> 0iSoNCVr7gtnupL0kGQixfWg at mail.gmail.com>
>> Content-Type: text/plain; charset="UTF-8"
>>
>> Hi all,
>>
>> I would like to participate in GSoC this year and found this project
>> very interesting!
>>
>> TL; DR: I have a few questions regarding the project:
>>   - Is the user interface desired as a separate python submodule
>> (inside `scipy.stats`) or does it serve as an extension of the `rvs`
>> method?
>>   - Should UNU.RAN C library be included as a submodule within SciPy
>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>> gh-13328)?
>>
>> About Me
>> ********
>> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
>> science undergrad student. I am quite familiar with Cython and a lot
>> of my college courses make use of C. I have a good knowledge of
>> probability theory and statistics.
>>
>> Open Source work: I have participated in GSoC with the PyMC team last
>> year. I am a contributor to SciPy since May 2020 and recently a
>> maintainer.
>>
>> About Project
>> *************
>> I had a question about the project. Is the user interface desired as a
>> separate python submodule inside `scipy.stats`? like:
>>
>>     import scipy.stats as stats
>>
>>     # sample a 1000 variates from a normal distribution
>>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
>>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
>>
>>     # sample 100 samples from the beta distribution using TDR method
>>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
>>
>>     # the `rvs` methods remains unaffected.
>>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
>>
>> Or does it serve as an extension of the `rvs` method:
>>
>>     from scipy.stats import norm, beta
>>
>>     # something like this:
>>     # method = None => same behaviour as previous versions
>>     # method = 'auto' => use UNU.RAN and let it choose the method
>>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
>>
>>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
>>
>> Also, should UNU.RAN C library be included as a submodule within SciPy
>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>> gh-13328)?
>>
>>
>> --
>> Kind Regards,
>> Tirth
>>
>>
>> ------------------------------
>>
>> Subject: Digest Footer
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
>>
>> ------------------------------
>>
>> End of SciPy-Dev Digest, Vol 210, Issue 2
>> *****************************************
>>
>


-- 
Kind Regards,
Tirth Patel

From h.klemm at gmx.de  Sun Apr  4 11:51:41 2021
From: h.klemm at gmx.de (Hanno Klemm)
Date: Sun, 4 Apr 2021 17:51:41 +0200
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <CABXY2qAJt23pQASSMmLi8Ys=+q4SQmpD2hdxjTNZKaM4m=mjfw@mail.gmail.com>
References: <CABXY2qAJt23pQASSMmLi8Ys=+q4SQmpD2hdxjTNZKaM4m=mjfw@mail.gmail.com>
Message-ID: <2F5C5A29-42C7-4E41-9AEA-70E5AA1E7476@gmx.de>

Hi Christoph, Tirth,

this sounds like an interesting project, however, when I look at the documentation of UNU.RAN, it seems to be licensed under the GPL. I always thought that GPL is incompatible with scipy?s license?

Kind regards,
Hanno

> On 2. Apr 2021, at 21:44, Christoph Baumgarten <christoph.baumgarten at gmail.com> wrote:
> 
> ?
> 
> Hi Tirth,
> 
> great to hear that you are interested in the project! My main goal would be to add the "universal" rv generation methods to SciPy, e.g. PINV, TDR (UNU.RAN User Manual (wu.ac.at)). At the moment, we just have one such function in SciPy (Statistical functions (scipy.stats) ? SciPy v1.6.2 Reference Guide) and it is very basic (I implemented it a while ago). Such functionality is very useful in many situations, see e.g. OverflowError when sampling from some handmade stats distributions ? Issue #13051 ? scipy/scipy (github.com) So the API would rather be name_of_sampling_method(pdf / cdf, parameters of the sampling methods).
> 
> Whether one should add a keyword to distribution.rvs(...) that allows the user to choose the sampling method might be a question for a follow-up project. This would also be quite time-consuming since you need to verify which method is appropriate for a given distribution. A simpler task could be to check if the rvs methods of a specific distribution could be overwritten with the corresponding method in UNU.RAN (UNU.RAN User Manual (wu.ac.at)). For example, geninvgauss in SciPy relies on a Python implementation of a rejection method / RoU and the implementation in UNU.RAN (gig / gig2) might be faster. Also distributions with slow ppf methods relying on special functions would be natural candidates. But that would also be of lower priority for me.
> 
> I hope it helps. Feel free to reach out if you have more questions.
> 
> Christoph
> 
>> On Fri, Apr 2, 2021 at 6:00 PM <scipy-dev-request at python.org> wrote:
>> Send SciPy-Dev mailing list submissions to
>>         scipy-dev at python.org
>> 
>> To subscribe or unsubscribe via the World Wide Web, visit
>>         https://mail.python.org/mailman/listinfo/scipy-dev
>> or, via email, send a message with subject or body 'help' to
>>         scipy-dev-request at python.org
>> 
>> You can reach the person managing the list at
>>         scipy-dev-owner at python.org
>> 
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of SciPy-Dev digest..."
>> 
>> 
>> Today's Topics:
>> 
>>    1. Multivariate non-central hypergeometric distributions
>>       (Wallenius' and Fisher's) (???? ?????????)
>>    2. GSoC: Integrate library UNU.RAN into scipy.stats (Tirth Patel)
>> 
>> 
>> ----------------------------------------------------------------------
>> 
>> Message: 1
>> Date: Fri, 2 Apr 2021 00:05:10 +0200
>> From: ???? ????????? <samogot at gmail.com>
>> To: scipy-dev at python.org
>> Subject: [SciPy-Dev] Multivariate non-central hypergeometric
>>         distributions (Wallenius' and Fisher's)
>> Message-ID:
>>         <CAMJZOa0xemJR7WCjcMcEyGUKvqMqPF6YDtLxbb+r2e4d1k-HDA at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>> 
>> Hi everyone.
>> 
>> Univariate versions of non-central hypergeometric distributions based
>> on Agner Fog's BiasedUrn C++ code were added recently (in
>> https://github.com/scipy/scipy/pull/13330). C++ code added in that PR
>> already contains the implementation of multivariate versions of the same
>> distributions. As far as I understand, the only things needed for
>> multivariate distributions to work are Python wrapper and probably some
>> tests.
>> 
>> Is anyone interested in adding them? If not, I might get to it myself later
>> this month, but as I haven't made any scipy contributions yet and am not
>> familiar with the codebase, I will need much more time to rump up than an
>> experienced contributor :)
>> 
>> --
>> Regards,
>> Ivan Naydonov
>> -------------- next part --------------
>> An HTML attachment was scrubbed...
>> URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210402/455ec86f/attachment-0001.html>
>> 
>> ------------------------------
>> 
>> Message: 2
>> Date: Fri, 2 Apr 2021 19:19:26 +0530
>> From: Tirth Patel <tirthasheshpatel at gmail.com>
>> To: scipy-dev <scipy-dev at python.org>
>> Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
>> Message-ID:
>>         <CABpuv38XtcJWOT6kskF_Rv3T=_0iSoNCVr7gtnupL0kGQixfWg at mail.gmail.com>
>> Content-Type: text/plain; charset="UTF-8"
>> 
>> Hi all,
>> 
>> I would like to participate in GSoC this year and found this project
>> very interesting!
>> 
>> TL; DR: I have a few questions regarding the project:
>>   - Is the user interface desired as a separate python submodule
>> (inside `scipy.stats`) or does it serve as an extension of the `rvs`
>> method?
>>   - Should UNU.RAN C library be included as a submodule within SciPy
>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>> gh-13328)?
>> 
>> About Me
>> ********
>> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
>> science undergrad student. I am quite familiar with Cython and a lot
>> of my college courses make use of C. I have a good knowledge of
>> probability theory and statistics.
>> 
>> Open Source work: I have participated in GSoC with the PyMC team last
>> year. I am a contributor to SciPy since May 2020 and recently a
>> maintainer.
>> 
>> About Project
>> *************
>> I had a question about the project. Is the user interface desired as a
>> separate python submodule inside `scipy.stats`? like:
>> 
>>     import scipy.stats as stats
>> 
>>     # sample a 1000 variates from a normal distribution
>>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
>>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
>> 
>>     # sample 100 samples from the beta distribution using TDR method
>>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
>> 
>>     # the `rvs` methods remains unaffected.
>>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
>> 
>> Or does it serve as an extension of the `rvs` method:
>> 
>>     from scipy.stats import norm, beta
>> 
>>     # something like this:
>>     # method = None => same behaviour as previous versions
>>     # method = 'auto' => use UNU.RAN and let it choose the method
>>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
>> 
>>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
>> 
>> Also, should UNU.RAN C library be included as a submodule within SciPy
>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>> gh-13328)?
>> 
>> 
>> --
>> Kind Regards,
>> Tirth
>> 
>> 
>> ------------------------------
>> 
>> Subject: Digest Footer
>> 
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>> 
>> 
>> ------------------------------
>> 
>> End of SciPy-Dev Digest, Vol 210, Issue 2
>> *****************************************
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210404/1c249fc2/attachment-0001.html>

From andrea.gavana at gmail.com  Sun Apr  4 11:54:22 2021
From: andrea.gavana at gmail.com (Andrea Gavana)
Date: Sun, 4 Apr 2021 17:54:22 +0200
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <2F5C5A29-42C7-4E41-9AEA-70E5AA1E7476@gmx.de>
References: <CABXY2qAJt23pQASSMmLi8Ys=+q4SQmpD2hdxjTNZKaM4m=mjfw@mail.gmail.com>
 <2F5C5A29-42C7-4E41-9AEA-70E5AA1E7476@gmx.de>
Message-ID: <CAEf70byEr9gP38aJfw4zGt9qHfmct+LVAn0WgLakK_EkBcEGsw@mail.gmail.com>

Hi Hanno,

On Sun, 4 Apr 2021 at 17.52, Hanno Klemm <h.klemm at gmx.de> wrote:

> Hi Christoph, Tirth,
>
> this sounds like an interesting project, however, when I look at the
> documentation of UNU.RAN, it seems to be licensed under the GPL. I always
> thought that GPL is incompatible with scipy?s license?
>

As far as I have understood, the developers of UNU.RAN have opted for a
much more open license if the library gets embedded into SciPy. I think I
saw a message in the mailing list about that, but I can?t find it at the
moment.

Andrea.


> Kind regards,
> Hanno
>
> On 2. Apr 2021, at 21:44, Christoph Baumgarten <
> christoph.baumgarten at gmail.com> wrote:
>
> ?
>
> Hi Tirth,
>
> great to hear that you are interested in the project! My main goal would
> be to add the "universal" rv generation methods to SciPy, e.g. PINV, TDR (UNU.RAN
> User Manual (wu.ac.at)
> <http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Methods_005ffor_005fCONT>).
> At the moment, we just have one such function in SciPy (Statistical
> functions (scipy.stats) ? SciPy v1.6.2 Reference Guide
> <https://docs.scipy.org/doc/scipy/reference/stats.html#random-variate-generation>)
> and it is very basic (I implemented it a while ago). Such functionality is
> very useful in many situations, see e.g. OverflowError when sampling from
> some handmade stats distributions ? Issue #13051 ? scipy/scipy (github.com)
> <https://github.com/scipy/scipy/issues/13051> So the API would rather be
> name_of_sampling_method(pdf / cdf, parameters of the sampling methods).
>
> Whether one should add a keyword to distribution.rvs(...) that allows the
> user to choose the sampling method might be a question for a follow-up
> project. This would also be quite time-consuming since you need to verify
> which method is appropriate for a given distribution. A simpler task could
> be to check if the rvs methods of a specific distribution could be
> overwritten with the corresponding method in UNU.RAN (UNU.RAN User Manual
> (wu.ac.at)
> <http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Stddist>).
> For example, geninvgauss in SciPy relies on a Python implementation of a
> rejection method / RoU and the implementation in UNU.RAN (gig / gig2) might
> be faster. Also distributions with slow ppf methods relying on special
> functions would be natural candidates. But that would also be of lower
> priority for me.
>
> I hope it helps. Feel free to reach out if you have more questions.
>
> Christoph
>
> On Fri, Apr 2, 2021 at 6:00 PM <scipy-dev-request at python.org> wrote:
>
>> Send SciPy-Dev mailing list submissions to
>>         scipy-dev at python.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>         https://mail.python.org/mailman/listinfo/scipy-dev
>> or, via email, send a message with subject or body 'help' to
>>         scipy-dev-request at python.org
>>
>> You can reach the person managing the list at
>>         scipy-dev-owner at python.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of SciPy-Dev digest..."
>>
>>
>> Today's Topics:
>>
>>    1. Multivariate non-central hypergeometric distributions
>>       (Wallenius' and Fisher's) (???? ?????????)
>>    2. GSoC: Integrate library UNU.RAN into scipy.stats (Tirth Patel)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Fri, 2 Apr 2021 00:05:10 +0200
>> From: ???? ????????? <samogot at gmail.com>
>> To: scipy-dev at python.org
>> Subject: [SciPy-Dev] Multivariate non-central hypergeometric
>>         distributions (Wallenius' and Fisher's)
>> Message-ID:
>>         <
>> CAMJZOa0xemJR7WCjcMcEyGUKvqMqPF6YDtLxbb+r2e4d1k-HDA at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hi everyone.
>>
>> Univariate versions of non-central hypergeometric distributions based
>> on Agner Fog's BiasedUrn C++ code were added recently (in
>> https://github.com/scipy/scipy/pull/13330). C++ code added in that PR
>> already contains the implementation of multivariate versions of the same
>> distributions. As far as I understand, the only things needed for
>> multivariate distributions to work are Python wrapper and probably some
>> tests.
>>
>> Is anyone interested in adding them? If not, I might get to it myself
>> later
>> this month, but as I haven't made any scipy contributions yet and am not
>> familiar with the codebase, I will need much more time to rump up than an
>> experienced contributor :)
>>
>> --
>> Regards,
>> Ivan Naydonov
>> -------------- next part --------------
>> An HTML attachment was scrubbed...
>> URL: <
>> https://mail.python.org/pipermail/scipy-dev/attachments/20210402/455ec86f/attachment-0001.html
>> >
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Fri, 2 Apr 2021 19:19:26 +0530
>> From: Tirth Patel <tirthasheshpatel at gmail.com>
>> To: scipy-dev <scipy-dev at python.org>
>> Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
>> Message-ID:
>>         <CABpuv38XtcJWOT6kskF_Rv3T=_
>> 0iSoNCVr7gtnupL0kGQixfWg at mail.gmail.com>
>> Content-Type: text/plain; charset="UTF-8"
>>
>> Hi all,
>>
>> I would like to participate in GSoC this year and found this project
>> very interesting!
>>
>> TL; DR: I have a few questions regarding the project:
>>   - Is the user interface desired as a separate python submodule
>> (inside `scipy.stats`) or does it serve as an extension of the `rvs`
>> method?
>>   - Should UNU.RAN C library be included as a submodule within SciPy
>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>> gh-13328)?
>>
>> About Me
>> ********
>> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
>> science undergrad student. I am quite familiar with Cython and a lot
>> of my college courses make use of C. I have a good knowledge of
>> probability theory and statistics.
>>
>> Open Source work: I have participated in GSoC with the PyMC team last
>> year. I am a contributor to SciPy since May 2020 and recently a
>> maintainer.
>>
>> About Project
>> *************
>> I had a question about the project. Is the user interface desired as a
>> separate python submodule inside `scipy.stats`? like:
>>
>>     import scipy.stats as stats
>>
>>     # sample a 1000 variates from a normal distribution
>>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
>>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
>>
>>     # sample 100 samples from the beta distribution using TDR method
>>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
>>
>>     # the `rvs` methods remains unaffected.
>>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
>>
>> Or does it serve as an extension of the `rvs` method:
>>
>>     from scipy.stats import norm, beta
>>
>>     # something like this:
>>     # method = None => same behaviour as previous versions
>>     # method = 'auto' => use UNU.RAN and let it choose the method
>>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
>>
>>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
>>
>> Also, should UNU.RAN C library be included as a submodule within SciPy
>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>> gh-13328)?
>>
>>
>> --
>> Kind Regards,
>> Tirth
>>
>>
>> ------------------------------
>>
>> Subject: Digest Footer
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
>>
>> ------------------------------
>>
>> End of SciPy-Dev Digest, Vol 210, Issue 2
>> *****************************************
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210404/940409c0/attachment.html>

From robert.kern at gmail.com  Sun Apr  4 12:28:21 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 4 Apr 2021 12:28:21 -0400
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <CAEf70byEr9gP38aJfw4zGt9qHfmct+LVAn0WgLakK_EkBcEGsw@mail.gmail.com>
References: <CABXY2qAJt23pQASSMmLi8Ys=+q4SQmpD2hdxjTNZKaM4m=mjfw@mail.gmail.com>
 <2F5C5A29-42C7-4E41-9AEA-70E5AA1E7476@gmx.de>
 <CAEf70byEr9gP38aJfw4zGt9qHfmct+LVAn0WgLakK_EkBcEGsw@mail.gmail.com>
Message-ID: <CAF6FJisjCny42j_kNbTqvOLshmEfb4+FoVgLQ1G1c9CO3rtoOQ@mail.gmail.com>

On Sun, Apr 4, 2021 at 11:55 AM Andrea Gavana <andrea.gavana at gmail.com>
wrote:

> Hi Hanno,
>
> On Sun, 4 Apr 2021 at 17.52, Hanno Klemm <h.klemm at gmx.de> wrote:
>
>> Hi Christoph, Tirth,
>>
>> this sounds like an interesting project, however, when I look at the
>> documentation of UNU.RAN, it seems to be licensed under the GPL. I always
>> thought that GPL is incompatible with scipy?s license?
>>
>
> As far as I have understood, the developers of UNU.RAN have opted for a
> much more open license if the library gets embedded into SciPy. I think I
> saw a message in the mailing list about that, but I can?t find it at the
> moment.
>

https://mail.python.org/pipermail/scipy-dev/2021-March/024641.html

Yes, we have been given permission to use UNU.RAN under a BSD license.
There is at least one file that was contributed by a third party in
`src/uniform/` to supply different core uniform PRNGs, and that's still
under the GPL. But we can ignore everything in that directory since we'll
want to be using numpy `BitGenerator`s instead.

-- 
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210404/b3c765bf/attachment.html>

From PeterBell10 at live.co.uk  Sun Apr  4 17:03:01 2021
From: PeterBell10 at live.co.uk (Peter Bell)
Date: Sun, 4 Apr 2021 21:03:01 +0000
Subject: [SciPy-Dev] Proposal to deprecate squeeze on distance metric inputs
Message-ID: <AM4PR0501MB2273A7BC24AE189BE59BAE758E789@AM4PR0501MB2273.eurprd05.prod.outlook.com>

Currently, all the metrics in scipy.spatial.distance? call _validate_vector which internally does

input = np.atleast_1d(input.squeeze())

This means the input can have any number of extra length-1 dimensions which are ignored and not propagated to the output.  Shapes that shouldn't be compatible (e.g. (1, 3) and (3, 1) array) also match up as if they were 1d, instead of broadcasting like normal NumPy operations.

In order to extend the distance metrics with n-dimensional array support, we either break broadcasting or remove this behavior.
So, in PR #13774<https://github.com/scipy/scipy/pull/13774> I'm proposing to deprecate the squeeze and require the input actually be 1-dimensional.

-Peter

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210404/9f915957/attachment.html>

From christoph.baumgarten at gmail.com  Mon Apr  5 10:42:18 2021
From: christoph.baumgarten at gmail.com (Christoph Baumgarten)
Date: Mon, 5 Apr 2021 16:42:18 +0200
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <mailman.5.1617552001.27990.scipy-dev@python.org>
References: <mailman.5.1617552001.27990.scipy-dev@python.org>
Message-ID: <CABXY2qAfoViRZ0iqAgrsi1JQy_cki4KE24_Zuidagg_eVQMW0g@mail.gmail.com>

Hi,

regarding the licence of UNU.RAN: I have obtained the permission from the
authors of UNU.RAN to include their package into SciPy under BSD. The only
part that we cannot use is a uniform random number generator in
src/uniform/mrg31k3p.c (Combined multiple recursive generator by Pierre
L'Ecuyer and Renee Touzin.)

Christoph

On Sun, Apr 4, 2021 at 6:00 PM <scipy-dev-request at python.org> wrote:

> Send SciPy-Dev mailing list submissions to
>         scipy-dev at python.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://mail.python.org/mailman/listinfo/scipy-dev
> or, via email, send a message with subject or body 'help' to
>         scipy-dev-request at python.org
>
> You can reach the person managing the list at
>         scipy-dev-owner at python.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of SciPy-Dev digest..."
>
>
> Today's Topics:
>
>    1. Re: GSoC: Integrate library UNU.RAN into scipy.stats
>       (Andrea Gavana)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sun, 4 Apr 2021 17:54:22 +0200
> From: Andrea Gavana <andrea.gavana at gmail.com>
> To: SciPy Developers List <scipy-dev at python.org>
> Subject: Re: [SciPy-Dev] GSoC: Integrate library UNU.RAN into
>         scipy.stats
> Message-ID:
>         <
> CAEf70byEr9gP38aJfw4zGt9qHfmct+LVAn0WgLakK_EkBcEGsw at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Hanno,
>
> On Sun, 4 Apr 2021 at 17.52, Hanno Klemm <h.klemm at gmx.de> wrote:
>
> > Hi Christoph, Tirth,
> >
> > this sounds like an interesting project, however, when I look at the
> > documentation of UNU.RAN, it seems to be licensed under the GPL. I always
> > thought that GPL is incompatible with scipy?s license?
> >
>
> As far as I have understood, the developers of UNU.RAN have opted for a
> much more open license if the library gets embedded into SciPy. I think I
> saw a message in the mailing list about that, but I can?t find it at the
> moment.
>
> Andrea.
>
>
>
> > Kind regards,
> > Hanno
> >
> > On 2. Apr 2021, at 21:44, Christoph Baumgarten <
> > christoph.baumgarten at gmail.com> wrote:
> >
> > ?
> >
> > Hi Tirth,
> >
> > great to hear that you are interested in the project! My main goal would
> > be to add the "universal" rv generation methods to SciPy, e.g. PINV, TDR
> (UNU.RAN
> > User Manual (wu.ac.at)
> > <
> http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Methods_005ffor_005fCONT
> >).
> > At the moment, we just have one such function in SciPy (Statistical
> > functions (scipy.stats) ? SciPy v1.6.2 Reference Guide
> > <
> https://docs.scipy.org/doc/scipy/reference/stats.html#random-variate-generation
> >)
> > and it is very basic (I implemented it a while ago). Such functionality
> is
> > very useful in many situations, see e.g. OverflowError when sampling from
> > some handmade stats distributions ? Issue #13051 ? scipy/scipy (
> github.com)
> > <https://github.com/scipy/scipy/issues/13051> So the API would rather be
> > name_of_sampling_method(pdf / cdf, parameters of the sampling methods).
> >
> > Whether one should add a keyword to distribution.rvs(...) that allows the
> > user to choose the sampling method might be a question for a follow-up
> > project. This would also be quite time-consuming since you need to verify
> > which method is appropriate for a given distribution. A simpler task
> could
> > be to check if the rvs methods of a specific distribution could be
> > overwritten with the corresponding method in UNU.RAN (UNU.RAN User Manual
> > (wu.ac.at)
> > <http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Stddist>).
> > For example, geninvgauss in SciPy relies on a Python implementation of a
> > rejection method / RoU and the implementation in UNU.RAN (gig / gig2)
> might
> > be faster. Also distributions with slow ppf methods relying on special
> > functions would be natural candidates. But that would also be of lower
> > priority for me.
> >
> > I hope it helps. Feel free to reach out if you have more questions.
> >
> > Christoph
> >
> > On Fri, Apr 2, 2021 at 6:00 PM <scipy-dev-request at python.org> wrote:
> >
> >> Send SciPy-Dev mailing list submissions to
> >>         scipy-dev at python.org
> >>
> >> To subscribe or unsubscribe via the World Wide Web, visit
> >>         https://mail.python.org/mailman/listinfo/scipy-dev
> >> or, via email, send a message with subject or body 'help' to
> >>         scipy-dev-request at python.org
> >>
> >> You can reach the person managing the list at
> >>         scipy-dev-owner at python.org
> >>
> >> When replying, please edit your Subject line so it is more specific
> >> than "Re: Contents of SciPy-Dev digest..."
> >>
> >>
> >> Today's Topics:
> >>
> >>    1. Multivariate non-central hypergeometric distributions
> >>       (Wallenius' and Fisher's) (???? ?????????)
> >>    2. GSoC: Integrate library UNU.RAN into scipy.stats (Tirth Patel)
> >>
> >>
> >> ----------------------------------------------------------------------
> >>
> >> Message: 1
> >> Date: Fri, 2 Apr 2021 00:05:10 +0200
> >> From: ???? ????????? <samogot at gmail.com>
> >> To: scipy-dev at python.org
> >> Subject: [SciPy-Dev] Multivariate non-central hypergeometric
> >>         distributions (Wallenius' and Fisher's)
> >> Message-ID:
> >>         <
> >> CAMJZOa0xemJR7WCjcMcEyGUKvqMqPF6YDtLxbb+r2e4d1k-HDA at mail.gmail.com>
> >> Content-Type: text/plain; charset="utf-8"
> >>
> >> Hi everyone.
> >>
> >> Univariate versions of non-central hypergeometric distributions based
> >> on Agner Fog's BiasedUrn C++ code were added recently (in
> >> https://github.com/scipy/scipy/pull/13330). C++ code added in that PR
> >> already contains the implementation of multivariate versions of the same
> >> distributions. As far as I understand, the only things needed for
> >> multivariate distributions to work are Python wrapper and probably some
> >> tests.
> >>
> >> Is anyone interested in adding them? If not, I might get to it myself
> >> later
> >> this month, but as I haven't made any scipy contributions yet and am not
> >> familiar with the codebase, I will need much more time to rump up than
> an
> >> experienced contributor :)
> >>
> >> --
> >> Regards,
> >> Ivan Naydonov
> >> -------------- next part --------------
> >> An HTML attachment was scrubbed...
> >> URL: <
> >>
> https://mail.python.org/pipermail/scipy-dev/attachments/20210402/455ec86f/attachment-0001.html
> >> >
> >>
> >> ------------------------------
> >>
> >> Message: 2
> >> Date: Fri, 2 Apr 2021 19:19:26 +0530
> >> From: Tirth Patel <tirthasheshpatel at gmail.com>
> >> To: scipy-dev <scipy-dev at python.org>
> >> Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
> >> Message-ID:
> >>         <CABpuv38XtcJWOT6kskF_Rv3T=_
> >> 0iSoNCVr7gtnupL0kGQixfWg at mail.gmail.com>
> >> Content-Type: text/plain; charset="UTF-8"
> >>
> >> Hi all,
> >>
> >> I would like to participate in GSoC this year and found this project
> >> very interesting!
> >>
> >> TL; DR: I have a few questions regarding the project:
> >>   - Is the user interface desired as a separate python submodule
> >> (inside `scipy.stats`) or does it serve as an extension of the `rvs`
> >> method?
> >>   - Should UNU.RAN C library be included as a submodule within SciPy
> >> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
> >> gh-13328)?
> >>
> >> About Me
> >> ********
> >> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
> >> science undergrad student. I am quite familiar with Cython and a lot
> >> of my college courses make use of C. I have a good knowledge of
> >> probability theory and statistics.
> >>
> >> Open Source work: I have participated in GSoC with the PyMC team last
> >> year. I am a contributor to SciPy since May 2020 and recently a
> >> maintainer.
> >>
> >> About Project
> >> *************
> >> I had a question about the project. Is the user interface desired as a
> >> separate python submodule inside `scipy.stats`? like:
> >>
> >>     import scipy.stats as stats
> >>
> >>     # sample a 1000 variates from a normal distribution
> >>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
> >>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
> >>
> >>     # sample 100 samples from the beta distribution using TDR method
> >>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
> >>
> >>     # the `rvs` methods remains unaffected.
> >>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
> >>
> >> Or does it serve as an extension of the `rvs` method:
> >>
> >>     from scipy.stats import norm, beta
> >>
> >>     # something like this:
> >>     # method = None => same behaviour as previous versions
> >>     # method = 'auto' => use UNU.RAN and let it choose the method
> >>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
> >>
> >>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
> >>
> >> Also, should UNU.RAN C library be included as a submodule within SciPy
> >> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
> >> gh-13328)?
> >>
> >>
> >> --
> >> Kind Regards,
> >> Tirth
> >>
> >>
> >> ------------------------------
> >>
> >> Subject: Digest Footer
> >>
> >> _______________________________________________
> >> SciPy-Dev mailing list
> >> SciPy-Dev at python.org
> >> https://mail.python.org/mailman/listinfo/scipy-dev
> >>
> >>
> >> ------------------------------
> >>
> >> End of SciPy-Dev Digest, Vol 210, Issue 2
> >> *****************************************
> >>
> > _______________________________________________
> > SciPy-Dev mailing list
> > SciPy-Dev at python.org
> > https://mail.python.org/mailman/listinfo/scipy-dev
> >
> > _______________________________________________
> > SciPy-Dev mailing list
> > SciPy-Dev at python.org
> > https://mail.python.org/mailman/listinfo/scipy-dev
> >
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <
> https://mail.python.org/pipermail/scipy-dev/attachments/20210404/940409c0/attachment-0001.html
> >
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
>
> ------------------------------
>
> End of SciPy-Dev Digest, Vol 210, Issue 5
> *****************************************
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210405/9b59b435/attachment-0001.html>

From ralf.gommers at gmail.com  Mon Apr  5 12:44:26 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 5 Apr 2021 18:44:26 +0200
Subject: [SciPy-Dev] Using more functionalities from GitHub?
In-Reply-To: <65EA20D3-70D3-4B9E-A010-3C9FA78B6A5A@gmail.com>
References: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com>
 <CABL7CQgacWj5+5H2C-VpWMSJxZ1LTSkbM9btHDyexqcJ20a1Gw@mail.gmail.com>
 <65EA20D3-70D3-4B9E-A010-3C9FA78B6A5A@gmail.com>
Message-ID: <CABL7CQg4f99O61gC0W4HvUH=o_wE4JDhOwb9fF879QpMwPiSMw@mail.gmail.com>

On Sun, Feb 28, 2021 at 8:48 PM Pamphile Roy <roy.pamphile at gmail.com> wrote:

>
>
> On 27.02.2021, at 17:25, Ralf Gommers <ralf.gommers at gmail.com> wrote:
>
>
> The most related GitHub feature is CODEOWNERS:
> https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/about-code-owners.
> It can be used to automatically request PR reviews from individuals or from
> a team.
>
> So there's at least three approaches:
> 1. Teams per submodule and other area
> 2. A bot to subscribe to labels
> 3. Using CODEOWNERS
>
> The trouble with (1) is that it's a lot of overhead managing teams in the
> GitHub UI, and only people with owner/maintainer status can do it.
>
> My preference is (3) I think: it solves both problems to some extent, it's
> the most granular (you can get notifications for individual files as well
> as blog patterns), and it's a plain file in the repo that everyone can
> propose changes to via a PR. For pinging people outside of PRs, we can use
> the same file as documentation (just look at it, find the submodule/file of
> interest, and see who is subscribed to it to @-mention them). Should we try
> that?
>
>
> Thanks for pointing out CODEOWNERS, I didn?t know about this! I agree with
> you, this looks like a good idea and would bring more value than just
> grouping members by tags.
> This would also not prevent to also have, if needed, some grouping for
> convenience, like maintainer team, review team, build team, etc.
>
> So, I vote in favor of this solution.
>

It took a while, but I followed up on using CODEOWNERS in
https://github.com/scipy/scipy/pull/13810

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210405/917c317d/attachment.html>

From ralf.gommers at gmail.com  Tue Apr  6 06:11:01 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 6 Apr 2021 12:11:01 +0200
Subject: [SciPy-Dev] Proposal to deprecate squeeze on distance metric
 inputs
In-Reply-To: <AM4PR0501MB2273A7BC24AE189BE59BAE758E789@AM4PR0501MB2273.eurprd05.prod.outlook.com>
References: <AM4PR0501MB2273A7BC24AE189BE59BAE758E789@AM4PR0501MB2273.eurprd05.prod.outlook.com>
Message-ID: <CABL7CQhjZ7D10rLvk21sLvRk9BBO953xC5t78G+PjuCuQDhqhw@mail.gmail.com>

On Sun, Apr 4, 2021 at 11:05 PM Peter Bell <PeterBell10 at live.co.uk> wrote:

> Currently, all the metrics in scipy.spatial.distance call _validate_vector
> which internally does
>
> input = np.atleast_1d(input.squeeze())
>
> This means the input can have any number of extra length-1 dimensions
> which are ignored and not propagated to the output.  Shapes that shouldn't
> be compatible (e.g. (1, 3) and (3, 1) array) also match up as if they were
> 1d, instead of broadcasting like normal NumPy operations.
>
> In order to extend the distance metrics with n-dimensional array support,
> we either break broadcasting or remove this behavior.
> So, in PR #13774 <https://github.com/scipy/scipy/pull/13774> I'm
> proposing to deprecate the squeeze and require the input actually be
> 1-dimensional.
>

Thanks Peter. I agree. This is long-standing behavior, but it's quite weird
and arguably a bug (or at least a serious usability/consistency issue).

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210406/53721a51/attachment.html>

From roy.pamphile at gmail.com  Tue Apr  6 06:55:27 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Tue, 6 Apr 2021 12:55:27 +0200
Subject: [SciPy-Dev] pydata-sphinx-theme
In-Reply-To: <4707F08B-48F0-4953-87E1-5AEEC731EE4B@gmail.com>
References: <59A8480E-2640-471E-BE8E-DE2FADC53A36@gmail.com>
 <4707F08B-48F0-4953-87E1-5AEEC731EE4B@gmail.com>
Message-ID: <8C04F28E-7739-4C14-9BE6-F660AF214350@gmail.com>

Hi everyone,

The PR is merged! You can see the new doc at https://scipy.github.io/devdocs/index.html <https://scipy.github.io/devdocs/index.html>

Before the public release, we still have time to modify things like the landing page, links, some descriptions, etc.
I opened a follow-up PR to talk about these things: https://github.com/scipy/scipy/pull/13814 <https://github.com/scipy/scipy/pull/13814>

Thanks in advance for your help.

Cheers,
Pamphile

> On 22.03.2021, at 14:03, Pamphile Roy <roy.pamphile at gmail.com> wrote:
> 
> Hi everyone,
> 
> I started implementing this over here: https://github.com/scipy/scipy/pull/13724
> 
> There are still a few issues, but it looks nice!
> 
> It?s quite a big change, so it would be great if we have more feedback.
> 
> Cheers,
> Pamphile
> 
> 
> 
>> On 21.03.2021, at 17:16, Pamphile Roy <roy.pamphile at gmail.com> wrote:
>> 
>> Hi everyone,
>> 
>> I was wondering if there was a plan about moving the doc to using PyData?s theme? (I could not find any issues or ref in mail, sorry if I missed something).
>> IMO it?s good if we have a UX which is close to NumPy and the other libraries of the stack. Although for the landing page, I would propose to have fewer things like Pandas.
>> 
>> I could propose to work on this, if this is something we want.
>> 
>> Cheers,
>> Pamphile
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210406/dfb9f228/attachment.html>

From andyfaff at gmail.com  Tue Apr  6 23:38:45 2021
From: andyfaff at gmail.com (Andrew Nelson)
Date: Wed, 7 Apr 2021 13:38:45 +1000
Subject: [SciPy-Dev] updating = 'plus'/'worst' keywords added to
 differential_evolution
Message-ID: <CAAbtOZc4SG+9A77OPg2VNzoVgZyoG1GJL1xGgKWRBeT1XuBJAA@mail.gmail.com>

I've been investigating an article by Tanabe. There they investigate two
approaches that can reduce the number of function evaluations with
differential evolution. I've been evaluating the approaches, and wanted to
float possible modifications with the list first.

Tanabe, Ryoji. Revisiting Population Models in Differential Evolution on a
Limited Budget of Evaluations." In International Conference on Parallel
Problem Solving from Nature, pp. 257-272. Springer, Cham, 2020.

An elink to the article is at
https://login.easychair.org/publications/preprint_download/MtB7.

With ``'worst'`` (algorithm 4), only the worst solution vectors, numbering
``floor(M * alpha)``, are updated per generation. From the article:
"Ali [1] demonstrated that a better individual is rarely replaced with its
trial vector. In other words, the better the individual xi (i ? {1, ...,
?}) is, the more difficult it is to generate ui such that f(ui) ? f(xi). In
contrast, a worse individual is frequently replaced with its trial vector.
Based on this observation, Ali proposed the worst improvement model that
allows only the ? worst individuals to generate their ? trial vectors. The
number of function evaluations could possibly be reduced by not generating
the remaining ? ? ? trial vectors that are unlikely to outperform their ? ?
? parent individuals."

With ``'plus'`` (algorithm 3), a random selection of population members,
numbering ``floor(M * alpha)``, are chosen to create a trial population.
Then the ``M`` best solution vectors from the union of the existing and
trial populations are chosen to go forward into the next iteration. From
the article:
"The elitist (? + ?) model is general in the field of evolutionary
algorithms, including genetic algorithm and ES. For each iteration, a set
of ? trial vectors Q = {u1 , ..., u? } are generated (lines 3?5 in
Algorithm 3). For each u, the target vector is randomly selected from the
population. Then, the best ? individuals in P ? Q survive to the next
iteration (line 6 in Algorithm 3). Unlike other evolutionary algorithms,
the (? + ?) model has not received much attention in the DE community. Only
a few previous studies (e.g., [37,39,49]) considered the (? + ?) model. As
pointed out in [37], the synchronous model may discard a trial vector that
performs worse than its parent but performs better than other individuals
in the population. The (? + ?) model addresses such an issue."

Both ``'worst'`` and ``'plus'`` can reduce the total number of function
evaluations required, but at the increased risk of not finding a global
minimum.

_____________________________________
Dr. Andrew Nelson


_____________________________________
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210407/34474b91/attachment.html>

From xingyuliu at g.harvard.edu  Thu Apr  8 11:10:00 2021
From: xingyuliu at g.harvard.edu (Xingyu Liu)
Date: Thu, 8 Apr 2021 15:10:00 +0000
Subject: [SciPy-Dev] GSoC proposal draft: Improve performance through the
 use of Pythran
Message-ID: <ME2PR01MB5331A44790BEBC5F5D66C7BBFA749@ME2PR01MB5331.ausprd01.prod.outlook.com>

Hi everyone:

I'm Xingyu Liu(@charlotte12l on GitHub), a first-year Data Science master student at Harvard University, and I?m proficient in Python and C++. I?m very interested in the project - Improve performance through the use of Pythran. I did many investigations and tries on this project these days (many thanks to Ralf for helping me iterating this proposal!), and here is my current proposal draft: https://docs.google.com/document/d/1nM7dYbmModiukQw-sSOVGz6t5S6HC0VVWucYadI_aMQ/edit?usp=sharing .

I?d appreciate it if you guys can give me some tips on it!

Thanks,
Xingyu

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210408/1d5e489f/attachment.html>

From davidmenhur at gmail.com  Thu Apr  8 12:56:17 2021
From: davidmenhur at gmail.com (=?UTF-8?Q?David_Men=C3=A9ndez_Hurtado?=)
Date: Thu, 8 Apr 2021 18:56:17 +0200
Subject: [SciPy-Dev] GSoC proposal draft: Improve performance through
 the use of Pythran
In-Reply-To: <ME2PR01MB5331A44790BEBC5F5D66C7BBFA749@ME2PR01MB5331.ausprd01.prod.outlook.com>
References: <ME2PR01MB5331A44790BEBC5F5D66C7BBFA749@ME2PR01MB5331.ausprd01.prod.outlook.com>
Message-ID: <CAJhcF=0uwafZk7T-hzVRx=4VO3Tp2CSsexWaSUhMS3CV_v3Eow@mail.gmail.com>

It is unfortunate that the example you choose turned up to be slower than
pure Python. If you could figure out what turns out to be slowing it down,
maybe pythran could fix it. But I realise this may take quite some time, so
it may have to wait until the project starts.

/David

On Thu, 8 Apr 2021, 5:10 pm Xingyu Liu, <xingyuliu at g.harvard.edu> wrote:

> Hi everyone:
>
>
>
> I'm Xingyu Liu(@charlotte12l on GitHub), a first-year Data Science master
> student at Harvard University, and I?m proficient in Python and C++. I?m
> very interested in the project - Improve performance through the use of
> Pythran. I did many investigations and tries on this project these days (many
> thanks to Ralf for helping me iterating this proposal!), and here is my
> current proposal draft:
> https://docs.google.com/document/d/1nM7dYbmModiukQw-sSOVGz6t5S6HC0VVWucYadI_aMQ/edit?usp=sharing
> .
>
>
>
> I?d appreciate it if you guys can give me some tips on it!
>
>
>
> Thanks,
>
> Xingyu
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210408/2df6e203/attachment.html>

From roy.pamphile at gmail.com  Thu Apr  8 14:50:39 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Thu, 8 Apr 2021 20:50:39 +0200
Subject: [SciPy-Dev] GSoC proposal draft: Improve performance through
 the use of Pythran
In-Reply-To: <ME2PR01MB5331A44790BEBC5F5D66C7BBFA749@ME2PR01MB5331.ausprd01.prod.outlook.com>
References: <ME2PR01MB5331A44790BEBC5F5D66C7BBFA749@ME2PR01MB5331.ausprd01.prod.outlook.com>
Message-ID: <FB816E77-6223-42C6-9F45-DD16F00D6563@gmail.com>

Hi Xingyu,

This looks like a great plan. If you manage to improve all the things you listed, it?s going to be of great value!

Good luck and I am looking forward to seeing your PRs :)

Cheers,
Pamphile

> On 08.04.2021, at 17:10, Xingyu Liu <xingyuliu at g.harvard.edu> wrote:
> 
> Hi everyone:
>  
> I'm Xingyu Liu(@charlotte12l on GitHub), a first-year Data Science master student at Harvard University, and I?m proficient in Python and C++. I?m very interested in the project - Improve performance through the use of Pythran. I did many investigations and tries on this project these days (many thanks to Ralf for helping me iterating this proposal!), and here is my current proposal draft:https://docs.google.com/document/d/1nM7dYbmModiukQw-sSOVGz6t5S6HC0VVWucYadI_aMQ/edit?usp=sharing <https://docs.google.com/document/d/1nM7dYbmModiukQw-sSOVGz6t5S6HC0VVWucYadI_aMQ/edit?usp=sharing> .
>  
> I?d appreciate it if you guys can give me some tips on it!
>  
> Thanks,
> Xingyu
>  
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org <mailto:SciPy-Dev at python.org>
> https://mail.python.org/mailman/listinfo/scipy-dev <https://mail.python.org/mailman/listinfo/scipy-dev>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210408/49289229/attachment.html>

From serge.guelton at telecom-bretagne.eu  Thu Apr  8 16:16:49 2021
From: serge.guelton at telecom-bretagne.eu (Serge Guelton)
Date: Thu, 8 Apr 2021 22:16:49 +0200
Subject: [SciPy-Dev] GSoC proposal draft: Improve performance through
 the use of Pythran
In-Reply-To: <CAJhcF=0uwafZk7T-hzVRx=4VO3Tp2CSsexWaSUhMS3CV_v3Eow@mail.gmail.com>
References: <ME2PR01MB5331A44790BEBC5F5D66C7BBFA749@ME2PR01MB5331.ausprd01.prod.outlook.com>
 <CAJhcF=0uwafZk7T-hzVRx=4VO3Tp2CSsexWaSUhMS3CV_v3Eow@mail.gmail.com>
Message-ID: <20210408201649.GA23005@sguelton.remote.csb>

On Thu, Apr 08, 2021 at 06:56:17PM +0200, David Men?ndez Hurtado wrote:
> It is unfortunate that the example you choose turned up to be slower than pure
> Python. If you could figure out what turns out to be slowing it down, maybe
> pythran could fix it. But I realise this may take quite some time, so it may
> have to wait until the project starts.

Xingyu Liu opened an issue in the pythran tracker about that performance issue[1]
I've started investigating, and his problem exhibits an interesting situation,
where bascially whe have

```
if a < b:
  x[b:] = x[:a]
```

when doing an array copy like this, numpy semantic prevents usage of a memcpy,
because they may overlap. However in that case, the guard contains information
to allow the memcpy. This is an optimization that could be implemented at Python
level, but that would require an extension of the range analysis of pythran, as
it doesn't work on symbolic value. Probably not within the scope of a GSoc
though ;-)


[1] https://github.com/serge-sans-paille/pythran/issues/1753
> /David?
> 
> On Thu, 8 Apr 2021, 5:10 pm Xingyu Liu, <[1]xingyuliu at g.harvard.edu> wrote:
> 
> 
>     Hi everyone:
> 
>     ?
> 
>     I'm Xingyu Liu(@charlotte12l on GitHub), a first-year Data Science master
>     student at Harvard University, and I?m proficient in Python and C++. I?m
>     very interested in the project - Improve performance through the use of
>     Pythran. I did many investigations and tries on this project these days 
>     (many thanks to Ralf for helping me iterating this proposal!), and here is
>     my current proposal draft: [2]https://docs.google.com/document/d/
>     1nM7dYbmModiukQw-sSOVGz6t5S6HC0VVWucYadI_aMQ/edit?usp=sharing .
> 
>     ?
> 
>     I?d appreciate it if you guys can give me some tips on it!
> 
>     ?
> 
>     Thanks,
> 
>     Xingyu
> 
>     ?
> 
>     _______________________________________________
>     SciPy-Dev mailing list
>     [3]SciPy-Dev at python.org
>     [4]https://mail.python.org/mailman/listinfo/scipy-dev
> 
> 
> References:
> 
> [1] mailto:xingyuliu at g.harvard.edu
> [2] https://docs.google.com/document/d/1nM7dYbmModiukQw-sSOVGz6t5S6HC0VVWucYadI_aMQ/edit?usp=sharing
> [3] mailto:SciPy-Dev at python.org
> [4] https://mail.python.org/mailman/listinfo/scipy-dev

> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev


From mmalenic1 at gmail.com  Thu Apr  8 20:53:14 2021
From: mmalenic1 at gmail.com (Marko Malenic)
Date: Fri, 9 Apr 2021 10:53:14 +1000
Subject: [SciPy-Dev] GSoC - Enquiry
Message-ID: <CAPyWmaF10e3qhq4Qio9M3Nad_t4d1P=FwZOZJ2cpUR+CDG2cRg@mail.gmail.com>

Hi all,

My name is Marko and I'm interested in contributing and joining the SciPy
Google Summer of Code project.

I'm aware it's quite late in the proposal application process, so I am
emailing to ask if it's worthwhile submitting a proposal?
If so, could I receive some pointers on where to start and if there are any
good beginner issues I could work for a PR?

Best,
Marko
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210409/1514a603/attachment.html>

From ralf.gommers at gmail.com  Fri Apr  9 06:06:00 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 9 Apr 2021 12:06:00 +0200
Subject: [SciPy-Dev] GSoC - Enquiry
In-Reply-To: <CAPyWmaF10e3qhq4Qio9M3Nad_t4d1P=FwZOZJ2cpUR+CDG2cRg@mail.gmail.com>
References: <CAPyWmaF10e3qhq4Qio9M3Nad_t4d1P=FwZOZJ2cpUR+CDG2cRg@mail.gmail.com>
Message-ID: <CABL7CQh=VFU4gjMui03zX573UyUmfW++6QMLA9-+0nk6cZgj=A@mail.gmail.com>

Hi Marko,

On Fri, Apr 9, 2021 at 2:53 AM Marko Malenic <mmalenic1 at gmail.com> wrote:

> Hi all,
>
> My name is Marko and I'm interested in contributing and joining the SciPy
> Google Summer of Code project.
>
> I'm aware it's quite late in the proposal application process, so I am
> emailing to ask if it's worthwhile submitting a proposal?
> If so, could I receive some pointers on where to start and if there are
> any good beginner issues I could work for a PR?
>

You have only four days left - if you work on it full-time it's possible to
get in a good application and meet the requirements, but it's a bit late to
get feedback and rework your proposal. It's not too late, but you haven't
helped your chances by leaving it till now.

There's a set of pointers in
https://github.com/scipy/scipy/wiki/GSoC-2021-project-ideas. And there's a
"good-first-issue" label:
https://github.com/scipy/scipy/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22

Cheers,
Ralf


> Best,
> Marko
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210409/cc61e3f8/attachment.html>

From xingyuliu at g.harvard.edu  Fri Apr  9 09:44:54 2021
From: xingyuliu at g.harvard.edu (Xingyu Liu)
Date: Fri, 9 Apr 2021 13:44:54 +0000
Subject: [SciPy-Dev] GSoC proposal draft: Improve performance through
 the use of Pythran
In-Reply-To: <20210408201649.GA23005@sguelton.remote.csb>
References: <ME2PR01MB5331A44790BEBC5F5D66C7BBFA749@ME2PR01MB5331.ausprd01.prod.outlook.com>
 <CAJhcF=0uwafZk7T-hzVRx=4VO3Tp2CSsexWaSUhMS3CV_v3Eow@mail.gmail.com>
 <20210408201649.GA23005@sguelton.remote.csb>
Message-ID: <ME2PR01MB53318CD2525276EC07BE2C12FA739@ME2PR01MB5331.ausprd01.prod.outlook.com>

Thanks for providing information about the array copy issue! I didn't notice that before ... I think optimizing this would be very useful, I can help test it if you implement it later :)

Thanks,
Xingyu

?On 2021/4/9, 4:23 AM, "SciPy-Dev on behalf of Serge Guelton" <scipy-dev-bounces+xingyuliu=g.harvard.edu at python.org on behalf of serge.guelton at telecom-bretagne.eu> wrote:

    On Thu, Apr 08, 2021 at 06:56:17PM +0200, David Men?ndez Hurtado wrote:
    > It is unfortunate that the example you choose turned up to be slower than pure
    > Python. If you could figure out what turns out to be slowing it down, maybe
    > pythran could fix it. But I realise this may take quite some time, so it may
    > have to wait until the project starts.

    Xingyu Liu opened an issue in the pythran tracker about that performance issue[1]
    I've started investigating, and his problem exhibits an interesting situation,
    where bascially whe have

    ```
    if a < b:
      x[b:] = x[:a]
    ```

    when doing an array copy like this, numpy semantic prevents usage of a memcpy,
    because they may overlap. However in that case, the guard contains information
    to allow the memcpy. This is an optimization that could be implemented at Python
    level, but that would require an extension of the range analysis of pythran, as
    it doesn't work on symbolic value. Probably not within the scope of a GSoc
    though ;-)


    [1] https://github.com/serge-sans-paille/pythran/issues/1753
    > /David 
    > 
    > On Thu, 8 Apr 2021, 5:10 pm Xingyu Liu, <[1]xingyuliu at g.harvard.edu> wrote:
    > 
    > 
    >     Hi everyone:
    > 
    >      
    > 
    >     I'm Xingyu Liu(@charlotte12l on GitHub), a first-year Data Science master
    >     student at Harvard University, and I?m proficient in Python and C++. I?m
    >     very interested in the project - Improve performance through the use of
    >     Pythran. I did many investigations and tries on this project these days 
    >     (many thanks to Ralf for helping me iterating this proposal!), and here is
    >     my current proposal draft: [2]https://docs.google.com/document/d/
    >     1nM7dYbmModiukQw-sSOVGz6t5S6HC0VVWucYadI_aMQ/edit?usp=sharing .
    > 
    >      
    > 
    >     I?d appreciate it if you guys can give me some tips on it!
    > 
    >      
    > 
    >     Thanks,
    > 
    >     Xingyu
    > 
    >      
    > 
    >     _______________________________________________
    >     SciPy-Dev mailing list
    >     [3]SciPy-Dev at python.org
    >     [4]https://mail.python.org/mailman/listinfo/scipy-dev
    > 
    > 
    > References:
    > 
    > [1] mailto:xingyuliu at g.harvard.edu
    > [2] https://docs.google.com/document/d/1nM7dYbmModiukQw-sSOVGz6t5S6HC0VVWucYadI_aMQ/edit?usp=sharing
    > [3] mailto:SciPy-Dev at python.org
    > [4] https://mail.python.org/mailman/listinfo/scipy-dev

    > _______________________________________________
    > SciPy-Dev mailing list
    > SciPy-Dev at python.org
    > https://mail.python.org/mailman/listinfo/scipy-dev

    _______________________________________________
    SciPy-Dev mailing list
    SciPy-Dev at python.org
    https://mail.python.org/mailman/listinfo/scipy-dev

From roy.pamphile at gmail.com  Fri Apr  9 13:41:58 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Fri, 9 Apr 2021 19:41:58 +0200
Subject: [SciPy-Dev] Sensitivity analysis module proposal
Message-ID: <37CE6815-2B29-448E-9F65-52273BF3AD14@gmail.com>

Hi everyone,

I would like to propose to add sensitivity analysis (SA/GSA) functions. Also called uncertainty quantification (UQ) or verification and validation (V&V) depending on the field.

The goal of these methods is primarily to answer a simple question: how is my function impacted by parameter changes?
So if we have F(x1, x2, ?, x_n). How F changes when we change the x_n. A very important answer it gives is: which of the variable contribute the most to the function?
It can do this qualitatively but also quantitatively and provide an ordering of variable importance.

A simple use case? If you have a model with 10 parameters and want to optimize the output, you can do a SA before to discard some parameters.
Also you can prioritize your efforts to improve the uncertainty you might have on some parameters.

SA is getting more and more attention as stakeholders are getting how paramount it is.
Here is one of the latest articles presenting it and explaining how essential this is to all scientific communities: https://www.sciencedirect.com/science/article/pii/S1364815220310112 <https://www.sciencedirect.com/science/article/pii/S1364815220310112>
(This article was authored by the most renown researcher in the field.)

The most famous indicator is Sobol? indices (same person but different than the Sobol? sequence) which is a variance based analysis.
We already have notes in the roadmap about ANOVA, and I think we could extend this to Sobol? indices, Moment independent methods (use the whole PDF (all moments), not just the variance, hence the name), and maybe Shapley indices. I could see this go into scipy.stats.sa/uq/uncertainty for instance, or as a new module.

I would be happy to read your thoughts. I have already some code for both Sobol? and moment independent methods.

Cheers,
Pamphile
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210409/4f38d216/attachment.html>

From robert.kern at gmail.com  Fri Apr  9 13:51:42 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 9 Apr 2021 13:51:42 -0400
Subject: [SciPy-Dev] Sensitivity analysis module proposal
In-Reply-To: <37CE6815-2B29-448E-9F65-52273BF3AD14@gmail.com>
References: <37CE6815-2B29-448E-9F65-52273BF3AD14@gmail.com>
Message-ID: <CAF6FJisMyEKWBf44=+GM4YwO+ShDy8QOkZXP1a06o_KxtVJ9Xg@mail.gmail.com>

On Fri, Apr 9, 2021 at 1:42 PM Pamphile Roy <roy.pamphile at gmail.com> wrote:

> Hi everyone,
>
> I would like to propose to add sensitivity analysis (SA/GSA) functions.
> Also called uncertainty quantification (UQ) or verification and validation
> (V&V) depending on the field.
>

SALib is actively developed. I recommend contributing there if there are
any gaps that you think need to be filled.

https://salib.readthedocs.io/en/latest/

-- 
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210409/362e948c/attachment.html>

From ilhanpolat at gmail.com  Sat Apr 10 11:42:24 2021
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Sat, 10 Apr 2021 17:42:24 +0200
Subject: [SciPy-Dev] Proposal to merge linalg.pinv and linalg.pinv2 and
 deprecate pinv2
In-Reply-To: <13cc48ae-2606-4e03-809a-b97b65738a59@www.fastmail.com>
References: <CAEBuzr-fPx6_b0KBd862hoE1cRqzXYzoFoYGYzRUTSHQ4Aqguw@mail.gmail.com>
 <CABL7CQhd+D4ms_eJo=s9hMtYY0m6VzTwTb-FwjUCg-hxF+br_g@mail.gmail.com>
 <CAF6FJisuaZL6=-yZYRHk4UFgb+f0Nx_ay6C39z=Pi6GgxZTJ5Q@mail.gmail.com>
 <CAF6FJiu2i+692_cCcR6BzLYCm3hvqbm_SYUW1FZD0k6tCQmidg@mail.gmail.com>
 <13cc48ae-2606-4e03-809a-b97b65738a59@www.fastmail.com>
Message-ID: <CAEBuzr9jHkqAJv8svBm2mEUkQJqhJB=SJKG9Ec0XkF2BYw45og@mail.gmail.com>

Thanks Stefan, this proves that it dates back to matlab pre6.5 days, for
some reason I vaguely remember using pinv2 on some non-python tool but
couldn't find anything online.

I attempted to fix the atol/rtol issue that apparently caused some
difficulties in over scikit-learn for quite some timesee [1], [2] reported
by @ogrisel and then deprecated pinv2. The PR is here
https://github.com/scipy/scipy/pull/13831

Feedback always welcome


[1] : https://github.com/scipy/scipy/issues/13704
[2] : https://github.com/scikit-learn/scikit-learn/pull/19646


On Mon, Mar 22, 2021 at 9:33 PM Stefan van der Walt <stefanv at berkeley.edu>
wrote:

> On Sun, Mar 21, 2021, at 16:17, Robert Kern wrote:
>
> On Sun, Mar 21, 2021 at 6:53 PM Robert Kern <robert.kern at gmail.com> wrote:
>
> On Sun, Mar 21, 2021 at 6:00 PM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>
> Do you happen to know the history of how we ended up with pinv2?
>
>
> I suspect that when `pinv2()` was added, the `lstsq()` call underlying
> `pinv()` was not SVD-based. The precise LAPACK driver has changed over the
> years. We might have started with the QR-based driver.
>
>
> It's going to be very hard to tell definitively because I think the
> history got lost in the SVN->git conversion due to some directory renames
> that happened in the early days. The pinv/pinv2 split seems to have been
> very early, though, so it may have dated from the original Multipack
> library (the source tarballs of which are also linkrotted away).
>
>
> I think you're right, this was pre-SVN.  I was looking at the following
> commit from https://github.com/scipy/scipy-svn
>
> commit c6ef539392f31bda0b56541a1c8fdd61a0c0e6eb (HEAD)
> Author: pearu <pearu at d6536bca-fef9-0310-8506-e4c0a848fbcf>
> Date:   Sun Apr 7 15:03:50 2002 +0000
>
>     Replacing linalg with linalg2: linalg->linalg/linalg1 and
> linalg2->linalg
>
>
> There, the linalg/basic.py file is added, and inside it both pinv and
> pinv2 already exist:
>
> def pinv(a, cond=-1):
>     """Compute generalized inverse of A using least-squares solver.
>     """
>     a = asarray(a)
>     t = a.typecode()
>     b = scipy.identity(a.shape[0],t)
>     return lstsq(a, b, cond=cond)[0]
>
> def pinv2(a, cond=-1):
>     """Compute the generalized inverse of A using svd.
>     """
>     a = asarray(a)
>     u, s, vh = decomp.svd(a)
>     m = u.shape[1]
>     n = vh.shape[0]
>     t = u.typecode()
>     if cond is -1 or cond is None:
>         cond = {0: feps*1e3, 1: eps*1e6}[_array_precision[t]]
>     cutoff = cond*scipy_base.maximum.reduce(s)
>     for i in range(min(n,m)):
>         if s[i] > cutoff:
>             s[i] = 1.0/s[i]
>         else:
>             s[i] = 0.0
>     return dot(tran(conj(vh)),tran(conj(u))*s[:,NewAxis])
>
>
> I have not been able to find a copy of multipack-0.7.tar.gz
>
> St?fan
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210410/f17c6d97/attachment-0001.html>

From josh.craig.wilson at gmail.com  Sat Apr 10 12:05:16 2021
From: josh.craig.wilson at gmail.com (Joshua Wilson)
Date: Sat, 10 Apr 2021 11:05:16 -0500
Subject: [SciPy-Dev] GSoC proposal draft: Add type hints on scipy.special
In-Reply-To: <CA+rKq-oZ7zhSgUfP7mbfnEH0MuM+6Q0Spy3=EL6mN6K1qzK3nQ@mail.gmail.com>
References: <CA+rKq-qcoFXyYkRa7WLPNCRWvwdy6HfTLb_XZS-rVj7vKvoqtQ@mail.gmail.com>
 <CAKFGQGxuY3OdVvdd0TWUBE091H9j0BV7DUKaxba91YOYT88Z2A@mail.gmail.com>
 <CA+rKq-oZ7zhSgUfP7mbfnEH0MuM+6Q0Spy3=EL6mN6K1qzK3nQ@mail.gmail.com>
Message-ID: <CAKFGQGwsWcY9_pSwsrqpDX8=sOYY58o6ikSb-SQ71zpVywbq_w@mail.gmail.com>

> I couldn't understand why that is a problem for the development phase, since I could work on my fork and change the numpy dependency to point directly at numpy's master, rather than to an already released version, would that be acceptable?

This is generally undesirable-developing software in isolation against
unreleased versions has a number of risks; e.g.

- The released version ends up being different
- It's harder to get feeback (so you risk building the wrong
thing)-commit early and often is the right approach
- It leads to one giant PR at the end-those are _really hard_ to
review, and usually everyone involved ends up frustrated (e.g. it
could take me months to find the time to review something large)

For a newer contributor who isn't as steeped in the overall direction
of scipy.special I'd be particularly worried.

> I don't understand how I should declare that my stubs are referencing the functions instantiated on the .pyf, is there such a thing as a .pyfi ?

You would want to generate stubs for the extension module that f2py
will generate. If we take special/specfun.pyf as an example you'd end
up with:

specfun.pyf <- input
specfunmodule.c <- generated extension module; when compiled gives a
module importable as "specfun"
specfun.pyi <- now we want to generate these stubs from the pyf


On Sat, Apr 3, 2021 at 11:33 AM Gabriel Simonetto
<gabrielfranksimonetto at gmail.com> wrote:
>
> Hey Joshua, thanks for the answer!
>
> I was looking into your points and I would like to ask for your opinion on them:
>
> 1) For the Numpy versions, I couldn't understand why that is a problem for the development phase, since I could work on my fork and change the numpy dependency to point directly at numpy's master, rather than to an already released version, would that be acceptable?
>
> 2) I was looking for how that would work and found the _generate_pyx.py responsible for generating some automated stubs related to C code, but haven't gotten very far with the f2py file. My problem with it is that since the fortran functions aren't exposed, I don't understand how I should declare that my stubs are referencing the functions instantiated on the .pyf, is there such a thing as a .pyfi ?
>
> I am sorry if my question implies a lack of understanding of both typing and the f2py mechanisms, I am fairly new to these topics. It's a shame I couldn't find your branch to better explore my shortcomings! Is there a chance you know where it is?
>
> So... I will try my best to enrich my proposal, but chances are that I won't be able to make it so much better than it already is, given my lack of experience, and that understanding type interactions would be something that I would need to pick upon before the coding period starts.
>
> I need to be honest about the fact that I don't already have the expertise to do the job, and I would like to know if it's expected to pick things up as the project happens, or if for this job in particular it wouldn't be possible to learn so much stuff on so little time (or something like that)
>
> Looking forward to any inputs, I appreciate the advice very much!
>
>
>
> Em qui., 1 de abr. de 2021 ?s 00:15, Joshua Wilson <josh.craig.wilson at gmail.com> escreveu:
>>
>> Hey Gabriel,
>>
>> One thing to consider is that a large chunk of special is ufuncs, and
>> better ufunc typing is going to need PRs like
>>
>> https://github.com/numpy/numpy/pull/18417
>>
>> which aren't yet in a released version of NumPy. So you'll want to
>> make sure the timing works out there.
>>
>> You mention .pyf files in the doc, they are an interesting case
>> because ideally we'd be able to auto-generate stubs for them. I even
>> have a languishing branch somewhere that has a start on doing that...
>> there are a few complications because the objects exported by the pyf
>> extension module are actually instances of one class, so you'd need to
>> either fudge the typing a bit or use Generic a lot.
>>
>> I'd recommend thinking through how to handle the above complications
>> and discussing that in the doc.
>>
>> - Josh (also the person142 mentioned in the doc)
>>
>> On Wed, Mar 31, 2021 at 5:42 PM Gabriel Simonetto
>> <gabrielfranksimonetto at gmail.com> wrote:
>> >
>> > Hi everyone!
>> >
>> > I finished making a rough draft of what I would like to do for this year's gsoc, could someone give me some tips on how to improve it?
>> >
>> > In particular I would like to know if this is a good module for the project or if it would be better allocated elsewhere, and how could I make my timeline a bit sharper.
>> >
>> > Here is the link for adding comments: https://docs.google.com/document/d/1d3NkbQC9rBcoKkuOmsx95wYDO_OMA5m1PBSu_oiwQVw/edit?usp=sharing
>> >
>> > Thanks!
>> > Gabriel Simonetto
>> > _______________________________________________
>> > SciPy-Dev mailing list
>> > SciPy-Dev at python.org
>> > https://mail.python.org/mailman/listinfo/scipy-dev
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev

From ndbecker2 at gmail.com  Sat Apr 10 15:56:05 2021
From: ndbecker2 at gmail.com (Neal Becker)
Date: Sat, 10 Apr 2021 15:56:05 -0400
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <CABXY2qAfoViRZ0iqAgrsi1JQy_cki4KE24_Zuidagg_eVQMW0g@mail.gmail.com>
References: <mailman.5.1617552001.27990.scipy-dev@python.org>
 <CABXY2qAfoViRZ0iqAgrsi1JQy_cki4KE24_Zuidagg_eVQMW0g@mail.gmail.com>
Message-ID: <CAG3t+pGc3HK+vra59a7bAqxQqShaed1fsy1TsaUp9kCZkVgsRw@mail.gmail.com>

One thing to consider, is unuran still maintained?  It's been a long
time since last release.

On Mon, Apr 5, 2021 at 10:42 AM Christoph Baumgarten
<christoph.baumgarten at gmail.com> wrote:
>
>
> Hi,
>
> regarding the licence of UNU.RAN: I have obtained the permission from the authors of UNU.RAN to include their package into SciPy under BSD. The only part that we cannot use is a uniform random number generator in src/uniform/mrg31k3p.c (Combined multiple recursive generator by Pierre L'Ecuyer and Renee Touzin.)
>
> Christoph
>
> On Sun, Apr 4, 2021 at 6:00 PM <scipy-dev-request at python.org> wrote:
>>
>> Send SciPy-Dev mailing list submissions to
>>         scipy-dev at python.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>         https://mail.python.org/mailman/listinfo/scipy-dev
>> or, via email, send a message with subject or body 'help' to
>>         scipy-dev-request at python.org
>>
>> You can reach the person managing the list at
>>         scipy-dev-owner at python.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of SciPy-Dev digest..."
>>
>>
>> Today's Topics:
>>
>>    1. Re: GSoC: Integrate library UNU.RAN into scipy.stats
>>       (Andrea Gavana)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Sun, 4 Apr 2021 17:54:22 +0200
>> From: Andrea Gavana <andrea.gavana at gmail.com>
>> To: SciPy Developers List <scipy-dev at python.org>
>> Subject: Re: [SciPy-Dev] GSoC: Integrate library UNU.RAN into
>>         scipy.stats
>> Message-ID:
>>         <CAEf70byEr9gP38aJfw4zGt9qHfmct+LVAn0WgLakK_EkBcEGsw at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hi Hanno,
>>
>> On Sun, 4 Apr 2021 at 17.52, Hanno Klemm <h.klemm at gmx.de> wrote:
>>
>> > Hi Christoph, Tirth,
>> >
>> > this sounds like an interesting project, however, when I look at the
>> > documentation of UNU.RAN, it seems to be licensed under the GPL. I always
>> > thought that GPL is incompatible with scipy?s license?
>> >
>>
>> As far as I have understood, the developers of UNU.RAN have opted for a
>> much more open license if the library gets embedded into SciPy. I think I
>> saw a message in the mailing list about that, but I can?t find it at the
>> moment.
>>
>> Andrea.
>>
>>
>>
>> > Kind regards,
>> > Hanno
>> >
>> > On 2. Apr 2021, at 21:44, Christoph Baumgarten <
>> > christoph.baumgarten at gmail.com> wrote:
>> >
>> > ?
>> >
>> > Hi Tirth,
>> >
>> > great to hear that you are interested in the project! My main goal would
>> > be to add the "universal" rv generation methods to SciPy, e.g. PINV, TDR (UNU.RAN
>> > User Manual (wu.ac.at)
>> > <http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Methods_005ffor_005fCONT>).
>> > At the moment, we just have one such function in SciPy (Statistical
>> > functions (scipy.stats) ? SciPy v1.6.2 Reference Guide
>> > <https://docs.scipy.org/doc/scipy/reference/stats.html#random-variate-generation>)
>> > and it is very basic (I implemented it a while ago). Such functionality is
>> > very useful in many situations, see e.g. OverflowError when sampling from
>> > some handmade stats distributions ? Issue #13051 ? scipy/scipy (github.com)
>> > <https://github.com/scipy/scipy/issues/13051> So the API would rather be
>> > name_of_sampling_method(pdf / cdf, parameters of the sampling methods).
>> >
>> > Whether one should add a keyword to distribution.rvs(...) that allows the
>> > user to choose the sampling method might be a question for a follow-up
>> > project. This would also be quite time-consuming since you need to verify
>> > which method is appropriate for a given distribution. A simpler task could
>> > be to check if the rvs methods of a specific distribution could be
>> > overwritten with the corresponding method in UNU.RAN (UNU.RAN User Manual
>> > (wu.ac.at)
>> > <http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Stddist>).
>> > For example, geninvgauss in SciPy relies on a Python implementation of a
>> > rejection method / RoU and the implementation in UNU.RAN (gig / gig2) might
>> > be faster. Also distributions with slow ppf methods relying on special
>> > functions would be natural candidates. But that would also be of lower
>> > priority for me.
>> >
>> > I hope it helps. Feel free to reach out if you have more questions.
>> >
>> > Christoph
>> >
>> > On Fri, Apr 2, 2021 at 6:00 PM <scipy-dev-request at python.org> wrote:
>> >
>> >> Send SciPy-Dev mailing list submissions to
>> >>         scipy-dev at python.org
>> >>
>> >> To subscribe or unsubscribe via the World Wide Web, visit
>> >>         https://mail.python.org/mailman/listinfo/scipy-dev
>> >> or, via email, send a message with subject or body 'help' to
>> >>         scipy-dev-request at python.org
>> >>
>> >> You can reach the person managing the list at
>> >>         scipy-dev-owner at python.org
>> >>
>> >> When replying, please edit your Subject line so it is more specific
>> >> than "Re: Contents of SciPy-Dev digest..."
>> >>
>> >>
>> >> Today's Topics:
>> >>
>> >>    1. Multivariate non-central hypergeometric distributions
>> >>       (Wallenius' and Fisher's) (???? ?????????)
>> >>    2. GSoC: Integrate library UNU.RAN into scipy.stats (Tirth Patel)
>> >>
>> >>
>> >> ----------------------------------------------------------------------
>> >>
>> >> Message: 1
>> >> Date: Fri, 2 Apr 2021 00:05:10 +0200
>> >> From: ???? ????????? <samogot at gmail.com>
>> >> To: scipy-dev at python.org
>> >> Subject: [SciPy-Dev] Multivariate non-central hypergeometric
>> >>         distributions (Wallenius' and Fisher's)
>> >> Message-ID:
>> >>         <
>> >> CAMJZOa0xemJR7WCjcMcEyGUKvqMqPF6YDtLxbb+r2e4d1k-HDA at mail.gmail.com>
>> >> Content-Type: text/plain; charset="utf-8"
>> >>
>> >> Hi everyone.
>> >>
>> >> Univariate versions of non-central hypergeometric distributions based
>> >> on Agner Fog's BiasedUrn C++ code were added recently (in
>> >> https://github.com/scipy/scipy/pull/13330). C++ code added in that PR
>> >> already contains the implementation of multivariate versions of the same
>> >> distributions. As far as I understand, the only things needed for
>> >> multivariate distributions to work are Python wrapper and probably some
>> >> tests.
>> >>
>> >> Is anyone interested in adding them? If not, I might get to it myself
>> >> later
>> >> this month, but as I haven't made any scipy contributions yet and am not
>> >> familiar with the codebase, I will need much more time to rump up than an
>> >> experienced contributor :)
>> >>
>> >> --
>> >> Regards,
>> >> Ivan Naydonov
>> >> -------------- next part --------------
>> >> An HTML attachment was scrubbed...
>> >> URL: <
>> >> https://mail.python.org/pipermail/scipy-dev/attachments/20210402/455ec86f/attachment-0001.html
>> >> >
>> >>
>> >> ------------------------------
>> >>
>> >> Message: 2
>> >> Date: Fri, 2 Apr 2021 19:19:26 +0530
>> >> From: Tirth Patel <tirthasheshpatel at gmail.com>
>> >> To: scipy-dev <scipy-dev at python.org>
>> >> Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
>> >> Message-ID:
>> >>         <CABpuv38XtcJWOT6kskF_Rv3T=_
>> >> 0iSoNCVr7gtnupL0kGQixfWg at mail.gmail.com>
>> >> Content-Type: text/plain; charset="UTF-8"
>> >>
>> >> Hi all,
>> >>
>> >> I would like to participate in GSoC this year and found this project
>> >> very interesting!
>> >>
>> >> TL; DR: I have a few questions regarding the project:
>> >>   - Is the user interface desired as a separate python submodule
>> >> (inside `scipy.stats`) or does it serve as an extension of the `rvs`
>> >> method?
>> >>   - Should UNU.RAN C library be included as a submodule within SciPy
>> >> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>> >> gh-13328)?
>> >>
>> >> About Me
>> >> ********
>> >> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
>> >> science undergrad student. I am quite familiar with Cython and a lot
>> >> of my college courses make use of C. I have a good knowledge of
>> >> probability theory and statistics.
>> >>
>> >> Open Source work: I have participated in GSoC with the PyMC team last
>> >> year. I am a contributor to SciPy since May 2020 and recently a
>> >> maintainer.
>> >>
>> >> About Project
>> >> *************
>> >> I had a question about the project. Is the user interface desired as a
>> >> separate python submodule inside `scipy.stats`? like:
>> >>
>> >>     import scipy.stats as stats
>> >>
>> >>     # sample a 1000 variates from a normal distribution
>> >>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
>> >>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
>> >>
>> >>     # sample 100 samples from the beta distribution using TDR method
>> >>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
>> >>
>> >>     # the `rvs` methods remains unaffected.
>> >>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
>> >>
>> >> Or does it serve as an extension of the `rvs` method:
>> >>
>> >>     from scipy.stats import norm, beta
>> >>
>> >>     # something like this:
>> >>     # method = None => same behaviour as previous versions
>> >>     # method = 'auto' => use UNU.RAN and let it choose the method
>> >>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
>> >>
>> >>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
>> >>
>> >> Also, should UNU.RAN C library be included as a submodule within SciPy
>> >> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>> >> gh-13328)?
>> >>
>> >>
>> >> --
>> >> Kind Regards,
>> >> Tirth
>> >>
>> >>
>> >> ------------------------------
>> >>
>> >> Subject: Digest Footer
>> >>
>> >> _______________________________________________
>> >> SciPy-Dev mailing list
>> >> SciPy-Dev at python.org
>> >> https://mail.python.org/mailman/listinfo/scipy-dev
>> >>
>> >>
>> >> ------------------------------
>> >>
>> >> End of SciPy-Dev Digest, Vol 210, Issue 2
>> >> *****************************************
>> >>
>> > _______________________________________________
>> > SciPy-Dev mailing list
>> > SciPy-Dev at python.org
>> > https://mail.python.org/mailman/listinfo/scipy-dev
>> >
>> > _______________________________________________
>> > SciPy-Dev mailing list
>> > SciPy-Dev at python.org
>> > https://mail.python.org/mailman/listinfo/scipy-dev
>> >
>> -------------- next part --------------
>> An HTML attachment was scrubbed...
>> URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210404/940409c0/attachment-0001.html>
>>
>> ------------------------------
>>
>> Subject: Digest Footer
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
>>
>> ------------------------------
>>
>> End of SciPy-Dev Digest, Vol 210, Issue 5
>> *****************************************
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev


-- 
Those who don't understand recursion are doomed to repeat it

From robert.kern at gmail.com  Sat Apr 10 17:30:34 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 10 Apr 2021 17:30:34 -0400
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <CAG3t+pGc3HK+vra59a7bAqxQqShaed1fsy1TsaUp9kCZkVgsRw@mail.gmail.com>
References: <mailman.5.1617552001.27990.scipy-dev@python.org>
 <CABXY2qAfoViRZ0iqAgrsi1JQy_cki4KE24_Zuidagg_eVQMW0g@mail.gmail.com>
 <CAG3t+pGc3HK+vra59a7bAqxQqShaed1fsy1TsaUp9kCZkVgsRw@mail.gmail.com>
Message-ID: <CAF6FJisfvejG7owbSM68CiL71vF17_ertL598jwQ=mb4wcmE6w@mail.gmail.com>

On Sat, Apr 10, 2021 at 3:57 PM Neal Becker <ndbecker2 at gmail.com> wrote:

> One thing to consider, is unuran still maintained?  It's been a long
> time since last release.
>

It's a mature library, so I don't think it has been actively developed.
However, it is mature enough for the authors to build R bindings on top of
it this year, without much modification to the source code (someone ran a
minifier over it to remove comments, so it's hard to tell amongst that
chaff if anything significant has been modified, but it doesn't look like
much).

  https://cran.r-project.org/web/packages/Runuran/index.html

The worst-case scenario is that we simply implement the algorithms
ourselves using the C code as a reference. Given how we want to interact
with scipy.stats distribution objects and numpy BitGenerators, it's
entirely possible that's where we want to end up, anyways.

-- 
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210410/d63d1543/attachment-0001.html>

From tirthasheshpatel at gmail.com  Sun Apr 11 04:45:47 2021
From: tirthasheshpatel at gmail.com (Tirth Patel)
Date: Sun, 11 Apr 2021 14:15:47 +0530
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <CABpuv38XtcJWOT6kskF_Rv3T=_0iSoNCVr7gtnupL0kGQixfWg@mail.gmail.com>
References: <CABpuv38XtcJWOT6kskF_Rv3T=_0iSoNCVr7gtnupL0kGQixfWg@mail.gmail.com>
Message-ID: <CABpuv3_PksR-uLMXhu6DMgwcAYsE9v20Lk=-+GT=ByxvXihv7Q@mail.gmail.com>

Hi everyone,

I have been working on a draft proposal lately and think it is almost
ready. Please feel free to add suggestions/comments, if any.

Proposal:
https://docs.google.com/document/d/1xvRdpNVseTL8d7eWdiyY6rcNxBVKCXt81xE9HyXvtqE/edit?usp=sharing


Kind regards,
Tirth Patel


On Fri, Apr 2, 2021 at 7:19 PM Tirth Patel <tirthasheshpatel at gmail.com>
wrote:

> Hi all,
>
> I would like to participate in GSoC this year and found this project
> very interesting!
>
> TL; DR: I have a few questions regarding the project:
>   - Is the user interface desired as a separate python submodule
> (inside `scipy.stats`) or does it serve as an extension of the `rvs`
> method?
>   - Should UNU.RAN C library be included as a submodule within SciPy
> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
> gh-13328)?
>
> About Me
> ********
> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
> science undergrad student. I am quite familiar with Cython and a lot
> of my college courses make use of C. I have a good knowledge of
> probability theory and statistics.
>
> Open Source work: I have participated in GSoC with the PyMC team last
> year. I am a contributor to SciPy since May 2020 and recently a
> maintainer.
>
> About Project
> *************
> I had a question about the project. Is the user interface desired as a
> separate python submodule inside `scipy.stats`? like:
>
>     import scipy.stats as stats
>
>     # sample a 1000 variates from a normal distribution
>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
>
>     # sample 100 samples from the beta distribution using TDR method
>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
>
>     # the `rvs` methods remains unaffected.
>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
>
> Or does it serve as an extension of the `rvs` method:
>
>     from scipy.stats import norm, beta
>
>     # something like this:
>     # method = None => same behaviour as previous versions
>     # method = 'auto' => use UNU.RAN and let it choose the method
>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
>
>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
>
> Also, should UNU.RAN C library be included as a submodule within SciPy
> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
> gh-13328)?
>
>
> --
> Kind Regards,
> Tirth
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/7b58ecbc/attachment.html>

From sayedmazen70 at gmail.com  Sun Apr 11 08:40:36 2021
From: sayedmazen70 at gmail.com (Mazen Sayed)
Date: Sun, 11 Apr 2021 14:40:36 +0200
Subject: [SciPy-Dev] GSoC SciPy Optimization Ideas
Message-ID: <CAFTPoAyk_1Y2c6TeSQkXfKZmRgm1qP69pzMB4agTjADzEsUOkA@mail.gmail.com>

Dear,

I hope this email finds you well, this is my proposal for scipy.optimize
project, I'm really interested to work on this project.

Thanks

https://drive.google.com/file/d/12Q6NnorN74VkuQw_HRx2kuY-FIoK90V0/view?usp=sharing


<https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=icon>
Virus-free.
www.avast.com
<https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=link>
<#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/bf2a0e98/attachment.html>

From roy.pamphile at gmail.com  Sun Apr 11 09:06:21 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Sun, 11 Apr 2021 15:06:21 +0200
Subject: [SciPy-Dev] Sensitivity analysis module proposal
In-Reply-To: <CAF6FJisMyEKWBf44=+GM4YwO+ShDy8QOkZXP1a06o_KxtVJ9Xg@mail.gmail.com>
References: <37CE6815-2B29-448E-9F65-52273BF3AD14@gmail.com>
 <CAF6FJisMyEKWBf44=+GM4YwO+ShDy8QOkZXP1a06o_KxtVJ9Xg@mail.gmail.com>
Message-ID: <AE4B423D-60B6-4D30-8E86-7893CBC8D747@gmail.com>


> On 09.04.2021, at 19:51, Robert Kern <robert.kern at gmail.com> wrote:
> 
> On Fri, Apr 9, 2021 at 1:42 PM Pamphile Roy <roy.pamphile at gmail.com <mailto:roy.pamphile at gmail.com>> wrote:
> Hi everyone,
> 
> I would like to propose to add sensitivity analysis (SA/GSA) functions. Also called uncertainty quantification (UQ) or verification and validation (V&V) depending on the field.
> 
> SALib is actively developed. I recommend contributing there if there are any gaps that you think need to be filled.
> 
> https://salib.readthedocs.io/en/latest/ <https://salib.readthedocs.io/en/latest/> 

In my opinion, the fact that a library exists is not contradictory to adding some functionalities in SciPy. We are discussing about including UNU.RAN which is arguably the same.
SALib is a nice library, but as a user you will only find it and be willing to use it if you already know about SA. Like all niche products.

Having it in SciPy (or another project with a wider scope like statsmodels) would allow a greater exposure to the whole scientific community to this problematic. Again, this topic is getting more and more traction and SA is now a recurring theme for industrial applications.

We should really consider the positive fallback it could have. Taking scipy.stats.qmc for instance. Now that it?s in, a lot of projects will benefit from this inclusion. Not only they can rely on it, but being SciPy, we also took great care about the design and fixed things which were not that obvious nor even really studied (scikit-optimize, optuna, pydoe, and even SALib all had issues with their QMC implementations).
Thanks to the implementation and review process, 2 articles got written and SciPy will be presented during a conference to a new community, the QMC community.
And I believe we could have the same impact here and attract people from the SA community. R is still massively used in both cases.

In the end, if we don?t want any SA in SciPy, it?s fine but it should be motivated by something other than: it exists elsewhere. Because we are at the point where almost everything exists elsewhere.
Furthermore, I believe SA matches our scope as we have various types of analysis of variance (ANOVA) in the roadmap.

Cheers,
Pamphile
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/def27664/attachment-0001.html>

From danielschmitzsiegen at googlemail.com  Sun Apr 11 09:08:57 2021
From: danielschmitzsiegen at googlemail.com (Daniel Schmitz)
Date: Sun, 11 Apr 2021 15:08:57 +0200
Subject: [SciPy-Dev] GSoC SciPy Optimization Ideas
In-Reply-To: <CAFTPoAyk_1Y2c6TeSQkXfKZmRgm1qP69pzMB4agTjADzEsUOkA@mail.gmail.com>
References: <CAFTPoAyk_1Y2c6TeSQkXfKZmRgm1qP69pzMB4agTjADzEsUOkA@mail.gmail.com>
Message-ID: <CALj43ZRVyJ_mhFRgbfrnkh4d-19JQXM4ayj5P_vQ_1FHxy688w@mail.gmail.com>

Hey Sayed,

my two cents, not being a CoreDeveloper but a python developer interested
in Optimization algorithms.

The automatic reformulation of the constrained problem into an
unconstrained problem sounds similar to nlopt's augmented lagrangian:
https://nlopt.readthedocs.io/en/latest/NLopt_Algorithms/#augmented-lagrangian-algorithm
. I think this would be a great addition to scipy.optimize. I imagine that
you would pass the reformulated objective to minimize then and just reuse
the existing algorithms.

One objection to your idea about "smart initialization": why exactly 50
points and how exactly would they be sampled if no bounds are provided?
Theoretically, a grid search over samples generated by for example latin
hypercube sampling within a bounded volume could be a better initialization
than a random guess. But I am not sure that this is in many cases a good
idea. If you have no idea how to initialize your optimizer, I would go for
one of the global optimizers.

Best,

Daniel

On Sun, 11 Apr 2021 at 14:41, Mazen Sayed <sayedmazen70 at gmail.com> wrote:

> Dear,
>
> I hope this email finds you well, this is my proposal for scipy.optimize
> project, I'm really interested to work on this project.
>
> Thanks
>
>
> https://drive.google.com/file/d/12Q6NnorN74VkuQw_HRx2kuY-FIoK90V0/view?usp=sharing
>
>
>
>
>
> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=icon> Virus-free.
> www.avast.com
> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=link>
> <#m_6919920764018287639_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/9572d2c9/attachment.html>

From sayedmazen70 at gmail.com  Sun Apr 11 09:44:39 2021
From: sayedmazen70 at gmail.com (Mazen Sayed)
Date: Sun, 11 Apr 2021 15:44:39 +0200
Subject: [SciPy-Dev] GSoC SciPy Optimization Ideas
In-Reply-To: <CALj43ZRVyJ_mhFRgbfrnkh4d-19JQXM4ayj5P_vQ_1FHxy688w@mail.gmail.com>
References: <CAFTPoAyk_1Y2c6TeSQkXfKZmRgm1qP69pzMB4agTjADzEsUOkA@mail.gmail.com>
 <CALj43ZRVyJ_mhFRgbfrnkh4d-19JQXM4ayj5P_vQ_1FHxy688w@mail.gmail.com>
Message-ID: <CAFTPoAzk07Vt7XK75vHiAuZ3S3tTe4QJ+7koetG+cV4b7EqrkA@mail.gmail.com>

Hey Dr. Schmitz,

For the first point, yes that is what I mean, First we'll reformulate the
problem to unconstrained, then we will use an existing  algorithm.

For the second point, I was thinking of generating 50 random points using
the user defined bounds, then substitute in the objective, and choose the
initial that minimizes our goal.

Thanks,
Mazen


On Sun, Apr 11, 2021 at 3:09 PM Daniel Schmitz <
danielschmitzsiegen at googlemail.com> wrote:

> Hey Sayed,
>
> my two cents, not being a CoreDeveloper but a python developer interested
> in Optimization algorithms.
>
> The automatic reformulation of the constrained problem into an
> unconstrained problem sounds similar to nlopt's augmented lagrangian:
> https://nlopt.readthedocs.io/en/latest/NLopt_Algorithms/#augmented-lagrangian-algorithm
> . I think this would be a great addition to scipy.optimize. I imagine that
> you would pass the reformulated objective to minimize then and just reuse
> the existing algorithms.
>
> One objection to your idea about "smart initialization": why exactly 50
> points and how exactly would they be sampled if no bounds are provided?
> Theoretically, a grid search over samples generated by for example latin
> hypercube sampling within a bounded volume could be a better initialization
> than a random guess. But I am not sure that this is in many cases a good
> idea. If you have no idea how to initialize your optimizer, I would go for
> one of the global optimizers.
>
> Best,
>
> Daniel
>
> On Sun, 11 Apr 2021 at 14:41, Mazen Sayed <sayedmazen70 at gmail.com> wrote:
>
>> Dear,
>>
>> I hope this email finds you well, this is my proposal for scipy.optimize
>> project, I'm really interested to work on this project.
>>
>> Thanks
>>
>>
>> https://drive.google.com/file/d/12Q6NnorN74VkuQw_HRx2kuY-FIoK90V0/view?usp=sharing
>>
>>
>>
>>
>>
>> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=icon> Virus-free.
>> www.avast.com
>> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=link>
>> <#m_7582723163811328871_m_6919920764018287639_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/0e5210fd/attachment.html>

From ilhanpolat at gmail.com  Sun Apr 11 10:14:11 2021
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Sun, 11 Apr 2021 16:14:11 +0200
Subject: [SciPy-Dev] Sensitivity analysis module proposal
In-Reply-To: <AE4B423D-60B6-4D30-8E86-7893CBC8D747@gmail.com>
References: <37CE6815-2B29-448E-9F65-52273BF3AD14@gmail.com>
 <CAF6FJisMyEKWBf44=+GM4YwO+ShDy8QOkZXP1a06o_KxtVJ9Xg@mail.gmail.com>
 <AE4B423D-60B6-4D30-8E86-7893CBC8D747@gmail.com>
Message-ID: <CAEBuzr91vSO7T40=g=n7VwTgFhDVXJoC=vJjiiF9umm8BA15oA@mail.gmail.com>

 > In my opinion, the fact that a library exists is not contradictory to
adding some functionalities in SciPy. We are discussing about including
UNU.RAN which is arguably the same. SALib is a nice library, but as a user
you will only find it and be willing to use it if you already know about
SA. Like all niche products.

Indeed I find both quite niche. I don't have any strong opinions though I
think stats residents should weigh in. I am also fine with having it
somewhere else since SciPy is feeling like becoming a bit StatsPy lately
:)

> it should be motivated by something other than: it exists elsewhere

It's actually quite the reason for many things we considered before. If we
are not going to provide at least as good as SALib, there is no point in
having a half-baked version of it in SciPy. Note that it is super easy to
add things but incredibly hard to take it out later. So I think there must
be a substantial need for this (which I am not aware of) that requires
SciPy level inclusion.


On Sun, Apr 11, 2021 at 3:07 PM Pamphile Roy <roy.pamphile at gmail.com> wrote:

>
>
> On 09.04.2021, at 19:51, Robert Kern <robert.kern at gmail.com> wrote:
>
> On Fri, Apr 9, 2021 at 1:42 PM Pamphile Roy <roy.pamphile at gmail.com>
> wrote:
>
>> Hi everyone,
>>
>> I would like to propose to add sensitivity analysis (SA/GSA) functions.
>> Also called uncertainty quantification (UQ) or verification and validation
>> (V&V) depending on the field.
>>
>
> SALib is actively developed. I recommend contributing there if there are
> any gaps that you think need to be filled.
>
> https://salib.readthedocs.io/en/latest/
>
>
> In my opinion, the fact that a library exists is not contradictory to
> adding some functionalities in SciPy. We are discussing about including
> UNU.RAN which is arguably the same.
> SALib is a nice library, but as a user you will only find it and be
> willing to use it if you already know about SA. Like all niche products.
>
> Having it in SciPy (or another project with a wider scope like
> statsmodels) would allow a greater exposure to the whole scientific
> community to this problematic. Again, this topic is getting more and more
> traction and SA is now a recurring theme for industrial applications.
>
> We should really consider the positive fallback it could have. Taking
> scipy.stats.qmc for instance. Now that it?s in, a lot of projects will
> benefit from this inclusion. Not only they can rely on it, but being SciPy,
> we also took great care about the design and fixed things which were not
> that obvious nor even really studied (scikit-optimize, optuna, pydoe, and
> even SALib all had issues with their QMC implementations).
> Thanks to the implementation and review process, 2 articles got written
> and SciPy will be presented during a conference to a new community, the QMC
> community.
> And I believe we could have the same impact here and attract people from
> the SA community. R is still massively used in both cases.
>
> In the end, if we don?t want any SA in SciPy, it?s fine but it should be
> motivated by something other than: it exists elsewhere. Because we are at
> the point where almost everything exists elsewhere.
> Furthermore, I believe SA matches our scope as we have various types of
> analysis of variance (ANOVA) in the roadmap.
>
> Cheers,
> Pamphile
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/ebb7c950/attachment-0001.html>

From andrea.gavana at gmail.com  Sun Apr 11 10:20:00 2021
From: andrea.gavana at gmail.com (Andrea Gavana)
Date: Sun, 11 Apr 2021 16:20:00 +0200
Subject: [SciPy-Dev] GSoC SciPy Optimization Ideas
In-Reply-To: <CALj43ZRVyJ_mhFRgbfrnkh4d-19JQXM4ayj5P_vQ_1FHxy688w@mail.gmail.com>
References: <CAFTPoAyk_1Y2c6TeSQkXfKZmRgm1qP69pzMB4agTjADzEsUOkA@mail.gmail.com>
 <CALj43ZRVyJ_mhFRgbfrnkh4d-19JQXM4ayj5P_vQ_1FHxy688w@mail.gmail.com>
Message-ID: <CAEf70bwQ0xh995xp8Mscqj=zfFNUci9c1qcRUQBdMVTxFoJ2YQ@mail.gmail.com>

Hi,

On Sun, 11 Apr 2021 at 15.09, Daniel Schmitz <
danielschmitzsiegen at googlemail.com> wrote:

> Hey Sayed,
>
> my two cents, not being a CoreDeveloper but a python developer interested
> in Optimization algorithms.
>
> The automatic reformulation of the constrained problem into an
> unconstrained problem sounds similar to nlopt's augmented lagrangian:
> https://nlopt.readthedocs.io/en/latest/NLopt_Algorithms/#augmented-lagrangian-algorithm
> . I think this would be a great addition to scipy.optimize. I imagine that
> you would pass the reformulated objective to minimize then and just reuse
> the existing algorithms.
>
> One objection to your idea about "smart initialization": why exactly 50
> points and how exactly would they be sampled if no bounds are provided?
> Theoretically, a grid search over samples generated by for example latin
> hypercube sampling within a bounded volume could be a better initialization
> than a random guess. But I am not sure that this is in many cases a good
> idea. If you have no idea how to initialize your optimizer, I would go for
> one of the global optimizers.
>

My 2 cents too - I?m not a SciPy developer but I am passionate about
optimization.

I tend to agree with Daniel here, randomly choosing 50 points in a
high-dimensional optimization space is not going to give any advantage. And
why 50?

The initialization part is one of the most important (and difficult to get
right) part of any optimization algorithm, but this is mostly true for
global ones: differential evolution, SHGO, Dual Annealing they?re all have
their own way. Some of these and many others (especially local algorithms)
rely on the user to explicitly pass an initial guess and take it from there.

As for the penalty approach, I do agree it would be a nice addition: you
may want to take a look - whether for inspiration or out of curiosity - at
the Mystic library (
https://github.com/uqfoundation/mystic). I believe the author has covered
most use cases in terms of penalty/barrier methods. Speaking of
penalty/barrier approaches, you may also wish to consider whether a
constraint violation results into a point for which the objective function
*cannot* be evaluated or simply a point you don?t want your algorithm to go
(but the objective function can be evaluated there).

As for the normalization/denormalization concept, I think it should
definitely be a feature of all algorithms - and honestly I would expect
anyone knowledgeable on optimization to always apply some sort of
normalization if the parameters values spans multiple order of magnitude.
There?s quite a few algorithms already that apply normalization internally
no matter what, and of course this process helps in the vast majority of
optimization problems (why would you make your algorithm sweat in handling
different variables spanning 10 orders of magnitude?).

Andrea.


> Best,
>
> Daniel
>
> On Sun, 11 Apr 2021 at 14:41, Mazen Sayed <sayedmazen70 at gmail.com> wrote:
>
>> Dear,
>>
>> I hope this email finds you well, this is my proposal for scipy.optimize
>> project, I'm really interested to work on this project.
>>
>> Thanks
>>
>>
>> https://drive.google.com/file/d/12Q6NnorN74VkuQw_HRx2kuY-FIoK90V0/view?usp=sharing
>>
>>
>>
>>
>>
>> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=icon> Virus-free.
>> www.avast.com
>> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=link>
>> <#m_-8203804488649032660_m_-111851643062817127_m_6919920764018287639_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/945e6fb5/attachment.html>

From sayedmazen70 at gmail.com  Sun Apr 11 10:32:16 2021
From: sayedmazen70 at gmail.com (Mazen Sayed)
Date: Sun, 11 Apr 2021 16:32:16 +0200
Subject: [SciPy-Dev] GSoC SciPy Optimization Ideas
In-Reply-To: <CAEf70bwQ0xh995xp8Mscqj=zfFNUci9c1qcRUQBdMVTxFoJ2YQ@mail.gmail.com>
References: <CAFTPoAyk_1Y2c6TeSQkXfKZmRgm1qP69pzMB4agTjADzEsUOkA@mail.gmail.com>
 <CALj43ZRVyJ_mhFRgbfrnkh4d-19JQXM4ayj5P_vQ_1FHxy688w@mail.gmail.com>
 <CAEf70bwQ0xh995xp8Mscqj=zfFNUci9c1qcRUQBdMVTxFoJ2YQ@mail.gmail.com>
Message-ID: <CAFTPoAyzN_d-P_HRipugCDR7bVu_viPvpDU1MNM30LVpWkEy4A@mail.gmail.com>

Dear,

For the first point, 50 was an example, the number of points is a
hyperparameter that the user should define first, but I agree with you that
it may be useless specially with higher dimensions.
For the second point, Many thanks for the resources, I will read them
carefully.
For the third point, I agree with you that any one who studied
optimization, should normalize first, I was suggesting that we can add this
normalization feature, and the user can specify a normalize parameter to do
this step for him.

Thanks,
Mazen


On Sun, Apr 11, 2021 at 4:20 PM Andrea Gavana <andrea.gavana at gmail.com>
wrote:

> Hi,
>
> On Sun, 11 Apr 2021 at 15.09, Daniel Schmitz <
> danielschmitzsiegen at googlemail.com> wrote:
>
>> Hey Sayed,
>>
>> my two cents, not being a CoreDeveloper but a python developer interested
>> in Optimization algorithms.
>>
>> The automatic reformulation of the constrained problem into an
>> unconstrained problem sounds similar to nlopt's augmented lagrangian:
>> https://nlopt.readthedocs.io/en/latest/NLopt_Algorithms/#augmented-lagrangian-algorithm
>> . I think this would be a great addition to scipy.optimize. I imagine that
>> you would pass the reformulated objective to minimize then and just reuse
>> the existing algorithms.
>>
>> One objection to your idea about "smart initialization": why exactly 50
>> points and how exactly would they be sampled if no bounds are provided?
>> Theoretically, a grid search over samples generated by for example latin
>> hypercube sampling within a bounded volume could be a better initialization
>> than a random guess. But I am not sure that this is in many cases a good
>> idea. If you have no idea how to initialize your optimizer, I would go for
>> one of the global optimizers.
>>
>
> My 2 cents too - I?m not a SciPy developer but I am passionate about
> optimization.
>
> I tend to agree with Daniel here, randomly choosing 50 points in a
> high-dimensional optimization space is not going to give any advantage. And
> why 50?
>
> The initialization part is one of the most important (and difficult to get
> right) part of any optimization algorithm, but this is mostly true for
> global ones: differential evolution, SHGO, Dual Annealing they?re all have
> their own way. Some of these and many others (especially local algorithms)
> rely on the user to explicitly pass an initial guess and take it from there.
>
> As for the penalty approach, I do agree it would be a nice addition: you
> may want to take a look - whether for inspiration or out of curiosity - at
> the Mystic library (
> https://github.com/uqfoundation/mystic). I believe the author has covered
> most use cases in terms of penalty/barrier methods. Speaking of
> penalty/barrier approaches, you may also wish to consider whether a
> constraint violation results into a point for which the objective function
> *cannot* be evaluated or simply a point you don?t want your algorithm to go
> (but the objective function can be evaluated there).
>
> As for the normalization/denormalization concept, I think it should
> definitely be a feature of all algorithms - and honestly I would expect
> anyone knowledgeable on optimization to always apply some sort of
> normalization if the parameters values spans multiple order of magnitude.
> There?s quite a few algorithms already that apply normalization internally
> no matter what, and of course this process helps in the vast majority of
> optimization problems (why would you make your algorithm sweat in handling
> different variables spanning 10 orders of magnitude?).
>
> Andrea.
>
>
>
>
>> Best,
>>
>> Daniel
>>
>> On Sun, 11 Apr 2021 at 14:41, Mazen Sayed <sayedmazen70 at gmail.com> wrote:
>>
>>> Dear,
>>>
>>> I hope this email finds you well, this is my proposal for scipy.optimize
>>> project, I'm really interested to work on this project.
>>>
>>> Thanks
>>>
>>>
>>> https://drive.google.com/file/d/12Q6NnorN74VkuQw_HRx2kuY-FIoK90V0/view?usp=sharing
>>>
>>>
>>>
>>>
>>>
>>> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=icon> Virus-free.
>>> www.avast.com
>>> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=link>
>>> <#m_6208003926892838107_m_-8203804488649032660_m_-111851643062817127_m_6919920764018287639_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/e5697e72/attachment-0001.html>

From danielschmitzsiegen at googlemail.com  Sun Apr 11 11:39:02 2021
From: danielschmitzsiegen at googlemail.com (Daniel Schmitz)
Date: Sun, 11 Apr 2021 17:39:02 +0200
Subject: [SciPy-Dev] GSoC SciPy Optimization Ideas
In-Reply-To: <CAFTPoAyzN_d-P_HRipugCDR7bVu_viPvpDU1MNM30LVpWkEy4A@mail.gmail.com>
References: <CAFTPoAyk_1Y2c6TeSQkXfKZmRgm1qP69pzMB4agTjADzEsUOkA@mail.gmail.com>
 <CALj43ZRVyJ_mhFRgbfrnkh4d-19JQXM4ayj5P_vQ_1FHxy688w@mail.gmail.com>
 <CAEf70bwQ0xh995xp8Mscqj=zfFNUci9c1qcRUQBdMVTxFoJ2YQ@mail.gmail.com>
 <CAFTPoAyzN_d-P_HRipugCDR7bVu_viPvpDU1MNM30LVpWkEy4A@mail.gmail.com>
Message-ID: <CALj43ZSJD7rqGSQA2kDnkQtzmB8Rnaznyi1xiL4EgM8ROZ7=-w@mail.gmail.com>

For the random initialization I would propose to write a dedicated new
function called for example randomsearch (familiar name as in sklearn) as a
counterpart to brute_force. Given some bounds, a function and a number of
samples, it should return the best sample.

Adding randomsearch as a method for x0 in minimize would then accomplish
what you wanted originally. Not sure though what the core developers think
about this from an API perspective.

Cheers

Mazen Sayed <sayedmazen70 at gmail.com> schrieb am So., 11. Apr. 2021, 16:33:

> Dear,
>
> For the first point, 50 was an example, the number of points is a
> hyperparameter that the user should define first, but I agree with you that
> it may be useless specially with higher dimensions.
> For the second point, Many thanks for the resources, I will read them
> carefully.
> For the third point, I agree with you that any one who studied
> optimization, should normalize first, I was suggesting that we can add this
> normalization feature, and the user can specify a normalize parameter to do
> this step for him.
>
> Thanks,
> Mazen
>
>
> On Sun, Apr 11, 2021 at 4:20 PM Andrea Gavana <andrea.gavana at gmail.com>
> wrote:
>
>> Hi,
>>
>> On Sun, 11 Apr 2021 at 15.09, Daniel Schmitz <
>> danielschmitzsiegen at googlemail.com> wrote:
>>
>>> Hey Sayed,
>>>
>>> my two cents, not being a CoreDeveloper but a python developer
>>> interested in Optimization algorithms.
>>>
>>> The automatic reformulation of the constrained problem into an
>>> unconstrained problem sounds similar to nlopt's augmented lagrangian:
>>> https://nlopt.readthedocs.io/en/latest/NLopt_Algorithms/#augmented-lagrangian-algorithm
>>> . I think this would be a great addition to scipy.optimize. I imagine that
>>> you would pass the reformulated objective to minimize then and just reuse
>>> the existing algorithms.
>>>
>>> One objection to your idea about "smart initialization": why exactly 50
>>> points and how exactly would they be sampled if no bounds are provided?
>>> Theoretically, a grid search over samples generated by for example latin
>>> hypercube sampling within a bounded volume could be a better initialization
>>> than a random guess. But I am not sure that this is in many cases a good
>>> idea. If you have no idea how to initialize your optimizer, I would go for
>>> one of the global optimizers.
>>>
>>
>> My 2 cents too - I?m not a SciPy developer but I am passionate about
>> optimization.
>>
>> I tend to agree with Daniel here, randomly choosing 50 points in a
>> high-dimensional optimization space is not going to give any advantage. And
>> why 50?
>>
>> The initialization part is one of the most important (and difficult to
>> get right) part of any optimization algorithm, but this is mostly true for
>> global ones: differential evolution, SHGO, Dual Annealing they?re all have
>> their own way. Some of these and many others (especially local algorithms)
>> rely on the user to explicitly pass an initial guess and take it from there.
>>
>> As for the penalty approach, I do agree it would be a nice addition: you
>> may want to take a look - whether for inspiration or out of curiosity - at
>> the Mystic library (
>> https://github.com/uqfoundation/mystic). I believe the author has
>> covered most use cases in terms of penalty/barrier methods. Speaking of
>> penalty/barrier approaches, you may also wish to consider whether a
>> constraint violation results into a point for which the objective function
>> *cannot* be evaluated or simply a point you don?t want your algorithm to go
>> (but the objective function can be evaluated there).
>>
>> As for the normalization/denormalization concept, I think it should
>> definitely be a feature of all algorithms - and honestly I would expect
>> anyone knowledgeable on optimization to always apply some sort of
>> normalization if the parameters values spans multiple order of magnitude.
>> There?s quite a few algorithms already that apply normalization internally
>> no matter what, and of course this process helps in the vast majority of
>> optimization problems (why would you make your algorithm sweat in handling
>> different variables spanning 10 orders of magnitude?).
>>
>> Andrea.
>>
>>
>>
>>
>>> Best,
>>>
>>> Daniel
>>>
>>> On Sun, 11 Apr 2021 at 14:41, Mazen Sayed <sayedmazen70 at gmail.com>
>>> wrote:
>>>
>>>> Dear,
>>>>
>>>> I hope this email finds you well, this is my proposal for
>>>> scipy.optimize project, I'm really interested to work on this project.
>>>>
>>>> Thanks
>>>>
>>>>
>>>> https://drive.google.com/file/d/12Q6NnorN74VkuQw_HRx2kuY-FIoK90V0/view?usp=sharing
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=icon> Virus-free.
>>>> www.avast.com
>>>> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail&utm_term=link>
>>>> <#m_421618496721677197_m_6208003926892838107_m_-8203804488649032660_m_-111851643062817127_m_6919920764018287639_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>>>> _______________________________________________
>>>> SciPy-Dev mailing list
>>>> SciPy-Dev at python.org
>>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/42848110/attachment.html>

From wiescirozne at gmail.com  Sun Apr 11 11:45:23 2021
From: wiescirozne at gmail.com (Theo Gates)
Date: Sun, 11 Apr 2021 16:45:23 +0100
Subject: [SciPy-Dev] Subscriptin Request.
Message-ID: <CABJ0Kca5pFXMp6_y1R5f+RGf9VTEp2rko8=4LW2XFUxtyDw=Pg@mail.gmail.com>


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/23a2a23d/attachment.html>

From ralf.gommers at gmail.com  Sun Apr 11 13:02:44 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 11 Apr 2021 19:02:44 +0200
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <CABpuv3_PksR-uLMXhu6DMgwcAYsE9v20Lk=-+GT=ByxvXihv7Q@mail.gmail.com>
References: <CABpuv38XtcJWOT6kskF_Rv3T=_0iSoNCVr7gtnupL0kGQixfWg@mail.gmail.com>
 <CABpuv3_PksR-uLMXhu6DMgwcAYsE9v20Lk=-+GT=ByxvXihv7Q@mail.gmail.com>
Message-ID: <CABL7CQh9rPWQbturkDXc+6n+j6DKK153f5uC00S3A+L6+rGE4Q@mail.gmail.com>

On Sun, Apr 11, 2021 at 10:46 AM Tirth Patel <tirthasheshpatel at gmail.com>
wrote:

> Hi everyone,
>
> I have been working on a draft proposal lately and think it is almost
> ready. Please feel free to add suggestions/comments, if any.
>
> Proposal:
> https://docs.google.com/document/d/1xvRdpNVseTL8d7eWdiyY6rcNxBVKCXt81xE9HyXvtqE/edit?usp=sharing
>

Hi Tirth, this is a very well-written proposal. I added one comment in your
GDoc about keeping some time for discussing the API with the community and
if needed iterating on it. Your proposal seems good to submit though, it
has enough detail and the plan and milestones look good.

Cheers,
Ralf


>
> Kind regards,
> Tirth Patel
>
>
>
> On Fri, Apr 2, 2021 at 7:19 PM Tirth Patel <tirthasheshpatel at gmail.com>
> wrote:
>
>> Hi all,
>>
>> I would like to participate in GSoC this year and found this project
>> very interesting!
>>
>> TL; DR: I have a few questions regarding the project:
>>   - Is the user interface desired as a separate python submodule
>> (inside `scipy.stats`) or does it serve as an extension of the `rvs`
>> method?
>>   - Should UNU.RAN C library be included as a submodule within SciPy
>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>> gh-13328)?
>>
>> About Me
>> ********
>> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
>> science undergrad student. I am quite familiar with Cython and a lot
>> of my college courses make use of C. I have a good knowledge of
>> probability theory and statistics.
>>
>> Open Source work: I have participated in GSoC with the PyMC team last
>> year. I am a contributor to SciPy since May 2020 and recently a
>> maintainer.
>>
>> About Project
>> *************
>> I had a question about the project. Is the user interface desired as a
>> separate python submodule inside `scipy.stats`? like:
>>
>>     import scipy.stats as stats
>>
>>     # sample a 1000 variates from a normal distribution
>>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
>>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
>>
>>     # sample 100 samples from the beta distribution using TDR method
>>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
>>
>>     # the `rvs` methods remains unaffected.
>>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
>>
>> Or does it serve as an extension of the `rvs` method:
>>
>>     from scipy.stats import norm, beta
>>
>>     # something like this:
>>     # method = None => same behaviour as previous versions
>>     # method = 'auto' => use UNU.RAN and let it choose the method
>>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
>>
>>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
>>
>> Also, should UNU.RAN C library be included as a submodule within SciPy
>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>> gh-13328)?
>>
>>
>> --
>> Kind Regards,
>> Tirth
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/2482399c/attachment.html>

From roy.pamphile at gmail.com  Sun Apr 11 13:08:19 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Sun, 11 Apr 2021 19:08:19 +0200
Subject: [SciPy-Dev] GSoC SciPy Optimization Ideas
In-Reply-To: <CAEf70bwQ0xh995xp8Mscqj=zfFNUci9c1qcRUQBdMVTxFoJ2YQ@mail.gmail.com>
References: <CAFTPoAyk_1Y2c6TeSQkXfKZmRgm1qP69pzMB4agTjADzEsUOkA@mail.gmail.com>
 <CALj43ZRVyJ_mhFRgbfrnkh4d-19JQXM4ayj5P_vQ_1FHxy688w@mail.gmail.com>
 <CAEf70bwQ0xh995xp8Mscqj=zfFNUci9c1qcRUQBdMVTxFoJ2YQ@mail.gmail.com>
Message-ID: <3B55D50F-9DAE-426A-893C-3F3CEB38CF90@gmail.com>

> I tend to agree with Daniel here, randomly choosing 50 points in a high-dimensional optimization space is not going to give any advantage. And why 50? 
> 
> The initialization part is one of the most important (and difficult to get right) part of any optimization algorithm, but this is mostly true for global ones: differential evolution, SHGO, Dual Annealing they?re all have their own way. Some of these and many others (especially local algorithms) rely on the user to explicitly pass an initial guess and take it from there.

Agreed here, the random sampling must not be totally random in order to cover the parameter space in the most efficient way.
The global optimizers all use QMC methods (scipy.stats.qmc) so here you can just rely on these too.

Cheers,
Pamphile

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/49173e79/attachment.html>

From sayedmazen70 at gmail.com  Sun Apr 11 13:26:40 2021
From: sayedmazen70 at gmail.com (Mazen Sayed)
Date: Sun, 11 Apr 2021 19:26:40 +0200
Subject: [SciPy-Dev] GSoC SciPy Optimization Ideas
In-Reply-To: <3B55D50F-9DAE-426A-893C-3F3CEB38CF90@gmail.com>
References: <CAFTPoAyk_1Y2c6TeSQkXfKZmRgm1qP69pzMB4agTjADzEsUOkA@mail.gmail.com>
 <CALj43ZRVyJ_mhFRgbfrnkh4d-19JQXM4ayj5P_vQ_1FHxy688w@mail.gmail.com>
 <CAEf70bwQ0xh995xp8Mscqj=zfFNUci9c1qcRUQBdMVTxFoJ2YQ@mail.gmail.com>
 <3B55D50F-9DAE-426A-893C-3F3CEB38CF90@gmail.com>
Message-ID: <CAFTPoAw3iMJS2NUtcNYcDs3KRT1BSvpg-XfRgjuY68K6RXWuZg@mail.gmail.com>

This sounds really good and more reasonable.

Thanks for your help.


On Sun, Apr 11, 2021 at 7:08 PM Pamphile Roy <roy.pamphile at gmail.com> wrote:

> I tend to agree with Daniel here, randomly choosing 50 points in a
> high-dimensional optimization space is not going to give any advantage. And
> why 50?
>
> The initialization part is one of the most important (and difficult to get
> right) part of any optimization algorithm, but this is mostly true for
> global ones: differential evolution, SHGO, Dual Annealing they?re all have
> their own way. Some of these and many others (especially local algorithms)
> rely on the user to explicitly pass an initial guess and take it from there.
>
>
> Agreed here, the random sampling must not be totally random in order to
> cover the parameter space in the most efficient way.
> The global optimizers all use QMC methods (scipy.stats.qmc) so here you
> can just rely on these too.
>
> Cheers,
> Pamphile
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/f4599dfd/attachment-0001.html>

From robert.kern at gmail.com  Sun Apr 11 13:55:43 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 11 Apr 2021 13:55:43 -0400
Subject: [SciPy-Dev] Sensitivity analysis module proposal
In-Reply-To: <AE4B423D-60B6-4D30-8E86-7893CBC8D747@gmail.com>
References: <37CE6815-2B29-448E-9F65-52273BF3AD14@gmail.com>
 <CAF6FJisMyEKWBf44=+GM4YwO+ShDy8QOkZXP1a06o_KxtVJ9Xg@mail.gmail.com>
 <AE4B423D-60B6-4D30-8E86-7893CBC8D747@gmail.com>
Message-ID: <CAF6FJiutf6bWYF=MpUY5zxPv1CqGV4c=AQU27sgxdC3ZHwCoBg@mail.gmail.com>

On Sun, Apr 11, 2021 at 9:07 AM Pamphile Roy <roy.pamphile at gmail.com> wrote:

>
> On 09.04.2021, at 19:51, Robert Kern <robert.kern at gmail.com> wrote:
>
> On Fri, Apr 9, 2021 at 1:42 PM Pamphile Roy <roy.pamphile at gmail.com>
> wrote:
>
>> Hi everyone,
>>
>> I would like to propose to add sensitivity analysis (SA/GSA) functions.
>> Also called uncertainty quantification (UQ) or verification and validation
>> (V&V) depending on the field.
>>
>
> SALib is actively developed. I recommend contributing there if there are
> any gaps that you think need to be filled.
>
> https://salib.readthedocs.io/en/latest/
>
> In my opinion, the fact that a library exists is not contradictory to
> adding some functionalities in SciPy. We are discussing about including
> UNU.RAN which is arguably the same.
> SALib is a nice library, but as a user you will only find it and be
> willing to use it if you already know about SA. Like all niche products.
>

"It exists elsewhere" isn't my argument. While this is obviously a
judgement call, and my opinion isn't necessarily that of anyone else's, I
do have a rough rubric in mind when I consider things for inclusion in
scipy. The main guiding principle is to make important functionality
available to the scientific Python community. If including that
functionality in scipy advances that, great, that's an argument for
inclusion. But sometimes, inclusion inside scipy is just a neutral move,
and I think that's the case here. That's not dispositive, but then we have
to go to more specific reasons, like wanting to use the functionality
inside other parts of scipy (like QMC in SHGO).

So to take UNU.RAN as an example, it's an old, relatively unmaintained C
library. Its important functionality is *not* currently available to the
scientific Python community. Further, we want to *use* UNU.RAN internally
to provide faster implementations of random sampling for our distributions
that lack `_rvs()` methods. In contrast, SALib is actively maintained by a
multi-developer team; it's a Python library that uses numpy; it's liberally
licensed like scipy; it is used by other projects.


> Having it in SciPy (or another project with a wider scope like
> statsmodels) would allow a greater exposure to the whole scientific
> community to this problematic. Again, this topic is getting more and more
> traction and SA is now a recurring theme for industrial applications.
>
>

> We should really consider the positive fallback it could have. Taking
> scipy.stats.qmc for instance. Now that it?s in, a lot of projects will
> benefit from this inclusion. Not only they can rely on it, but being SciPy,
> we also took great care about the design and fixed things which were not
> that obvious nor even really studied (scikit-optimize, optuna, pydoe, and
> even SALib all had issues with their QMC implementations).
> Thanks to the implementation and review process, 2 articles got written
> and SciPy will be presented during a conference to a new community, the QMC
> community.
> And I believe we could have the same impact here and attract people from
> the SA community. R is still massively used in both cases.
>

The existing packages that did just QMC were often just
individual-maintained projects that are not very sustainable. So the
higher-level packages that *needed* QMC often rolled their own to varying
degrees of effectiveness. Implementing QMC in scipy in the disciplined and
thorough manner that you did means that those projects can now rely on that
solid building block. The important thing wasn't necessarily "in scipy" per
se, it was the discipline and thoroughness. IMO, SALib has done the
discipline and thoroughness just fine outside of scipy.

If the SALib developers expressed interest in merging SALib into scipy,
that'd be one thing. But if they are interested in maintaining it as an
independent project, I would recommend contributing to it to build on their
success instead of starting from scratch. As a tentpole project of the
scientific Python community, we want to support the efforts of the whole
community, not replace them or absorb them.

In the end, if we don?t want any SA in SciPy, it?s fine but it should be
> motivated by something other than: it exists elsewhere. Because we are at
> the point where almost everything exists elsewhere.
> Furthermore, I believe SA matches our scope as we have various types of
> analysis of variance (ANOVA) in the roadmap.
>

I'm not sure that the connection between ANOVA (at least, the specific
tools that are on the roadmap) and SA is apt. You use ANOVA and SA to solve
different problems on different objects (datasets vs models, respectively).
I'm not sure about the comparison being made here. Some SA techniques do
use some ANOVA-like analyses internally on specifically-designed
sample points, but I think that's as far as the connection goes. In any
case, what's on the roadmap for ANOVA is really just implementing a handful
of very standard textbook hypothesis tests.

-- 
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/a41b2227/attachment.html>

From ralf.gommers at gmail.com  Sun Apr 11 15:27:41 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 11 Apr 2021 21:27:41 +0200
Subject: [SciPy-Dev] Adding Z-test to scipy.stats
In-Reply-To: <ME2PR01MB5331103C3944EE6184EEF601FA7D9@ME2PR01MB5331.ausprd01.prod.outlook.com>
References: <ME2PR01MB5331103C3944EE6184EEF601FA7D9@ME2PR01MB5331.ausprd01.prod.outlook.com>
Message-ID: <CABL7CQhjX_k=61L1NzZc6fFC9FxD_VYUMHWrsFL-WyZWYdcT_Q@mail.gmail.com>

On Tue, Mar 30, 2021 at 2:53 PM Xingyu Liu <xingyuliu at g.harvard.edu> wrote:

> Hi everyone:
>
>
>
> I'm Xingyu Liu, a first year Data Science master student at Harvard
> University, and I'm thinking of adding Z-test to scipy.stats.
>
>
>
> Z-test is one of the most basic types of hypothesis test and is covered in
> almost all the statistics textbooks (e.g. *Statistics* by David Freedman,
> et,al.) . However, currently, Z-test is not implemented in scipy(see this
> feature request issue https://github.com/scipy/scipy/issues/13662  ). Its
> principles are quite similar to t-test except that it uses the population
> variance rather than the sample variance for calculation, which means that
> the population variance shoud be known in advance. In application, Z-test
> is used when sample size is large (n>50), or the population variance is
> known; t-test is used when sample size is small (n<50) and population
> variance is unknown(https://en.wikipedia.org/wiki/Z-test).
>
>
>
> There are mainly three types of Z-test and the coding part can be quite
> similar to t-test:
>
> 1. One Sample Z-Test: Does the mean of the sample differ from the expected
> mean?
>
> 2. Two Independent Sample Z-Test: Do the means of two independent samples
> differ?
>
> 3. Paired-Sample Z-Test: Do the means of the same sample differ before and
> after?
>
>
>
> Do you think it would be helpful to do this enhancement? If it is, I can
> work on it :)
>

Thanks for proposing this Xingyu. There are multiple papers with 200+
citations on the Z-test, including modified and weighted variants. So it's
probably justified to include it in scipy.stats. That said, there's no one
who responded to this saying they want/need this, so it's not very high
priority. Adding it to the end of your GSoC proposal in case you have time
left seems like the right thing to do here.

Cheers,
Ralf


>
> Cheers,
>
> Xingyu
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210411/9fa13e77/attachment-0001.html>

From tirthasheshpatel at gmail.com  Sun Apr 11 23:00:01 2021
From: tirthasheshpatel at gmail.com (Tirth Patel)
Date: Mon, 12 Apr 2021 08:30:01 +0530
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <CABL7CQh9rPWQbturkDXc+6n+j6DKK153f5uC00S3A+L6+rGE4Q@mail.gmail.com>
References: <CABpuv38XtcJWOT6kskF_Rv3T=_0iSoNCVr7gtnupL0kGQixfWg@mail.gmail.com>
 <CABpuv3_PksR-uLMXhu6DMgwcAYsE9v20Lk=-+GT=ByxvXihv7Q@mail.gmail.com>
 <CABL7CQh9rPWQbturkDXc+6n+j6DKK153f5uC00S3A+L6+rGE4Q@mail.gmail.com>
Message-ID: <CABpuv39W+PbDA2X9kQ0CXbGbYb8tEiHwGHHGgzFW4isyRXmGjw@mail.gmail.com>

Hi Ralf,

Thanks a lot for taking a look and for the positive feedback! I have tried
to address your comments on the doc. Feel free to check them or ask for
more changes.


Kind Regards,
Tirth Patel


On Sun, Apr 11, 2021 at 10:33 PM Ralf Gommers <ralf.gommers at gmail.com>
wrote:

>
>
> On Sun, Apr 11, 2021 at 10:46 AM Tirth Patel <tirthasheshpatel at gmail.com>
> wrote:
>
>> Hi everyone,
>>
>> I have been working on a draft proposal lately and think it is almost
>> ready. Please feel free to add suggestions/comments, if any.
>>
>> Proposal:
>> https://docs.google.com/document/d/1xvRdpNVseTL8d7eWdiyY6rcNxBVKCXt81xE9HyXvtqE/edit?usp=sharing
>>
>
> Hi Tirth, this is a very well-written proposal. I added one comment in
> your GDoc about keeping some time for discussing the API with the community
> and if needed iterating on it. Your proposal seems good to submit though,
> it has enough detail and the plan and milestones look good.
>
> Cheers,
> Ralf
>
>
>
>>
>> Kind regards,
>> Tirth Patel
>>
>>
>>
>> On Fri, Apr 2, 2021 at 7:19 PM Tirth Patel <tirthasheshpatel at gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I would like to participate in GSoC this year and found this project
>>> very interesting!
>>>
>>> TL; DR: I have a few questions regarding the project:
>>>   - Is the user interface desired as a separate python submodule
>>> (inside `scipy.stats`) or does it serve as an extension of the `rvs`
>>> method?
>>>   - Should UNU.RAN C library be included as a submodule within SciPy
>>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>>> gh-13328)?
>>>
>>> About Me
>>> ********
>>> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
>>> science undergrad student. I am quite familiar with Cython and a lot
>>> of my college courses make use of C. I have a good knowledge of
>>> probability theory and statistics.
>>>
>>> Open Source work: I have participated in GSoC with the PyMC team last
>>> year. I am a contributor to SciPy since May 2020 and recently a
>>> maintainer.
>>>
>>> About Project
>>> *************
>>> I had a question about the project. Is the user interface desired as a
>>> separate python submodule inside `scipy.stats`? like:
>>>
>>>     import scipy.stats as stats
>>>
>>>     # sample a 1000 variates from a normal distribution
>>>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
>>>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
>>>
>>>     # sample 100 samples from the beta distribution using TDR method
>>>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
>>>
>>>     # the `rvs` methods remains unaffected.
>>>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
>>>
>>> Or does it serve as an extension of the `rvs` method:
>>>
>>>     from scipy.stats import norm, beta
>>>
>>>     # something like this:
>>>     # method = None => same behaviour as previous versions
>>>     # method = 'auto' => use UNU.RAN and let it choose the method
>>>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
>>>
>>>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
>>>
>>> Also, should UNU.RAN C library be included as a submodule within SciPy
>>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>>> gh-13328)?
>>>
>>>
>>> --
>>> Kind Regards,
>>> Tirth
>>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210412/7fc688b5/attachment.html>

From xingyuliu at g.harvard.edu  Mon Apr 12 00:31:34 2021
From: xingyuliu at g.harvard.edu (Xingyu Liu)
Date: Mon, 12 Apr 2021 04:31:34 +0000
Subject: [SciPy-Dev] Adding Z-test to scipy.stats
In-Reply-To: <CABL7CQhjX_k=61L1NzZc6fFC9FxD_VYUMHWrsFL-WyZWYdcT_Q@mail.gmail.com>
References: <ME2PR01MB5331103C3944EE6184EEF601FA7D9@ME2PR01MB5331.ausprd01.prod.outlook.com>,
 <CABL7CQhjX_k=61L1NzZc6fFC9FxD_VYUMHWrsFL-WyZWYdcT_Q@mail.gmail.com>
Message-ID: <ME2PR01MB533143EAAE0933468BFE2DB6FA709@ME2PR01MB5331.ausprd01.prod.outlook.com>

Hi Ralf?

Thanks! I?ve added it to the end of my GSoC timeline. I can do it if time permits. :)

Best,
Xingyu


________________________________
From: SciPy-Dev <scipy-dev-bounces+xingyuliu=g.harvard.edu at python.org> on behalf of Ralf Gommers <ralf.gommers at gmail.com>
Sent: Monday, April 12, 2021 3:27:41 AM
To: SciPy Developers List <scipy-dev at python.org>
Subject: Re: [SciPy-Dev] Adding Z-test to scipy.stats


On Tue, Mar 30, 2021 at 2:53 PM Xingyu Liu <xingyuliu at g.harvard.edu<mailto:xingyuliu at g.harvard.edu>> wrote:

Hi everyone:


I'm Xingyu Liu, a first year Data Science master student at Harvard University, and I'm thinking of adding Z-test to scipy.stats.


Z-test is one of the most basic types of hypothesis test and is covered in almost all the statistics textbooks (e.g. Statistics by David Freedman, et,al.) . However, currently, Z-test is not implemented in scipy(see this feature request issue https://github.com/scipy/scipy/issues/13662  ). Its principles are quite similar to t-test except that it uses the population variance rather than the sample variance for calculation, which means that the population variance shoud be known in advance. In application, Z-test is used when sample size is large (n>50), or the population variance is known; t-test is used when sample size is small (n<50) and population variance is unknown(https://en.wikipedia.org/wiki/Z-test).


There are mainly three types of Z-test and the coding part can be quite similar to t-test:

1. One Sample Z-Test: Does the mean of the sample differ from the expected mean?

2. Two Independent Sample Z-Test: Do the means of two independent samples differ?

3. Paired-Sample Z-Test: Do the means of the same sample differ before and after?


Do you think it would be helpful to do this enhancement? If it is, I can work on it :)

Thanks for proposing this Xingyu. There are multiple papers with 200+ citations on the Z-test, including modified and weighted variants. So it's probably justified to include it in scipy.stats. That said, there's no one who responded to this saying they want/need this, so it's not very high priority. Adding it to the end of your GSoC proposal in case you have time left seems like the right thing to do here.

Cheers,
Ralf


Cheers,

Xingyu

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210412/d0e1cf34/attachment.html>

From roy.pamphile at gmail.com  Mon Apr 12 03:49:19 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Mon, 12 Apr 2021 09:49:19 +0200
Subject: [SciPy-Dev] Sensitivity analysis module proposal
In-Reply-To: <CAF6FJiutf6bWYF=MpUY5zxPv1CqGV4c=AQU27sgxdC3ZHwCoBg@mail.gmail.com>
References: <37CE6815-2B29-448E-9F65-52273BF3AD14@gmail.com>
 <CAF6FJisMyEKWBf44=+GM4YwO+ShDy8QOkZXP1a06o_KxtVJ9Xg@mail.gmail.com>
 <AE4B423D-60B6-4D30-8E86-7893CBC8D747@gmail.com>
 <CAF6FJiutf6bWYF=MpUY5zxPv1CqGV4c=AQU27sgxdC3ZHwCoBg@mail.gmail.com>
Message-ID: <B622DD0D-6D29-447D-9937-9E2119494727@gmail.com>

Thank you for your detailed reply Robert.

> On 11.04.2021, at 19:55, Robert Kern <robert.kern at gmail.com> wrote:
> 
> "It exists elsewhere" isn't my argument. While this is obviously a judgement call, and my opinion isn't necessarily that of anyone else's, I do have a rough rubric in mind when I consider things for inclusion in scipy. The main guiding principle is to make important functionality available to the scientific Python community. If including that functionality in scipy advances that, great, that's an argument for inclusion. But sometimes, inclusion inside scipy is just a neutral move, and I think that's the case here. That's not dispositive, but then we have to go to more specific reasons, like wanting to use the functionality inside other parts of scipy (like QMC in SHGO).

I guess this will be very subjective. As for the scientific impact, I will just point out that the known textbooks about SA have thousands of citations and are involved with policy-makers through things like JRC, EU.
Cf. people like Andrea Saltelli, Stefano Tarantola, IM Sobol?.

> If the SALib developers expressed interest in merging SALib into scipy, that'd be one thing. But if they are interested in maintaining it as an independent project, I would recommend contributing to it to build on their success instead of starting from scratch. As a tentpole project of the scientific Python community, we want to support the efforts of the whole community, not replace them or absorb them.

Just to clarify that my proposal does not seek to eat other libraries, more to provide the basic tools for them.

Let see what others think then :)

Cheers,
Pamphile 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210412/3847cd0c/attachment-0001.html>

From ralf.gommers at gmail.com  Mon Apr 12 04:09:52 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 12 Apr 2021 10:09:52 +0200
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <CABpuv39W+PbDA2X9kQ0CXbGbYb8tEiHwGHHGgzFW4isyRXmGjw@mail.gmail.com>
References: <CABpuv38XtcJWOT6kskF_Rv3T=_0iSoNCVr7gtnupL0kGQixfWg@mail.gmail.com>
 <CABpuv3_PksR-uLMXhu6DMgwcAYsE9v20Lk=-+GT=ByxvXihv7Q@mail.gmail.com>
 <CABL7CQh9rPWQbturkDXc+6n+j6DKK153f5uC00S3A+L6+rGE4Q@mail.gmail.com>
 <CABpuv39W+PbDA2X9kQ0CXbGbYb8tEiHwGHHGgzFW4isyRXmGjw@mail.gmail.com>
Message-ID: <CABL7CQgBTtjStXr8F2iTOVCHnsM5+KarwPLc1QiL17zyn8WbxA@mail.gmail.com>

On Mon, Apr 12, 2021 at 5:00 AM Tirth Patel <tirthasheshpatel at gmail.com>
wrote:

> Hi Ralf,
>
> Thanks a lot for taking a look and for the positive feedback! I have tried
> to address your comments on the doc. Feel free to check them or ask for
> more changes.
>

Looks good. I suggest submitting it, since the deadline is tomorrow - don't
leave it till the last minute:)

Cheers,
Ralf


>
> Kind Regards,
> Tirth Patel
>
>
> On Sun, Apr 11, 2021 at 10:33 PM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>>
>>
>> On Sun, Apr 11, 2021 at 10:46 AM Tirth Patel <tirthasheshpatel at gmail.com>
>> wrote:
>>
>>> Hi everyone,
>>>
>>> I have been working on a draft proposal lately and think it is almost
>>> ready. Please feel free to add suggestions/comments, if any.
>>>
>>> Proposal:
>>> https://docs.google.com/document/d/1xvRdpNVseTL8d7eWdiyY6rcNxBVKCXt81xE9HyXvtqE/edit?usp=sharing
>>>
>>
>> Hi Tirth, this is a very well-written proposal. I added one comment in
>> your GDoc about keeping some time for discussing the API with the community
>> and if needed iterating on it. Your proposal seems good to submit though,
>> it has enough detail and the plan and milestones look good.
>>
>> Cheers,
>> Ralf
>>
>>
>>
>>>
>>> Kind regards,
>>> Tirth Patel
>>>
>>>
>>>
>>> On Fri, Apr 2, 2021 at 7:19 PM Tirth Patel <tirthasheshpatel at gmail.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I would like to participate in GSoC this year and found this project
>>>> very interesting!
>>>>
>>>> TL; DR: I have a few questions regarding the project:
>>>>   - Is the user interface desired as a separate python submodule
>>>> (inside `scipy.stats`) or does it serve as an extension of the `rvs`
>>>> method?
>>>>   - Should UNU.RAN C library be included as a submodule within SciPy
>>>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>>>> gh-13328)?
>>>>
>>>> About Me
>>>> ********
>>>> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
>>>> science undergrad student. I am quite familiar with Cython and a lot
>>>> of my college courses make use of C. I have a good knowledge of
>>>> probability theory and statistics.
>>>>
>>>> Open Source work: I have participated in GSoC with the PyMC team last
>>>> year. I am a contributor to SciPy since May 2020 and recently a
>>>> maintainer.
>>>>
>>>> About Project
>>>> *************
>>>> I had a question about the project. Is the user interface desired as a
>>>> separate python submodule inside `scipy.stats`? like:
>>>>
>>>>     import scipy.stats as stats
>>>>
>>>>     # sample a 1000 variates from a normal distribution
>>>>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
>>>>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
>>>>
>>>>     # sample 100 samples from the beta distribution using TDR method
>>>>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
>>>>
>>>>     # the `rvs` methods remains unaffected.
>>>>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
>>>>
>>>> Or does it serve as an extension of the `rvs` method:
>>>>
>>>>     from scipy.stats import norm, beta
>>>>
>>>>     # something like this:
>>>>     # method = None => same behaviour as previous versions
>>>>     # method = 'auto' => use UNU.RAN and let it choose the method
>>>>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
>>>>
>>>>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
>>>>
>>>> Also, should UNU.RAN C library be included as a submodule within SciPy
>>>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>>>> gh-13328)?
>>>>
>>>>
>>>> --
>>>> Kind Regards,
>>>> Tirth
>>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210412/06bc671e/attachment.html>

From tirthasheshpatel at gmail.com  Mon Apr 12 04:40:41 2021
From: tirthasheshpatel at gmail.com (Tirth Patel)
Date: Mon, 12 Apr 2021 14:10:41 +0530
Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
In-Reply-To: <CABL7CQgBTtjStXr8F2iTOVCHnsM5+KarwPLc1QiL17zyn8WbxA@mail.gmail.com>
References: <CABpuv38XtcJWOT6kskF_Rv3T=_0iSoNCVr7gtnupL0kGQixfWg@mail.gmail.com>
 <CABpuv3_PksR-uLMXhu6DMgwcAYsE9v20Lk=-+GT=ByxvXihv7Q@mail.gmail.com>
 <CABL7CQh9rPWQbturkDXc+6n+j6DKK153f5uC00S3A+L6+rGE4Q@mail.gmail.com>
 <CABpuv39W+PbDA2X9kQ0CXbGbYb8tEiHwGHHGgzFW4isyRXmGjw@mail.gmail.com>
 <CABL7CQgBTtjStXr8F2iTOVCHnsM5+KarwPLc1QiL17zyn8WbxA@mail.gmail.com>
Message-ID: <CABpuv3-+J-8SgxYymnFkh=8iGP0wk2ipjnaQNYZpXTxm00B4sQ@mail.gmail.com>

Submitted :)

On Mon, Apr 12, 2021 at 1:40 PM Ralf Gommers <ralf.gommers at gmail.com> wrote:

>
>
> On Mon, Apr 12, 2021 at 5:00 AM Tirth Patel <tirthasheshpatel at gmail.com>
> wrote:
>
>> Hi Ralf,
>>
>> Thanks a lot for taking a look and for the positive feedback! I have
>> tried to address your comments on the doc. Feel free to check them or ask
>> for more changes.
>>
>
> Looks good. I suggest submitting it, since the deadline is tomorrow -
> don't leave it till the last minute:)
>
> Cheers,
> Ralf
>
>
>>
>> Kind Regards,
>> Tirth Patel
>>
>>
>> On Sun, Apr 11, 2021 at 10:33 PM Ralf Gommers <ralf.gommers at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Sun, Apr 11, 2021 at 10:46 AM Tirth Patel <tirthasheshpatel at gmail.com>
>>> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I have been working on a draft proposal lately and think it is almost
>>>> ready. Please feel free to add suggestions/comments, if any.
>>>>
>>>> Proposal:
>>>> https://docs.google.com/document/d/1xvRdpNVseTL8d7eWdiyY6rcNxBVKCXt81xE9HyXvtqE/edit?usp=sharing
>>>>
>>>
>>> Hi Tirth, this is a very well-written proposal. I added one comment in
>>> your GDoc about keeping some time for discussing the API with the community
>>> and if needed iterating on it. Your proposal seems good to submit though,
>>> it has enough detail and the plan and milestones look good.
>>>
>>> Cheers,
>>> Ralf
>>>
>>>
>>>
>>>>
>>>> Kind regards,
>>>> Tirth Patel
>>>>
>>>>
>>>>
>>>> On Fri, Apr 2, 2021 at 7:19 PM Tirth Patel <tirthasheshpatel at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I would like to participate in GSoC this year and found this project
>>>>> very interesting!
>>>>>
>>>>> TL; DR: I have a few questions regarding the project:
>>>>>   - Is the user interface desired as a separate python submodule
>>>>> (inside `scipy.stats`) or does it serve as an extension of the `rvs`
>>>>> method?
>>>>>   - Should UNU.RAN C library be included as a submodule within SciPy
>>>>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>>>>> gh-13328)?
>>>>>
>>>>> About Me
>>>>> ********
>>>>> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
>>>>> science undergrad student. I am quite familiar with Cython and a lot
>>>>> of my college courses make use of C. I have a good knowledge of
>>>>> probability theory and statistics.
>>>>>
>>>>> Open Source work: I have participated in GSoC with the PyMC team last
>>>>> year. I am a contributor to SciPy since May 2020 and recently a
>>>>> maintainer.
>>>>>
>>>>> About Project
>>>>> *************
>>>>> I had a question about the project. Is the user interface desired as a
>>>>> separate python submodule inside `scipy.stats`? like:
>>>>>
>>>>>     import scipy.stats as stats
>>>>>
>>>>>     # sample a 1000 variates from a normal distribution
>>>>>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
>>>>>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
>>>>>
>>>>>     # sample 100 samples from the beta distribution using TDR method
>>>>>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
>>>>>
>>>>>     # the `rvs` methods remains unaffected.
>>>>>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
>>>>>
>>>>> Or does it serve as an extension of the `rvs` method:
>>>>>
>>>>>     from scipy.stats import norm, beta
>>>>>
>>>>>     # something like this:
>>>>>     # method = None => same behaviour as previous versions
>>>>>     # method = 'auto' => use UNU.RAN and let it choose the method
>>>>>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
>>>>>
>>>>>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
>>>>>
>>>>> Also, should UNU.RAN C library be included as a submodule within SciPy
>>>>> (e.g. gh-12043) or be cloned from a separate GitHub submodule (e.g
>>>>> gh-13328)?
>>>>>
>>>>>
>>>>> --
>>>>> Kind Regards,
>>>>> Tirth
>>>>>
>>>> _______________________________________________
>>>> SciPy-Dev mailing list
>>>> SciPy-Dev at python.org
>>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>

Kind Regards,
Tirth Patel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210412/18e93a96/attachment-0001.html>

From sayedmazen70 at gmail.com  Mon Apr 12 05:05:31 2021
From: sayedmazen70 at gmail.com (Mazen Sayed)
Date: Mon, 12 Apr 2021 11:05:31 +0200
Subject: [SciPy-Dev] GSoC SciPy Optimization Ideas
In-Reply-To: <CAFTPoAw3iMJS2NUtcNYcDs3KRT1BSvpg-XfRgjuY68K6RXWuZg@mail.gmail.com>
References: <CAFTPoAyk_1Y2c6TeSQkXfKZmRgm1qP69pzMB4agTjADzEsUOkA@mail.gmail.com>
 <CALj43ZRVyJ_mhFRgbfrnkh4d-19JQXM4ayj5P_vQ_1FHxy688w@mail.gmail.com>
 <CAEf70bwQ0xh995xp8Mscqj=zfFNUci9c1qcRUQBdMVTxFoJ2YQ@mail.gmail.com>
 <3B55D50F-9DAE-426A-893C-3F3CEB38CF90@gmail.com>
 <CAFTPoAw3iMJS2NUtcNYcDs3KRT1BSvpg-XfRgjuY68K6RXWuZg@mail.gmail.com>
Message-ID: <CAFTPoAzVyyPCmG6JuT-iVeYm7QKLmyaxqG2V07P5gD=fm6W=tQ@mail.gmail.com>

Hello,

I have changed the normalizer to be a user specified argument, also I have
changed the initializer so the user can enter multiple data points, and the
initializer will take the best initial point (that minimize the
unconstrained problem), or the user can let the initializer guess the
initial point by using random search, grid search or global optimizer.

Any modification before submission?

Thanks

https://drive.google.com/file/d/1c9JOJgcq_Ss761rt9SOxnmEtuczuLzAb/view?usp=sharing


On Sun, Apr 11, 2021 at 7:26 PM Mazen Sayed <sayedmazen70 at gmail.com> wrote:

>
> This sounds really good and more reasonable.
>
> Thanks for your help.
>
>
>
> On Sun, Apr 11, 2021 at 7:08 PM Pamphile Roy <roy.pamphile at gmail.com>
> wrote:
>
>> I tend to agree with Daniel here, randomly choosing 50 points in a
>> high-dimensional optimization space is not going to give any advantage. And
>> why 50?
>>
>> The initialization part is one of the most important (and difficult to
>> get right) part of any optimization algorithm, but this is mostly true for
>> global ones: differential evolution, SHGO, Dual Annealing they?re all have
>> their own way. Some of these and many others (especially local algorithms)
>> rely on the user to explicitly pass an initial guess and take it from there.
>>
>>
>> Agreed here, the random sampling must not be totally random in order to
>> cover the parameter space in the most efficient way.
>> The global optimizers all use QMC methods (scipy.stats.qmc) so here you
>> can just rely on these too.
>>
>> Cheers,
>> Pamphile
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210412/36039cf2/attachment.html>

From b.rosa at unistra.fr  Mon Apr 12 05:21:22 2021
From: b.rosa at unistra.fr (Benoit Rosa)
Date: Mon, 12 Apr 2021 11:21:22 +0200
Subject: [SciPy-Dev] Some remarks on least-squares algorithms
Message-ID: <cb34aa9b-8aa1-a25a-0c28-83a63a16e7d7@unistra.fr>

Hi all,

I've been experimenting lately with the least squares algorithms on 
SciPy, and noted a few interesting things.

First, some backstory: with a Grad student, we are looking at soving 
complex BVP problems, with lots of variables. The first step was 
reimplementing a known algorithm from the literature. In the original 
paper, the problem is square (126 elements in the state vector, and 126 
elements in the error vector). But as soon as we changed one variable of 
the problem, it became underdetermined (i.e. less elements in the error 
vector than in the state vector).

The student originally implemented all in MATLAB, using 
Levenberg-Marquardt for the solver. It was working well, even for the 
underdetermined case (I'll come back to this point). However, we noticed 
that on the same machine, with the same solver parameters, results were 
quite different with different versions of Matlab. Sometimes leading to 
failures in the optimization in Matlab 2019 and success in Matlab 2018 
(version numbers not fully accurate, but you get the idea).

For this reason, we decided to switch to Python and SciPy, in order to 
profit from more stability in the optimization algorithms (and also the 
possibility get answers from the dev community in case of problems).

The first remark in the process is that SciPy's LM implementation (based 
on MINPACK) does not allow to solve underdetermined problems. 
Interestingly, both Matlab and SciPy have three available methods to 
solve nonlinear least squares : LM, dogbox and trust region reflective. 
MATLAB does allow underdetermined problems for LM, but not for dogbox 
and trf, while SciPy does allow underdetermined problems for dogbox and 
trf, but not for LM !

The second remark is that, when looking at MATLAB's implementation, I 
found this:

===
 ??? if ~scaleStep?? % Unscaled Jacobian
 ??????? if ~jacIsSparse
 ??????????? scaleMat = sqrt(lambda)*eye(nVar);
 ??????? else
 ??????????? scaleMat = sqrt(lambda)*speye(nVar);
 ??????? end
 ??? else
 ??????? if ~jacIsSparse
 ??????????? scaleMat = diag(sqrt(lambda*diagJacTJac));
 ??????? else
 ??????????? scaleMat = spdiags(sqrt(lambda*diagJacTJac),0,nVar,nVar);
 ??????? end
 ??? end

 ??? % Compute LM step
 ??? if successfulStep
 ??????? % Augmented Jacobian
 ??????? AugJac = [JAC; scaleMat];
 ??????? AugRes = [-costFun; zeroPad]; % Augmented residual
 ??? else
 ??????? % If previous step failed, replace only the part of the matrix 
that has changed
 ??????? AugJac(nfun+1:end,:) = scaleMat;
 ??? end
====

Bottom line is, they augment the Jacobian using "AugJac = [JAC; 
scaleMat];", even if the Jacobian Scaling option is off (first "if'). In 
the end, the augmented Jacobian will always be square and invertible, 
which means their implementation of the algorithm is applicable to any 
problem, determined or underdetermined.

I know that an underdetermined problem is fundamentally problematic, 
since there isn't a unique solution to it. However, in this case, Matlab 
does solve the problem efficiently and SciPy does not. I know SciPy's LM 
is based on MINPACK, which is a verified and benchmarked solution, but 
I'm wondering whether the LM algorithm in Scipy should be updated ? 
Also, a Python (or Cython?) implementation would? allow to follow the 
optimization process (right now, verbose=2 doesn't output anything 
because of the MINPACK routine being called).

I'm afraid I can't share the code I used because it's ongoing research, 
but I'd be glad to get your thoughts on these points.

Best,
Benoit

-- 
Mr. Benoit ROSA, PhD
Research Scientist / Charg? de recherche CNRS

ICube (UMR 7357 CNRS-University of Strasbourg)
s/c IHU
1, place de l'H?pital
67091 Strasbourg Cedex, France

b.rosa at unistra.fr


From robert.kern at gmail.com  Mon Apr 12 11:10:15 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 12 Apr 2021 11:10:15 -0400
Subject: [SciPy-Dev] Sensitivity analysis module proposal
In-Reply-To: <B622DD0D-6D29-447D-9937-9E2119494727@gmail.com>
References: <37CE6815-2B29-448E-9F65-52273BF3AD14@gmail.com>
 <CAF6FJisMyEKWBf44=+GM4YwO+ShDy8QOkZXP1a06o_KxtVJ9Xg@mail.gmail.com>
 <AE4B423D-60B6-4D30-8E86-7893CBC8D747@gmail.com>
 <CAF6FJiutf6bWYF=MpUY5zxPv1CqGV4c=AQU27sgxdC3ZHwCoBg@mail.gmail.com>
 <B622DD0D-6D29-447D-9937-9E2119494727@gmail.com>
Message-ID: <CAF6FJisQoRx6j9+OHZdH4Ok6Sgnht4y28Uy-WA5hX-uR6_ip6Q@mail.gmail.com>

On Mon, Apr 12, 2021 at 3:50 AM Pamphile Roy <roy.pamphile at gmail.com> wrote:

> Thank you for your detailed reply Robert.
>
> On 11.04.2021, at 19:55, Robert Kern <robert.kern at gmail.com> wrote:
>
> "It exists elsewhere" isn't my argument. While this is obviously a
> judgement call, and my opinion isn't necessarily that of anyone else's, I
> do have a rough rubric in mind when I consider things for inclusion in
> scipy. The main guiding principle is to make important functionality
> available to the scientific Python community. If including that
> functionality in scipy advances that, great, that's an argument for
> inclusion. But sometimes, inclusion inside scipy is just a neutral move,
> and I think that's the case here. That's not dispositive, but then we have
> to go to more specific reasons, like wanting to use the functionality
> inside other parts of scipy (like QMC in SHGO).
>
>
> I guess this will be very subjective. As for the scientific impact, I will
> just point out that the known textbooks about SA have thousands of
> citations and are involved with policy-makers through things like JRC, EU.
> Cf. people like Andrea Saltelli, Stefano Tarantola, IM Sobol?.
>

To be clear, I'm taking the importance of methods as a given. Your original
email was quite convincing on that matter.


> If the SALib developers expressed interest in merging SALib into scipy,
> that'd be one thing. But if they are interested in maintaining it as an
> independent project, I would recommend contributing to it to build on their
> success instead of starting from scratch. As a tentpole project of the
> scientific Python community, we want to support the efforts of the whole
> community, not replace them or absorb them.
>
>
> Just to clarify that my proposal does not seek to eat other libraries,
> more to provide the basic tools for them.
>

The issue as I see it is that SALib *is* the basic toolset that you would
want to have. It's not a fancy, heavy UQ framework. It's just some
functions to compute some sample points for your model to evaluate and
other functions to spit out sensitivity reports on those evaluations. If
SALib is missing a few more tools, I encourage contributing them to SALib
and not scipy.

-- 
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210412/7f6f4d65/attachment-0001.html>

From robert.kern at gmail.com  Mon Apr 12 11:24:24 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 12 Apr 2021 11:24:24 -0400
Subject: [SciPy-Dev] Some remarks on least-squares algorithms
In-Reply-To: <cb34aa9b-8aa1-a25a-0c28-83a63a16e7d7@unistra.fr>
References: <cb34aa9b-8aa1-a25a-0c28-83a63a16e7d7@unistra.fr>
Message-ID: <CAF6FJiux2-=1oXn9SjyGyeQAZYx4yv_yt91QezVWn97sjDRBJA@mail.gmail.com>

On Mon, Apr 12, 2021 at 5:30 AM Benoit Rosa <b.rosa at unistra.fr> wrote:

>
> The second remark is that, when looking at MATLAB's implementation, I
> found this:
>

First, thank you for your comments.

Second, I know it seems innocuous, but please do not post snippets of
MATLAB's source code here. They are under copyright and presumably MATLAB's
proprietary license. We can't refer to that code when writing our own
implementation, even if we are just reading this email with the snippet. We
can refer to open source code under a liberal license like SciPy's, and we
can refer to papers or even MATLAB's documentation that describe the
algorithm but not in code. Having the snippet posted here puts us in an
awkward position with respect to actually following up on your request.

Thank you for your consideration.

-- 
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210412/12ec7d5a/attachment.html>

From roy.pamphile at gmail.com  Mon Apr 12 12:13:29 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Mon, 12 Apr 2021 18:13:29 +0200
Subject: [SciPy-Dev] Roadmap
Message-ID: <35A49B1C-3E58-424A-9280-0D32EA89137F@gmail.com>

Hi everyone,

I have opened a PR to discuss about updating the roadmap: https://github.com/scipy/scipy/pull/13849 <https://github.com/scipy/scipy/pull/13849>

This came up as I was fixing a broken link on Scipy.org <http://scipy.org/> and we said (with Ralf) that we could add the redesign/change of purpose of the
website to the roadmap.

Please come and participate in the discussion :)

Cheers,
Pamphile
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210412/127b3c3e/attachment.html>

From cmkleffner at gmail.com  Tue Apr 13 16:01:35 2021
From: cmkleffner at gmail.com (Carl Kleffner)
Date: Tue, 13 Apr 2021 22:01:35 +0200
Subject: [SciPy-Dev] Some remarks on least-squares algorithms
In-Reply-To: <CAF6FJiux2-=1oXn9SjyGyeQAZYx4yv_yt91QezVWn97sjDRBJA@mail.gmail.com>
References: <cb34aa9b-8aa1-a25a-0c28-83a63a16e7d7@unistra.fr>
 <CAF6FJiux2-=1oXn9SjyGyeQAZYx4yv_yt91QezVWn97sjDRBJA@mail.gmail.com>
Message-ID: <CAGGsPMzV1-Nc7TnEoqJ-7WE95XG1WMoGABw2KNfCD1myYrzJjA@mail.gmail.com>

You may take a look at ARLS: Automatically Regularized Linear System Solver
<https://pypi.org/project/arls/> based on scipy.linalg.
It is a heuristic solver (using the picard condition) for over-determined /
under-determined or ill-conditioned problems.

Cheers

Carl

Am Mo., 12. Apr. 2021 um 17:24 Uhr schrieb Robert Kern <
robert.kern at gmail.com>:

> On Mon, Apr 12, 2021 at 5:30 AM Benoit Rosa <b.rosa at unistra.fr> wrote:
>
>>
>> The second remark is that, when looking at MATLAB's implementation, I
>> found this:
>>
>
> First, thank you for your comments.
>
> Second, I know it seems innocuous, but please do not post snippets of
> MATLAB's source code here. They are under copyright and presumably MATLAB's
> proprietary license. We can't refer to that code when writing our own
> implementation, even if we are just reading this email with the snippet. We
> can refer to open source code under a liberal license like SciPy's, and we
> can refer to papers or even MATLAB's documentation that describe the
> algorithm but not in code. Having the snippet posted here puts us in an
> awkward position with respect to actually following up on your request.
>
> Thank you for your consideration.
>
> --
> Robert Kern
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210413/5c23bebd/attachment.html>

From andyfaff at gmail.com  Tue Apr 13 20:53:05 2021
From: andyfaff at gmail.com (Andrew Nelson)
Date: Wed, 14 Apr 2021 10:53:05 +1000
Subject: [SciPy-Dev] GSoC SciPy Optimization Ideas
In-Reply-To: <CAFTPoAzVyyPCmG6JuT-iVeYm7QKLmyaxqG2V07P5gD=fm6W=tQ@mail.gmail.com>
References: <CAFTPoAyk_1Y2c6TeSQkXfKZmRgm1qP69pzMB4agTjADzEsUOkA@mail.gmail.com>
 <CALj43ZRVyJ_mhFRgbfrnkh4d-19JQXM4ayj5P_vQ_1FHxy688w@mail.gmail.com>
 <CAEf70bwQ0xh995xp8Mscqj=zfFNUci9c1qcRUQBdMVTxFoJ2YQ@mail.gmail.com>
 <3B55D50F-9DAE-426A-893C-3F3CEB38CF90@gmail.com>
 <CAFTPoAw3iMJS2NUtcNYcDs3KRT1BSvpg-XfRgjuY68K6RXWuZg@mail.gmail.com>
 <CAFTPoAzVyyPCmG6JuT-iVeYm7QKLmyaxqG2V07P5gD=fm6W=tQ@mail.gmail.com>
Message-ID: <CAAbtOZeDzEhDynwPLEkEh9bqNwMctxp0Lv42z778k2RfP+WTpg@mail.gmail.com>

Dear Mazen,
I don't hink that the proposals you outlined are clear enough at the
moment, and would benefit from addressing the following points.

- the penalty method is the most interesting part of the proposal. I think
a bit more expansion on this point would be useful. For example, how easy
would it be to turn any constrained problem into an objective function that
could be used with any unconstrained minimizer? Could that respecification
work with existing interfaces for expressing constraints? Could the
existing minimizer interface be left untouched, and function wrappers or
classes written to convert those constrained systems into a function given
to unmodified existing code?

- Vectorisation in optimize sounds like a nice idea, but may be very
difficult to implement in practice. From your document it seems that you're
mainly proposing to vectorise derivative calculation using finite
differences. The finite difference routine `approx_derivative` in _numdiff
would have to be rewritten to perform in a vectorised manner, across
different dimensions (R^m -> R->n), with a vectorised objective function.
How would the stack trace from the initial `minimize` call through to
approx_derivative have to change? (this is definitely way more than a weeks
work). If you had profiling to demonstrate that vectorisation would provide
a significant performance gain that would be great.

- it's unclear what the normalisation (section 5) refers to? Is this an
automated way of selecting the correct value for a Lagrangian multiplier?

- is the smart initialisation (section 7.2) a way of selecting the correct
value for a Lagrangian multiplier used for the constrained->unconstrained
transform? Or is it a variation on `brute + polish`? If the latter then how
would this be any better than shgo/dual_annealing/differential_evolution?

- improvement of documentation is always welcomed, but there's not a huge
explanation of the specific target areas for improvement.

regards,
Andrew.

On Mon, 12 Apr 2021 at 19:06, Mazen Sayed <sayedmazen70 at gmail.com> wrote:

> Hello,
>
> I have changed the normalizer to be a user specified argument, also I have
> changed the initializer so the user can enter multiple data points, and the
> initializer will take the best initial point (that minimize the
> unconstrained problem), or the user can let the initializer guess the
> initial point by using random search, grid search or global optimizer.
>
> Any modification before submission?
>
> Thanks
>
>
> https://drive.google.com/file/d/1c9JOJgcq_Ss761rt9SOxnmEtuczuLzAb/view?usp=sharing
>
>
>
> On Sun, Apr 11, 2021 at 7:26 PM Mazen Sayed <sayedmazen70 at gmail.com>
> wrote:
>
>>
>> This sounds really good and more reasonable.
>>
>> Thanks for your help.
>>
>>
>>
>> On Sun, Apr 11, 2021 at 7:08 PM Pamphile Roy <roy.pamphile at gmail.com>
>> wrote:
>>
>>> I tend to agree with Daniel here, randomly choosing 50 points in a
>>> high-dimensional optimization space is not going to give any advantage. And
>>> why 50?
>>>
>>> The initialization part is one of the most important (and difficult to
>>> get right) part of any optimization algorithm, but this is mostly true for
>>> global ones: differential evolution, SHGO, Dual Annealing they?re all have
>>> their own way. Some of these and many others (especially local algorithms)
>>> rely on the user to explicitly pass an initial guess and take it from there.
>>>
>>>
>>> Agreed here, the random sampling must not be totally random in order to
>>> cover the parameter space in the most efficient way.
>>> The global optimizers all use QMC methods (scipy.stats.qmc) so here you
>>> can just rely on these too.
>>>
>>> Cheers,
>>> Pamphile
>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>
>> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>


-- 
_____________________________________
Dr. Andrew Nelson


_____________________________________
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210414/ec5aa08f/attachment.html>

From cmkleffner at gmail.com  Wed Apr 14 06:29:28 2021
From: cmkleffner at gmail.com (Carl Kleffner)
Date: Wed, 14 Apr 2021 12:29:28 +0200
Subject: [SciPy-Dev] Some remarks on least-squares algorithms
In-Reply-To: <CAGGsPMzV1-Nc7TnEoqJ-7WE95XG1WMoGABw2KNfCD1myYrzJjA@mail.gmail.com>
References: <cb34aa9b-8aa1-a25a-0c28-83a63a16e7d7@unistra.fr>
 <CAF6FJiux2-=1oXn9SjyGyeQAZYx4yv_yt91QezVWn97sjDRBJA@mail.gmail.com>
 <CAGGsPMzV1-Nc7TnEoqJ-7WE95XG1WMoGABw2KNfCD1myYrzJjA@mail.gmail.com>
Message-ID: <CAGGsPMwS4Jmr0LMjYxCvYqzecOkr_R18rbt7jJ4+4V31zcAPAA@mail.gmail.com>

I oversight that your problem is non-linear. You may try out DFO-GN:

DFO-GN: A Derivative-Free Gauss-Newton Solver
https://github.com/numericalalgorithmsgroup/dfogn
https://numericalalgorithmsgroup.github.io/dfogn/build/html/index.html

However, I didn't test it myself.

Carl

Am Di., 13. Apr. 2021 um 22:01 Uhr schrieb Carl Kleffner <
cmkleffner at gmail.com>:

> You may take a look at ARLS: Automatically Regularized Linear System
> Solver <https://pypi.org/project/arls/> based on scipy.linalg.
> It is a heuristic solver (using the picard condition) for over-determined
> / under-determined or ill-conditioned problems.
>
> Cheers
>
> Carl
>
> Am Mo., 12. Apr. 2021 um 17:24 Uhr schrieb Robert Kern <
> robert.kern at gmail.com>:
>
>> On Mon, Apr 12, 2021 at 5:30 AM Benoit Rosa <b.rosa at unistra.fr> wrote:
>>
>>>
>>> The second remark is that, when looking at MATLAB's implementation, I
>>> found this:
>>>
>>
>> First, thank you for your comments.
>>
>> Second, I know it seems innocuous, but please do not post snippets of
>> MATLAB's source code here. They are under copyright and presumably MATLAB's
>> proprietary license. We can't refer to that code when writing our own
>> implementation, even if we are just reading this email with the snippet. We
>> can refer to open source code under a liberal license like SciPy's, and we
>> can refer to papers or even MATLAB's documentation that describe the
>> algorithm but not in code. Having the snippet posted here puts us in an
>> awkward position with respect to actually following up on your request.
>>
>> Thank you for your consideration.
>>
>> --
>> Robert Kern
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210414/dfbdaca6/attachment-0001.html>

From samwallan at icloud.com  Fri Apr 16 23:16:58 2021
From: samwallan at icloud.com (Sam Wallan)
Date: Fri, 16 Apr 2021 20:16:58 -0700
Subject: [SciPy-Dev] PR: add trimming to ttest_ind
Message-ID: <E67D9242-7B99-4259-87C6-3A109C61371C@icloud.com>

Good evening,

I have an open PR, https://github.com/scipy/scipy/pull/13696 <https://github.com/scipy/scipy/pull/13696>, regarding adding trimming capabilities to `stats.ttest_ind`. It is based off of a stale PR, and involves the interface change to add the `trim` keyword to the method.

Please feel free to join in on the discussion of this interface addition and the PR. We?d like to leave it open for another week for input.

Thanks in advance,
Samuel Wallan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210416/c0cf4537/attachment.html>

From ralf.gommers at gmail.com  Sat Apr 17 03:34:21 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 17 Apr 2021 09:34:21 +0200
Subject: [SciPy-Dev] PR: add trimming to ttest_ind
In-Reply-To: <E67D9242-7B99-4259-87C6-3A109C61371C@icloud.com>
References: <E67D9242-7B99-4259-87C6-3A109C61371C@icloud.com>
Message-ID: <CABL7CQjNEUi17XLX+6VwhCF3=UEQnxm=4j-dAWs=M8xnd4a4pg@mail.gmail.com>

On Sat, Apr 17, 2021 at 5:36 AM Sam Wallan <samwallan at icloud.com> wrote:

> Good evening,
>
> I have an open PR, https://github.com/scipy/scipy/pull/13696, regarding
> adding trimming capabilities to `stats.ttest_ind`. It is based off of a
> stale PR, and involves the interface change to add the `trim` keyword to
> the method.
>
> Please feel free to join in on the discussion of this interface addition
> and the PR. We?d like to leave it open for another week for input.
>

Thanks for the heads up Samuel, that PR looks quite good.

Cheers,
Ralf


> Thanks in advance,
> Samuel Wallan
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210417/89262d7c/attachment.html>

From nico.schloemer at gmail.com  Sat Apr 17 10:18:16 2021
From: nico.schloemer at gmail.com (=?UTF-8?Q?Nico_Schl=C3=B6mer?=)
Date: Sat, 17 Apr 2021 16:18:16 +0200
Subject: [SciPy-Dev] scipy.optimize: make out.x same shape as x0
Message-ID: <CAK6Z60dyjb2nFANjtJUcdyubwDRjLv3jU=HC-Chs0D83cUcK8A@mail.gmail.com>

Hi everyone,

I made a proposal for scipy.optimize/minimize methods to return the
solution out.x in the same shape as the input x0. Rationale is here
[1], and it was suggested to post the proposal here.

I'd be willing to work on this, of course. Let me know your thoughts!

Cheers,
Nico

[1] https://github.com/scipy/scipy/issues/13869

From roy.pamphile at gmail.com  Sun Apr 18 16:51:52 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Sun, 18 Apr 2021 22:51:52 +0200
Subject: [SciPy-Dev] RNG and examples
Message-ID: <48ABDF94-4F36-468F-9276-3892574692D8@gmail.com>

Hi everyone,

We just merged a PR about using RNG in examples.

From now one, please only use the new NumPy np.random.Generator API and don?t explicitly seed (either the generator or globally).
This is the new canonical way:

>>> rng = np.random.default_rng()
>>> sample = rng.random(...)

The first line is initializing the generator, BUT, it will get overridden to add a seed. So the examples are still deterministic.

We are doing all this to promote good practices among users.
It?s very important to use the new NumPy API and to use sensible seeding only when appropriate (testing for instance).

For more discussions, please see the PR: https://github.com/scipy/scipy/pull/13863 <https://github.com/scipy/scipy/pull/13863>

Thanks everyone.

Cheers,
Pamphile
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210418/5c080412/attachment.html>

From fricke at gmail.com  Tue Apr 20 18:30:21 2021
From: fricke at gmail.com (Tobin Fricke)
Date: Tue, 20 Apr 2021 15:30:21 -0700
Subject: [SciPy-Dev] Composability of LTI objects
Message-ID: <CAAeLUT6qmQ7uu522pfiFe+20_dZtb+n2qZCa1r4PVz3xN6P_=g@mail.gmail.com>

Hi SciPy-devs,

I recently began working with LTI objects in SciPy, and was surprised to
discover that objects like scipy.signal.TransferFunction do not appear to
be composable. I would like to be able to compose two transfer functions
with addition, or indicate the series combination of a TransferFunction
object with a ZerosPolesGain object using the * operator.

Is there a reason this feature has not been pursued? If not, I would be
happy to work on it.

thanks,
Tobin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210420/594bc325/attachment.html>

From ilhanpolat at gmail.com  Wed Apr 21 13:02:30 2021
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Wed, 21 Apr 2021 19:02:30 +0200
Subject: [SciPy-Dev] Composability of LTI objects
In-Reply-To: <CAAeLUT6qmQ7uu522pfiFe+20_dZtb+n2qZCa1r4PVz3xN6P_=g@mail.gmail.com>
References: <CAAeLUT6qmQ7uu522pfiFe+20_dZtb+n2qZCa1r4PVz3xN6P_=g@mail.gmail.com>
Message-ID: <CAEBuzr9pHy=xi+fPH9k5GcLXk2vCuBO3pM-+8pjZiJjkkQmKVg@mail.gmail.com>

Hi Tobin,

It is mostly blocked by certain legacy behaviors. We discussed a few times
the status of the control-ish parts of signal and the majority feels like
people should probably be better off with python-control, harold, or
whatever else specialized package seems to be suitable for their workflow.

This is not to say that we gave up on the LTI objects and that part of the
signal but the wiggle room is indeed quite limited for new features.


Disclosure: I am the author of harold but I am not finding too much time
lately for a new version which is long overdue

On Wed, Apr 21, 2021 at 12:32 AM Tobin Fricke <fricke at gmail.com> wrote:

> Hi SciPy-devs,
>
> I recently began working with LTI objects in SciPy, and was surprised to
> discover that objects like scipy.signal.TransferFunction do not appear to
> be composable. I would like to be able to compose two transfer functions
> with addition, or indicate the series combination of a TransferFunction
> object with a ZerosPolesGain object using the * operator.
>
> Is there a reason this feature has not been pursued? If not, I would be
> happy to work on it.
>
> thanks,
> Tobin
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210421/702116aa/attachment.html>

From tobi.schraink at gmail.com  Fri Apr 23 14:05:31 2021
From: tobi.schraink at gmail.com (Tobias Schraink)
Date: Fri, 23 Apr 2021 14:05:31 -0400
Subject: [SciPy-Dev] Improvement Suggestion for scipy.stats.spearmanr and by
 extension scipy.stats.mstats.spearmanr
Message-ID: <CANcjR1_9e3Yvz_gYq-FCMAUfwctHqTnbLgaVkUPJOBC_+F3L-w@mail.gmail.com>

*Intro:*
scipy.stats.spearmanr calculates the spearman correlation between two 1D
arrays, or when presented with 2D array(s), performs the same operation
pairwise on the comprising 1D arrays.
Currently, scipy.stats.spearmanr uses scipy.stats.mstats.spearmanr under
the hood which is where the issue arises.
When talking about matching non-NaN values between two arrays, consider
these two arrays:
[1,NaN]
[2,3]
position 0 is matching (since both arrays do not have a NaN) and position 1
is not matching.

*The Issue <https://github.com/scipy/scipy/issues/13900  >:*
When using scipy.stats.spearmanr with *nan_policy='omit'*, it will produce
the error *ValueError: The input must have at least 3 entries! * when
comparing two arrays which have exactly 1 matched pair of non-NaN values,
given that one of the arrays contains at least one NaN.
This becomes a problem when using spearmanr on large, sparse datasets where
either only aggressive NaN filtering or manual error-catching may prevent
this error.
According to the nan policy doc
<http://scipy.github.io/devdocs/dev/api-dev/nan_policy.html> with respect
to *nan_policy='omit'*:

*More generally, for functions that return a scalar, func(a,
nan_policy='omit') should behave the same as func(a[~np.isnan(a)]).*


*Suggested Improvement:*
I therefore suggest that scipy.stats.mstats.spearmanr and
scipy.stats.spearmanr be altered so as to return
*SpearmanrResult(correlation=nan,
pvalue=nan)* given two arrays that have exactly 1 matched pair of non-NaN
values. I have been corresponding with mdhaber who mentioned this would be
a difficult first issue for me to contribute, since it would *break
backwards compatibility*. He also pointed me to another issue
<https://github.com/scipy/scipy/issues/12241> that likely stems from the
same issue, resulting in inaccurate p-values and correlation values in
correlations involving arrays containing NaNs and arrays of all 0s.

*Asking for feedback:*
Are there reasons to prefer the error that is currently being raised?
Should scipy.stats.spearmanr instead produce an error for both the cases
with and without NaNs where only a single non-NaN value is matched between
two arrays, i.e. the following should also raise the error?
*spearmanr([1], [2], nan_policy='omit')*

Best,
Tobias Schraink
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210423/692f0e96/attachment.html>

From Albert_Steppi at hms.harvard.edu  Sun Apr 25 00:16:45 2021
From: Albert_Steppi at hms.harvard.edu (Steppi, Albert)
Date: Sun, 25 Apr 2021 04:16:45 +0000
Subject: [SciPy-Dev] Proposal to add Inverse of Log CDF of Normal
 Distribution to scipy.special
Message-ID: <BL0PR07MB49797D68F4E944F5F4C61CD0DA439@BL0PR07MB4979.namprd07.prod.outlook.com>

Hi,

I'm a software developer employed in an academic laboratory working primarily on research projects involving machine learning and biomedical text mining. I've been a longtime scipy user and am a big fan of your work. One of the leads of my team is working on a problem where it has become important to calculate z-scores associated to log p-values which can at times be very small. The naive solution of applying

scipy.special.ndtri(numpy.exp(log_p))

fails when log_p is less than approximately -745 due to underflow. I found a solution to this problem by inspecting the underlying C code to scipy.special.ndtri.and my team lead suggested I post an issue to see if there was any interest in adding an inverse of the log CDF of the normal distribution to scipy.  That issue can be found here https://github.com/scipy/scipy/issues/13923. (You can find the details of how the proposed implementation works there.)

It was pointed out that another user had previously posted an issue,
https://github.com/scipy/scipy/issues/11465, asking for the same function (among other things). 

If there's interest in adding an inverse to the log CDF of the normal distribution to scipy.special I can submit a PR in the next week or two. It's not clear to me what such a function should be called though. The related functions that currently exist in scipy special are called ndtr (CDF of normal distribution), log_ndtr (log of CDF of normal distribution), ndtri (inverse of CDF of normal distribution). log_ndtri is ambiguous, ndtri_exp is unambiguous and possibly acceptable. I've found that in the Julia Stats library this function is called norminvlogcdf, and the analogous functions for all distributions seem to follow the same naming scheme, https://github.com/JuliaStats/StatsFuns.jl/blob/master/src/distrs/norm.jl
(I've checked just now and it appears that the Julia function applies the same technique I propose. I wasn't previously aware of that.) Perhaps whatever name is chosen should be thought of as defining a standard for the names of any inverse log CDF functions that may be added in the future.

I encourage anyone interested in extended discussion to come to the comments section on the related issue https://github.com/scipy/scipy/issues/13923

Thanks,
Albert


Albert Steppi III, Ph.D.
Scientific Software Developer
Laboratory of Systems Pharmacology
Harvard Medical School


From evgeny.burovskiy at gmail.com  Sun Apr 25 03:33:52 2021
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Sun, 25 Apr 2021 10:33:52 +0300
Subject: [SciPy-Dev] numpy/npy_math.h vs C/C++ stdlib math.h/<cmath>
Message-ID: <CAMRo0iu7nyutQgW-OS+Y+3jcJw_FJDHntVg=UmyjP868N3L0Ow@mail.gmail.com>

Hi,

Prompted by Sturla's comment,
https://github.com/scipy/scipy/pull/13931#discussion_r619728364,
suggesting to change a call of `npy_log` into an `std::log`, here's a
question:  what is the recommendation going forward for the math
functions (log, exp, trig functions etc) in scipy C/C++ code: do we
import them from `numpy/npy_math.h` or do we rely on the standard
library?

My main concern is more exotic platforms/toolchains and related edge
cases--- do we rely on numpy or the standard library vendors?

`npy_math.h` still often needs to be included for e.g. numpy data
types, so it's not a question of avoiding an extra #include.

Cheers,

Evgeni

From Albert_Steppi at hms.harvard.edu  Sun Apr 25 12:15:46 2021
From: Albert_Steppi at hms.harvard.edu (Steppi, Albert)
Date: Sun, 25 Apr 2021 16:15:46 +0000
Subject: [SciPy-Dev] Proposal to add Inverse of Log CDF of Normal
 Distribution to scipy.special
In-Reply-To: <BL0PR07MB49797D68F4E944F5F4C61CD0DA439@BL0PR07MB4979.namprd07.prod.outlook.com>
References: <BL0PR07MB49797D68F4E944F5F4C61CD0DA439@BL0PR07MB4979.namprd07.prod.outlook.com>
Message-ID: <BL0PR07MB4979497C6AAD51CF22DFE7EDDA439@BL0PR07MB4979.namprd07.prod.outlook.com>

To summarize the state of things. The inverse of the log CDF of the normal distribution is potentially useful. There is a good approximation to it that is already being used in cephes to calculate the inverse of the normal CDF when the probability p is close to zero, it does so by taking the log of the argument and then applying the approximation. The inverse of the log CDF could be added to `scipy.special` with minimal effort by using the approximation in cephes `ndtri` directly for when `log_p` is sufficiently small, otherwise exponentiating and applying `ndtri`. I've written a Cython implementation that does just this and would like to know if there's interest in adding it to `scipy.special`. It's unclear to me what the function should be called though and whatever name is chosen may set a precedent if there was ever an interest in adding other inverse log CDF functions.

Thanks,
Albert

________________________________
From: Steppi, Albert
Sent: Sunday, April 25, 2021 12:16 AM
To: scipy-dev at python.org <scipy-dev at python.org>
Subject: Proposal to add Inverse of Log CDF of Normal Distribution to scipy.special

Hi,

I'm a software developer employed in an academic laboratory working primarily on research projects involving machine learning and biomedical text mining. I've been a longtime scipy user and am a big fan of your work. One of the leads of my team is working on a problem where it has become important to calculate z-scores associated to log p-values which can at times be very small. The naive solution of applying

scipy.special.ndtri(numpy.exp(log_p))

fails when log_p is less than approximately -745 due to underflow. I found a solution to this problem by inspecting the underlying C code to scipy.special.ndtri.and my team lead suggested I post an issue to see if there was any interest in adding an inverse of the log CDF of the normal distribution to scipy.  That issue can be found here https://github.com/scipy/scipy/issues/13923. (You can find the details of how the proposed implementation works there.)

It was pointed out that another user had previously posted an issue,
https://github.com/scipy/scipy/issues/11465, asking for the same function (among other things).

If there's interest in adding an inverse to the log CDF of the normal distribution to scipy.special I can submit a PR in the next week or two. It's not clear to me what such a function should be called though. The related functions that currently exist in scipy special are called ndtr (CDF of normal distribution), log_ndtr (log of CDF of normal distribution), ndtri (inverse of CDF of normal distribution). log_ndtri is ambiguous, ndtri_exp is unambiguous and possibly acceptable. I've found that in the Julia Stats library this function is called norminvlogcdf, and the analogous functions for all distributions seem to follow the same naming scheme, https://github.com/JuliaStats/StatsFuns.jl/blob/master/src/distrs/norm.jl
(I've checked just now and it appears that the Julia function applies the same technique I propose. I wasn't previously aware of that.) Perhaps whatever name is chosen should be thought of as defining a standard for the names of any inverse log CDF functions that may be added in the future.

I encourage anyone interested in extended discussion to come to the comments section on the related issue https://github.com/scipy/scipy/issues/13923

Thanks,
Albert


Albert Steppi III, Ph.D.
Scientific Software Developer
Laboratory of Systems Pharmacology
Harvard Medical School

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210425/74f4f359/attachment.html>

From ralf.gommers at gmail.com  Sun Apr 25 16:31:33 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 25 Apr 2021 22:31:33 +0200
Subject: [SciPy-Dev] Global Optimization Benchmarks
In-Reply-To: <CALj43ZT0FmczXdM6VJf-qTKJZQbNGpO6RMfBCDF86mBN3Jc7mg@mail.gmail.com>
References: <CAEf70byLCufw4rH0MwJ1GML2Jmj++rNoy07UverDb1vRMz8fFw@mail.gmail.com>
 <CALhqKPB11vJUW1rJv9dU-4qX5C=4Drvy2pZ_rK3oqy1yRe=VEw@mail.gmail.com>
 <CAEf70bxVGPHLwU7ej13BZ24yvQDurP2Dzdn8Upq5cAJjBVE1GQ@mail.gmail.com>
 <CALj43ZT0FmczXdM6VJf-qTKJZQbNGpO6RMfBCDF86mBN3Jc7mg@mail.gmail.com>
Message-ID: <CABL7CQgeykUpP5WZVZ0vTEknAL9F7Ax6ihGFRQi5e6_4cCkR+g@mail.gmail.com>

Hi Daniel,

Sorry for the extremely delayed reply.


On Thu, Mar 11, 2021 at 10:19 AM Daniel Schmitz <
danielschmitzsiegen at googlemail.com> wrote:

> Hey all,
>
> after Andrea's suggestions, I did an extensive github search and found
> several global optimization python libraries which mimic the scipy
> API, so that a user only has to change the import statements. Could it
> be useful to add a page in the documentation of these?
>

This does sound useful. Probably not a whole page, more like a note in the
`minimize` docstring with references I'd think.


> Non exhaustive list:
>
> DIRECT: https://github.com/andim/scipydirect
> DE/PSO/CMA-ES: https://github.com/keurfonluu/stochopy
> PSO: https://github.com/jerrytheo/psopy
> Powell's derivative free optimizers: https://www.pdfo.net/index.html
>
> As DIRECT was very competitive on some of Andrea's benchmarks, it
> could be useful to mimic the scipydirect repo for inclusion into scipy
> (MIT license). The code is unfortunately a f2py port of the original
> Fortran implementation which has hard coded bounds on the number of
> function evaluations (90.000) and iterations (6.000). Any opinions on
> this?
>

This sounds like a good idea. Would you mind opening a GitHub issue with
the feature request, so we keep track of this? Contacting the original
author about this would also be useful; if the author would like to
upstream their code, that'd be a good outcome.

Re hard coded bounds, I assume those can be removed again without too much
trouble.


>
> I personally am very impressed by biteopt's performance, and although
> it ranked very high in other global optimization benchmarks there is
> no formal paper on it yet. I understand the scipy guidelines in a way
> that such a paper is a requisite for inclusion into scipy.
>

Well, we do have the very extensive benchmarks - if it does indeed perform
better than what we have, then of course we'd be happy to add a new
algorithm even if it doesn't have a paper. We use papers and the citations
it has as an indication of usefulness only; anything that outperforms our
existing code is clearly useful.

Cheers,
Ralf


>
> Best,
>
> Daniel
>
> Daniel
>
>
> On Sun, 17 Jan 2021 at 14:33, Andrea Gavana <andrea.gavana at gmail.com>
> wrote:
> >
> > Hi Stefan,
> >
> >     You?re most welcome :-) . I?m happy the experts in the community are
> commenting and suggesting things, and constructive criticism is also always
> welcome.
> >
> > On Sun, 17 Jan 2021 at 12.11, Stefan Endres <stefan.c.endres at gmail.com>
> wrote:
> >>
> >> Dear Andrea,
> >>
> >> Thank you very much for this detailed analysis. I don't think I've seen
> such a large collection of benchmark test suites or collection of DFO
> algorithms since the publication by Rios and Sahinidis in 2013. Some
> questions:
> >>
> >> Many of the commercial algorithms offer free licenses for benchmarking
> problems of less than 10 dimensions. Would you be willing to include some
> of these in your benchmarks at some point? It would be a great reference to
> use.
> >
> >
> > I?m definitely willing to include those commercial algorithms. The test
> suite per se is almost completely automated, so it?s not that complicated
> to add one or more solvers. I?m generally more inclined in testing open
> source algorithms but there?s nothing stopping the inclusion of commercial
> ones.
> >
> > I welcome any suggestions related to commercial solvers, as long as they
> can run on Python 2 / Python 3 and on Windows (I might be able to setup a
> Linux virtual machine if absolutely needed but that would defy part of the
> purpose of the exercise - SHGO, Dual Annealing and the other SciPy solvers
> run on all platforms that support SciPy).
> >
> >> The collection of test suites you've garnered could be immensely useful
> for further algorithm development. Is there a possibility of releasing the
> code publicly (presumably after you've published the results in a journal)?
> >>
> >> In this case I would also like to volunteer to run some of the
> commercial solvers on the benchmark suite.
> >> It would also help to have a central repository for fixing bugs and
> adding lower global minima when they are found (of which there are quite
> few ).
> >
> >
> >
> > I?m still sorting out all the implications related to a potential paper
> with my employer, but as far as I can see there shouldn?t be any problem
> with that: assuming everything goes as it should, I will definitely push
> for making the code open source.
> >
> >
> >>
> >> Comments on shgo:
> >>
> >> High RAM use in higher dimensions:
> >>
> >> In the higher dimensions the new simplicial sampling can be used (not
> pushed to scipy yet; I still need to update some documentation before the
> PR). This alleviates, but does not eliminate the memory leak issue. As
> you've said SHGO is best suited to problems below 10 dimensions as any
> higher leaves the realm of DFO problems and starts to enter the domain of
> NLP problems. My personal preference in this case is to use the stochastic
> algorithms (basinhopping and differential evolution) on problems where it
> is known that a gradient based solver won't work.
> >>
> >> An exception to this "rule" is when special grey box information such
> as symmetry of the objective function (something that can be supplied to
> shgo to push the applicability of the algorithm up to ~100 variables) or
> pre-computed bounds on the Lipschitz constants is known.
> >>
> >> In the symmetry case SHGO can solve these by supplying the `symmetry`
> option (which was used in the previous benchmarks done by me for the JOGO
> publication, although I did not specifically check if performance was
> actually improved on those problems, but shgo did converge on all benchmark
> problems in the scipy test suite).
> >>
> >> I have had a few reports of memory leaks from various users. I have
> spoken to a few collaborators about the possibility of finding a Masters
> student to cythonize some of the code or otherwise improve it. Hopefully,
> this will happen in the summer semester of 2021.
> >
> >
> > To be honest I wouldn?t be so concerned in general: SHGO is an excellent
> global optimization algorithm and it consistently ranks at the top, no
> matter what problems you throw at it. Together with Dual Annealing, SciPy
> has gained two phenomenal nonlinear solvers and I?m very happy to see that
> SciPy is now at the cutting edge of the open source global optimization
> universe.
> >
> > Andrea.
> >
> >> Thank you again for compiling this large set of benchmark results.
> >>
> >> Best regards,
> >> Stefan
> >> On Fri, Jan 8, 2021 at 10:21 AM Andrea Gavana <andrea.gavana at gmail.com>
> wrote:
> >>>
> >>> Dear SciPy Developers & Users,
> >>>
> >>>     long time no see :-) . I thought to start 2021 with a bit of a
> bang, to try and forget how bad 2020 has been... So I am happy to present
> you with a revamped version of the Global Optimization Benchmarks from my
> previous exercise in 2013.
> >>>
> >>> This new set of benchmarks pretty much superseeds - and greatly
> expands - the previous analysis that you can find at this location:
> http://infinity77.net/global_optimization/ .
> >>>
> >>> The approach I have taken this time is to select as many benchmark
> test suites as possible: most of them are characterized by test function
> generators, from which we can actually create almost an unlimited number of
> unique test problems. Biggest news are:
> >>>
> >>> This whole exercise is made up of 6,825 test problems divided across
> 16 different test suites: most of these problems are of low dimensionality
> (2 to 6 variables) with a few benchmarks extending to 9+ variables. With
> all the sensitivities performed during this exercise on those benchmarks,
> the overall grand total number of functions evaluations stands at
> 3,859,786,025 - close to 4 billion. Not bad.
> >>> A couple of "new" optimization algorithms I have ported to Python:
> >>>
> >>> MCS: Multilevel Coordinate Search, it?s my translation to Python of
> the original Matlab code from A. Neumaier and W. Huyer (giving then for
> free also GLS and MINQ) I have added a few, minor improvements compared to
> the original implementation.
> >>> BiteOpt: BITmask Evolution OPTimization , I have converted the C++
> code into Python and added a few, minor modifications.
> >>>
> >>>
> >>> Enough chatting for now. The 13 tested algorithms are described here:
> >>>
> >>> http://infinity77.net/go_2021/
> >>>
> >>> High level description & results of the 16 benchmarks:
> >>>
> >>> http://infinity77.net/go_2021/thebenchmarks.html
> >>>
> >>> Each benchmark test suite has its own dedicated page, with more
> detailed results and sensitivities.
> >>>
> >>> List of tested algorithms:
> >>>
> >>> AMPGO: Adaptive Memory Programming for Global Optimization: this is my
> Python implementation of the algorithm described here:
> >>>
> >>>
> http://leeds-faculty.colorado.edu/glover/fred%20pubs/416%20-%20AMP%20(TS)%20for%20Constrained%20Global%20Opt%20w%20Lasdon%20et%20al%20.pdf
> >>>
> >>> I have added a few improvements here and there based on my Master
> Thesis work on the standard Tunnelling Algorithm of Levy, Montalvo and
> Gomez. After AMPGO was integrated in lmfit, I have improved it even more -
> in my opinion.
> >>>
> >>> BasinHopping: Basin hopping is a random algorithm which attempts to
> find the global minimum of a smooth scalar function of one or more
> variables. The algorithm was originally described by David Wales:
> >>>
> >>> http://www-wales.ch.cam.ac.uk/
> >>>
> >>> BasinHopping is now part of the standard SciPy distribution.
> >>>
> >>> BiteOpt: BITmask Evolution OPTimization, based on the algorithm
> presented in this GitHub link:
> >>>
> >>> https://github.com/avaneev/biteopt
> >>>
> >>> I have converted the C++ code into Python and added a few, minor
> modifications.
> >>>
> >>> CMA-ES: Covariance Matrix Adaptation Evolution Strategy, based on the
> following algorithm:
> >>>
> >>> http://www.lri.fr/~hansen/cmaesintro.html
> >>>
> >>> http://www.lri.fr/~hansen/cmaes_inmatlab.html#python (Python code for
> the algorithm)
> >>>
> >>> CRS2: Controlled Random Search with Local Mutation, as implemented in
> the NLOpt package:
> >>>
> >>>
> http://ab-initio.mit.edu/wiki/index.php/NLopt_Algorithms#Controlled_Random_Search_.28CRS.29_with_local_mutation
> >>>
> >>> DE: Differential Evolution, described in the following page:
> >>>
> >>> http://www1.icsi.berkeley.edu/~storn/code.html
> >>>
> >>> DE is now part of the standard SciPy distribution, and I have taken
> the implementation as it stands in SciPy:
> >>>
> >>>
> https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html#scipy.optimize.differential_evolution
> >>>
> >>> DIRECT: the DIviding RECTangles procedure, described in:
> >>>
> >>>
> https://www.tol-project.org/export/2776/tolp/OfficialTolArchiveNetwork/NonLinGloOpt/doc/DIRECT_Lipschitzian%20optimization%20without%20the%20lipschitz%20constant.pdf
> >>>
> >>>
> http://ab-initio.mit.edu/wiki/index.php/NLopt_Algorithms#DIRECT_and_DIRECT-L
> (Python code for the algorithm)
> >>>
> >>> DualAnnealing: the Dual Annealing algorithm, taken directly from the
> SciPy implementation:
> >>>
> >>>
> https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.dual_annealing.html#scipy.optimize.dual_annealing
> >>>
> >>> LeapFrog: the Leap Frog procedure, which I have been recommended for
> use, taken from:
> >>>
> >>> https://github.com/flythereddflagg/lpfgopt
> >>>
> >>> MCS: Multilevel Coordinate Search, it?s my translation to Python of
> the original Matlab code from A. Neumaier and W. Huyer (giving then for
> free also GLS and MINQ):
> >>>
> >>> https://www.mat.univie.ac.at/~neum/software/mcs/
> >>>
> >>> I have added a few, minor improvements compared to the original
> implementation. See the MCS section for a quick and dirty comparison
> between the Matlab code and my Python conversion.
> >>>
> >>> PSWARM: Particle Swarm optimization algorithm, it has been described
> in many online papers. I have used a compiled version of the C source code
> from:
> >>>
> >>> http://www.norg.uminho.pt/aivaz/pswarm/
> >>>
> >>> SCE: Shuffled Complex Evolution, described in:
> >>>
> >>> Duan, Q., S. Sorooshian, and V. Gupta, Effective and efficient global
> optimization for conceptual rainfall-runoff models, Water Resour. Res., 28,
> 1015-1031, 1992.
> >>>
> >>> The version I used was graciously made available by Matthias Cuntz via
> a personal e-mail.
> >>>
> >>> SHGO: Simplicial Homology Global Optimization, taken directly from the
> SciPy implementation:
> >>>
> >>>
> https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.shgo.html#scipy.optimize.shgo
> >>>
> >>>
> >>> List of benchmark test suites:
> >>>
> >>> SciPy Extended: 235 multivariate problems (where the number of
> independent variables ranges from 2 to 17), again with multiple
> local/global minima.
> >>>
> >>> I have added about 40 new functions to the standard SciPy benchmarks
> and fixed a few bugs in the existing benchmark models in the SciPy
> repository.
> >>>
> >>> GKLS: 1,500 test functions, with dimensionality varying from 2 to 6,
> generated with the super famous GKLS Test Functions Generator. I have taken
> the original C code (available at http://netlib.org/toms/) and converted
> it to Python.
> >>>
> >>> GlobOpt: 288 tough problems, with dimensionality varying from 2 to 5,
> created with another test function generator which I arbitrarily named
> ?GlobOpt?:
> https://www.researchgate.net/publication/225566516_A_new_class_of_test_functions_for_global_optimization
> . The original code is in C++ and I have bridged it to Python using Cython.
> >>>
> >>> Many thanks go to Professor Marco Locatelli for providing an updated
> copy of the C++ source code.
> >>>
> >>> MMTFG: sort-of an acronym for ?Multi-Modal Test Function with multiple
> Global minima?, this test suite implements the work of Jani Ronkkonen:
> https://www.researchgate.net/publication/220265526_A_Generator_for_Multimodal_Test_Functions_with_Multiple_Global_Optima
> . It contains 981 test problems with dimensionality varying from 2 to 4.
> The original code is in C and I have bridge it to Python using Cython.
> >>>
> >>> GOTPY: a generator of benchmark functions using the Bocharov-Feldbaum
> ?Method-Min?, containing 400 test problems with dimensionality varying from
> 2 to 5. I have taken the Python implementation from
> https://github.com/redb0/gotpy and improved it in terms of runtime.
> >>>
> >>> Original paper from
> http://www.mathnet.ru/php/archive.phtml?wshow=paper&jrnid=at&paperid=11985&option_lang=eng
> .
> >>>
> >>> Huygens: this benchmark suite is very different from the rest, as it
> uses a ?fractal? approach to generate test functions. It is based on the
> work of Cara MacNish on Fractal Functions. The original code is in Java,
> and at the beginning I just converted it to Python: given it was slow as a
> turtle, I have re-implemented it in Fortran and wrapped it using f2py, then
> generating 600 2-dimensional test problems out of it.
> >>>
> >>> LGMVG: not sure about the meaning of the acronym, but the
> implementation follows the ?Max-Set of Gaussians Landscape Generator?
> described in http://boyuan.global-optimization.com/LGMVG/index.htm .
> Source code is given in Matlab, but it?s fairly easy to convert it to
> Python. This test suite contains 304 problems with dimensionality varying
> from 2 to 5.
> >>>
> >>> NgLi: Stemming from the work of Chi-Kong Ng and Duan Li, this is a
> test problem generator for unconstrained optimization, but it?s fairly easy
> to assign bound constraints to it. The methodology is described in
> https://www.sciencedirect.com/science/article/pii/S0305054814001774 ,
> while the Matlab source code can be found in
> http://www1.se.cuhk.edu.hk/~ckng/generator/ . I have used the Matlab
> script to generate 240 problems with dimensionality varying from 2 to 5 by
> outputting the generator parameters in text files, then used Python to
> create the objective functions based on those parameters and the benchmark
> methodology.
> >>>
> >>> MPM2: Implementing the ?Multiple Peaks Model 2?, there is a Python
> implementation at
> https://github.com/jakobbossek/smoof/blob/master/inst/mpm2.py . This is a
> test problem generator also used in the smoof library, I have taken the
> code almost as is and generated 480 benchmark functions with dimensionality
> varying from 2 to 5.
> >>>
> >>> RandomFields: as described in
> https://www.researchgate.net/publication/301940420_Global_optimization_test_problems_based_on_random_field_composition
> , it generates benchmark functions by ?smoothing? one or more
> multidimensional discrete random fields and composing them. No source code
> is given, but the implementation is fairly straightforward from the article
> itself.
> >>>
> >>> NIST: not exactly the realm of Global Optimization solvers, but the
> NIST StRD dataset can be used to generate a single objective function as
> ?sum of squares?. I have used the NIST dataset as implemented in lmfit,
> thus creating 27 test problems with dimensionality ranging from 2 to 9.
> >>>
> >>> GlobalLib: Arnold Neumaier maintains a suite of test problems termed
> ?COCONUT Benchmark? and Sahinidis has converted the GlobalLib and
> PricentonLib AMPL/GAMS dataset into C/Fortran code (
> http://archimedes.cheme.cmu.edu/?q=dfocomp ). I have used a simple C
> parser to convert the benchmarks from C to Python.
> >>>
> >>> The global minima are taken from Sahinidis or from Neumaier or refined
> using the NEOS server when the accuracy of the reported minima is too low.
> The suite contains 181 test functions with dimensionality varying between 2
> and 9.
> >>>
> >>> CVMG: another ?landscape generator?, I had to dig it out using the
> Wayback Machine at
> http://web.archive.org/web/20100612044104/https://www.cs.uwyo.edu/~wspears/multi.kennedy.html
> , the acronym stands for ?Continuous Valued Multimodality Generator?.
> Source code is in C++ but it?s fairly easy to port it to Python. In
> addition to the original implementation (that uses the Sigmoid as a
> softmax/transformation function) I have added a few others to create varied
> landscapes. 360 test problems have been generated, with dimensionality
> ranging from 2 to 5.
> >>>
> >>> NLSE: again, not really the realm of Global optimization solvers, but
> Nonlinear Systems of Equations can be transformed to single objective
> functions to optimize. I have drawn from many different sources
> (Publications, ALIAS/COPRIN and many others) to create 44 systems of
> nonlinear equations with dimensionality ranging from 2 to 8.
> >>>
> >>> Schoen: based on the early work of Fabio Schoen and his short note on
> a simple but interesting idea on a test function generator, I have taken
> the C code in the note and converted it into Python, thus creating 285
> benchmark functions with dimensionality ranging from 2 to 6.
> >>>
> >>> Many thanks go to Professor Fabio Schoen for providing an updated copy
> of the source code and for the email communications.
> >>>
> >>> Robust: the last benchmark test suite for this exercise, it is
> actually composed of 5 different kind-of analytical test function
> generators, containing deceptive, multimodal, flat functions depending on
> the settings. Matlab source code is available at
> http://www.alimirjalili.com/RO.html , I simply converted it to Python and
> created 420 benchmark functions with dimensionality ranging from 2 to 6.
> >>>
> >>>
> >>> Enjoy, and Happy 2021 :-) .
> >>>
> >>>
> >>> Andrea.
> >>>
> >>> _______________________________________________
> >>>
> >>>
> >>> SciPy-Dev mailing list
> >>> SciPy-Dev at python.org
> >>> https://mail.python.org/mailman/listinfo/scipy-dev
> >>
> >>
> >>
> >> --
> >> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)
> >>
> >> Wissenchaftlicher Mitarbeiter: Leibniz Institute for Materials
> Engineering IWT, Badgasteiner Stra?e 3, 28359 Bremen, Germany
> >> Work phone (DE): +49 (0) 421 218 51238
> >> Cellphone (DE): +49 (0) 160 949 86417
> >> Cellphone (ZA): +27 (0) 82 972 42 89
> >> E-mail (work): s.endres at iwt.uni-bremen.de
> >> Website: https://stefan-endres.github.io/
> >> _______________________________________________
> >> SciPy-Dev mailing list
> >> SciPy-Dev at python.org
> >> https://mail.python.org/mailman/listinfo/scipy-dev
> >
> > _______________________________________________
> > SciPy-Dev mailing list
> > SciPy-Dev at python.org
> > https://mail.python.org/mailman/listinfo/scipy-dev
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210425/59ef0b04/attachment-0001.html>

From tyler.je.reddy at gmail.com  Sun Apr 25 22:19:08 2021
From: tyler.je.reddy at gmail.com (Tyler Reddy)
Date: Sun, 25 Apr 2021 20:19:08 -0600
Subject: [SciPy-Dev] ANN: SciPy 1.6.3
Message-ID: <CAHPuU_a2Qe4sMt8=-B3DWAXq5Dri6wZ-xoNXbw7+4DwOxa1JPQ@mail.gmail.com>

Hi all,

On behalf of the SciPy development team I'm pleased to announce
the release of SciPy 1.6.3, which is a bug fix release.

Sources and binary wheels can be found at:
https://pypi.org/project/scipy/
and at: https://github.com/scipy/scipy/releases/tag/v1.6.3
<https://github.com/scipy/scipy/releases/tag/v1.6.1>

One of a few ways to install this release with pip:

pip install scipy==1.6.3

=====================
SciPy 1.6.3 Release Notes
=====================

SciPy 1.6.3 is a bug-fix release with no new features
compared to 1.6.2.

Authors
======

* Peter Bell
* Ralf Gommers
* Matt Haberland
* Peter Mahler Larsen
* Tirth Patel
* Tyler Reddy
* Pamphile ROY +
* Xingyu Liu +

A total of 8 people contributed to this release.
People with a "+" by their names contributed a patch for the first time.
This list of names is automatically generated, and may not be fully
complete.

Issues closed for 1.6.3
-------------------------------

* `#13772 <https://github.com/scipy/scipy/issues/13772>`__: Divide by zero
in distance.yule
* `#13796 <https://github.com/scipy/scipy/issues/13796>`__: CI:
prerelease_deps failures
* `#13890 <https://github.com/scipy/scipy/issues/13890>`__: TST: spatial
rotation failure in (1.6.3) wheels repo (ARM64)


Pull requests for 1.6.3
------------------------------

* `#13755 <https://github.com/scipy/scipy/pull/13755>`__: CI: fix the
matplotlib warning emitted during builing docs
* `#13773 <https://github.com/scipy/scipy/pull/13773>`__: BUG: Divide by
zero in yule dissimilarity of constant vectors
* `#13799 <https://github.com/scipy/scipy/pull/13799>`__: CI/MAINT:
deprecated np.typeDict
* `#13819 <https://github.com/scipy/scipy/pull/13819>`__: substitute
np.math.factorial with math.factorial
* `#13895 <https://github.com/scipy/scipy/pull/13895>`__: TST: add random
seeds in Rotation module

Checksums
=========

MD5
~~~

3b75d493f6c93b662f927d6c2ac60053
 scipy-1.6.3-cp37-cp37m-macosx_10_9_x86_64.whl
ada6fa32f066dc58033ab47a4fbcd208  scipy-1.6.3-cp37-cp37m-manylinux1_i686.whl
29c6d6edbe9c2ba17dc4edd89ed31abb
 scipy-1.6.3-cp37-cp37m-manylinux1_x86_64.whl
cde2f4824337fda5b2522795fff2135d
 scipy-1.6.3-cp37-cp37m-manylinux2014_aarch64.whl
27fa4101babcfa7928e4c23e33212e71  scipy-1.6.3-cp37-cp37m-win32.whl
b0ef1b6ba4d3b54ad8f17dd10156c3f8  scipy-1.6.3-cp37-cp37m-win_amd64.whl
66b3ea5bb7869af010ae9d6a5015360b
 scipy-1.6.3-cp38-cp38-macosx_10_9_x86_64.whl
0ef8582f203fdd4afe1ca56fc5b309f5  scipy-1.6.3-cp38-cp38-manylinux1_i686.whl
1194e9d88ef98595619acd657439fae4
 scipy-1.6.3-cp38-cp38-manylinux1_x86_64.whl
4b8fe1b85fd8d60bbc787a06c0d545ee
 scipy-1.6.3-cp38-cp38-manylinux2014_aarch64.whl
1d55aff23261c2fa8d7882bc7220c25d  scipy-1.6.3-cp38-cp38-win32.whl
e82a80c799f1e5190be0cbd25b554766  scipy-1.6.3-cp38-cp38-win_amd64.whl
8309d581e025539bafaa98ba9c3122a7
 scipy-1.6.3-cp39-cp39-macosx_10_9_x86_64.whl
75a9ba6865ff2fefcb5df917a412d99d  scipy-1.6.3-cp39-cp39-manylinux1_i686.whl
fbfc53ba0ea23f56afb62c2a159e6303
 scipy-1.6.3-cp39-cp39-manylinux1_x86_64.whl
4076d3cdb2b16f7d64229ac76c945b6f
 scipy-1.6.3-cp39-cp39-manylinux2014_aarch64.whl
bd553432d6a55a6e139f76951766f31d  scipy-1.6.3-cp39-cp39-win32.whl
6b9526aca2adf7b062ac92f184328147  scipy-1.6.3-cp39-cp39-win_amd64.whl
05b4ca400aa1157290abff69016f1cab  scipy-1.6.3.tar.gz
cef7aa950dfc7f5be79b8c630bfe0cc1  scipy-1.6.3.tar.xz
f13ea8ab38fb03c17e38c3cda7877972  scipy-1.6.3.zip

SHA256
~~~~~~

2a799714bf1f791fb2650d73222b248d18d53fd40d6af2df2c898db048189606
 scipy-1.6.3-cp37-cp37m-macosx_10_9_x86_64.whl
9e3302149a369697c6aaea18b430b216e3c88f9a61b62869f6104881e5f9ef85
 scipy-1.6.3-cp37-cp37m-manylinux1_i686.whl
b79104878003487e2b4639a20b9092b02e1bad07fc4cf924b495cf413748a777
 scipy-1.6.3-cp37-cp37m-manylinux1_x86_64.whl
44d452850f77e65e25b1eb1ac01e25770323a782bfe3a1a3e43847ad4266d93d
 scipy-1.6.3-cp37-cp37m-manylinux2014_aarch64.whl
b30280fbc1fd8082ac822994a98632111810311a9ece71a0e48f739df3c555a2
 scipy-1.6.3-cp37-cp37m-win32.whl
10dbcc7de03b8d635a1031cb18fd3eaa997969b64fdf78f99f19ac163a825445
 scipy-1.6.3-cp37-cp37m-win_amd64.whl
1b21c6e0dc97b1762590b70dee0daddb291271be0580384d39f02c480b78290a
 scipy-1.6.3-cp38-cp38-macosx_10_9_x86_64.whl
1caade0ede6967cc675e235c41451f9fb89ae34319ddf4740194094ab736b88d
 scipy-1.6.3-cp38-cp38-manylinux1_i686.whl
19aeac1ad3e57338723f4657ac8520f41714804568f2e30bd547d684d72c392e
 scipy-1.6.3-cp38-cp38-manylinux1_x86_64.whl
ad7269254de06743fb4768f658753de47d8b54e4672c5ebe8612a007a088bd48
 scipy-1.6.3-cp38-cp38-manylinux2014_aarch64.whl
d647757373985207af3343301d89fe738d5a294435a4f2aafb04c13b4388c896
 scipy-1.6.3-cp38-cp38-win32.whl
33d1677d46111cfa1c84b87472a0274dde9ef4a7ef2e1f155f012f5f1e995d8f
 scipy-1.6.3-cp38-cp38-win_amd64.whl
d449d40e830366b4c612692ad19fbebb722b6b847f78a7b701b1e0d6cda3cc13
 scipy-1.6.3-cp39-cp39-macosx_10_9_x86_64.whl
23995dfcf269ec3735e5a8c80cfceaf384369a47699df111a6246b83a55da582
 scipy-1.6.3-cp39-cp39-manylinux1_i686.whl
fdf606341cd798530b05705c87779606fcdfaf768a8129c348ea94441da15b04
 scipy-1.6.3-cp39-cp39-manylinux1_x86_64.whl
f68eb46b86b2c246af99fcaa6f6e37c7a7a413e1084a794990b877f2ff71f7b6
 scipy-1.6.3-cp39-cp39-manylinux2014_aarch64.whl
01b38dec7e9f897d4db04f8de4e20f0f5be3feac98468188a0f47a991b796055
 scipy-1.6.3-cp39-cp39-win32.whl
3274ce145b5dc416c49c0cf8b6119f787f0965cd35e22058fe1932c09fe15d77
 scipy-1.6.3-cp39-cp39-win_amd64.whl
a75b014d3294fce26852a9d04ea27b5671d86736beb34acdfc05859246260707
 scipy-1.6.3.tar.gz
3851fdcb1e6877241c3377aa971c85af0d44f90c57f4dd4e54e1b2bbd742635e
 scipy-1.6.3.tar.xz
0a723da627af61665c7793fdf869ad2606be9cf32e2c0abc0d230f62c4f4914f
 scipy-1.6.3.zip
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210425/355fa94c/attachment.html>

From warren.weckesser at gmail.com  Wed Apr 28 15:08:38 2021
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Wed, 28 Apr 2021 15:08:38 -0400
Subject: [SciPy-Dev] Can a PR be configured to run
Message-ID: <CAGzF1ufp84sdFoC_LMnTGVEyOJ6f3qL4um1f8kcX1NySi-bTRw@mail.gmail.com>


From warren.weckesser at gmail.com  Wed Apr 28 15:17:26 2021
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Wed, 28 Apr 2021 15:17:26 -0400
Subject: [SciPy-Dev] Can a PR be configured to run only the build of the
 docs?
Message-ID: <CAGzF1ueX1yz0mDBnhNDirQr2cQYyRVPWhjEoak86=Dv1oz4FUg@mail.gmail.com>

I am working on an addition to the docs that only touches one existing
.rst file and adds one new .rst file.    For the initial pull request,
I want to run just CircleCI and generate the CircleCI docs artifact. I
haven't followed all the changes to how CI is currently configured and
run.  Is there text that I can add to the commit message (or some
other method) so that only the CircleCI jobs are run?

Warren

From ralf.gommers at gmail.com  Wed Apr 28 17:17:50 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 28 Apr 2021 23:17:50 +0200
Subject: [SciPy-Dev] Can a PR be configured to run only the build of the
 docs?
In-Reply-To: <CAGzF1ueX1yz0mDBnhNDirQr2cQYyRVPWhjEoak86=Dv1oz4FUg@mail.gmail.com>
References: <CAGzF1ueX1yz0mDBnhNDirQr2cQYyRVPWhjEoak86=Dv1oz4FUg@mail.gmail.com>
Message-ID: <CABL7CQiLpYXiGmiBT5AOdAPK3Y6_h40L6529dbwdnTypvFpBaQ@mail.gmail.com>

On Wed, Apr 28, 2021 at 9:18 PM Warren Weckesser <warren.weckesser at gmail.com>
wrote:

> I am working on an addition to the docs that only touches one existing
> .rst file and adds one new .rst file.    For the initial pull request,
> I want to run just CircleCI and generate the CircleCI docs artifact. I
> haven't followed all the changes to how CI is currently configured and
> run.  Is there text that I can add to the commit message (or some
> other method) so that only the CircleCI jobs are run?
>

Not all providers support this, Azure is the worst. Travis CI has `[skip
travis]`  and for GH Actions we use a custom `[skip github]` inside our
action yaml files (I think it works, but not 100% sure - let us know if you
try it).

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210428/91bb508f/attachment.html>

From andyfaff at gmail.com  Wed Apr 28 17:44:26 2021
From: andyfaff at gmail.com (Andrew Nelson)
Date: Thu, 29 Apr 2021 07:44:26 +1000
Subject: [SciPy-Dev] GitHub actions restrictions
Message-ID: <CAAbtOZd-QT+Md0zpiky3F=Z-VCNLFmMX5snBx337bKrwJ1KBEQ@mail.gmail.com>

https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks

>From the GH website:
When a first-time contributor submits a pull request to a public
repository, a maintainer with write access must approve any workflow runs.

The purpose behind this is so that things like mining for Bitcoin is
prevented.

A.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210429/2109d597/attachment.html>

From warren.weckesser at gmail.com  Wed Apr 28 18:29:27 2021
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Wed, 28 Apr 2021 18:29:27 -0400
Subject: [SciPy-Dev] Can a PR be configured to run only the build of the
 docs?
In-Reply-To: <CABL7CQiLpYXiGmiBT5AOdAPK3Y6_h40L6529dbwdnTypvFpBaQ@mail.gmail.com>
References: <CAGzF1ueX1yz0mDBnhNDirQr2cQYyRVPWhjEoak86=Dv1oz4FUg@mail.gmail.com>
 <CABL7CQiLpYXiGmiBT5AOdAPK3Y6_h40L6529dbwdnTypvFpBaQ@mail.gmail.com>
Message-ID: <CAGzF1ueWGbSTm1MK-Z_Kso9LvRRC2Viqgbrm=rhg32wDs9=3Mg@mail.gmail.com>

On 4/28/21, Ralf Gommers <ralf.gommers at gmail.com> wrote:
> On Wed, Apr 28, 2021 at 9:18 PM Warren Weckesser
> <warren.weckesser at gmail.com>
> wrote:
>
>> I am working on an addition to the docs that only touches one existing
>> .rst file and adds one new .rst file.    For the initial pull request,
>> I want to run just CircleCI and generate the CircleCI docs artifact. I
>> haven't followed all the changes to how CI is currently configured and
>> run.  Is there text that I can add to the commit message (or some
>> other method) so that only the CircleCI jobs are run?
>>
>
> Not all providers support this, Azure is the worst. Travis CI has `[skip
> travis]`  and for GH Actions we use a custom `[skip github]` inside our
> action yaml files (I think it works, but not 100% sure - let us know if you
> try it).

Thanks Ralf.  I used [skip travis] and [skip actions] to skip Travis
and GitHub Actions, resp. (see
https://github.blog/changelog/2021-02-08-github-actions-skip-pull-request-and-push-workflows-with-skip-ci/).

I found https://github.com/Microsoft/azure-pipelines-agent/issues/1270,
and immediately tried [skip azurepipelines] when I saw that in the
thread.  If I had continued reading, I would have seen the comments
(including yours!) about this not working in PRs.  So instead, I added
a temporary commit to the PR that simply removes the file
'azure-pipelines.yml'.

Warren

>
> Cheers,
> Ralf
>

From warren.weckesser at gmail.com  Wed Apr 28 19:11:58 2021
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Wed, 28 Apr 2021 19:11:58 -0400
Subject: [SciPy-Dev] Can a PR be configured to run only the build of the
 docs?
In-Reply-To: <CAGzF1ueWGbSTm1MK-Z_Kso9LvRRC2Viqgbrm=rhg32wDs9=3Mg@mail.gmail.com>
References: <CAGzF1ueX1yz0mDBnhNDirQr2cQYyRVPWhjEoak86=Dv1oz4FUg@mail.gmail.com>
 <CABL7CQiLpYXiGmiBT5AOdAPK3Y6_h40L6529dbwdnTypvFpBaQ@mail.gmail.com>
 <CAGzF1ueWGbSTm1MK-Z_Kso9LvRRC2Viqgbrm=rhg32wDs9=3Mg@mail.gmail.com>
Message-ID: <CAGzF1ueu1z7Mr3YLjci1ohMhz+85X1BeL1N9X3a05ZRqVRfiBw@mail.gmail.com>

On 4/28/21, Warren Weckesser <warren.weckesser at gmail.com> wrote:
> On 4/28/21, Ralf Gommers <ralf.gommers at gmail.com> wrote:
>> On Wed, Apr 28, 2021 at 9:18 PM Warren Weckesser
>> <warren.weckesser at gmail.com>
>> wrote:
>>
>>> I am working on an addition to the docs that only touches one existing
>>> .rst file and adds one new .rst file.    For the initial pull request,
>>> I want to run just CircleCI and generate the CircleCI docs artifact. I
>>> haven't followed all the changes to how CI is currently configured and
>>> run.  Is there text that I can add to the commit message (or some
>>> other method) so that only the CircleCI jobs are run?
>>>
>>
>> Not all providers support this, Azure is the worst. Travis CI has `[skip
>> travis]`  and for GH Actions we use a custom `[skip github]` inside our
>> action yaml files (I think it works, but not 100% sure - let us know if
>> you
>> try it).
>
> Thanks Ralf.  I used [skip travis] and [skip actions] to skip Travis
> and GitHub Actions, resp. (see
> https://github.blog/changelog/2021-02-08-github-actions-skip-pull-request-and-push-workflows-with-skip-ci/).
>
> I found https://github.com/Microsoft/azure-pipelines-agent/issues/1270,
> and immediately tried [skip azurepipelines] when I saw that in the
> thread.  If I had continued reading, I would have seen the comments
> (including yours!) about this not working in PRs.  So instead, I added
> a temporary commit to the PR that simply removes the file
> 'azure-pipelines.yml'.

By the way, the PR is https://github.com/scipy/scipy/pull/13955.  It
proposes some new coding and docstring guidelines.  It is marked
"draft" because of the commit that removes the Azure pipelines
configuration file--it should definitely not be merged as is!--but it
is actually ready for comments.  The motivation for the new guidelines
is given in the description of the PR.  A direct link to the proposed
guidelines on CircleCI is
https://29996-1460385-gh.circle-artifacts.com/0/html-scipyorg/dev/missing-bits.html

Warren

>
> Warren
>
>>
>> Cheers,
>> Ralf
>>
>

From larson.eric.d at gmail.com  Wed Apr 28 19:28:30 2021
From: larson.eric.d at gmail.com (Eric Larson)
Date: Wed, 28 Apr 2021 19:28:30 -0400
Subject: [SciPy-Dev] Can a PR be configured to run only the build of the
 docs?
In-Reply-To: <CAGzF1ueu1z7Mr3YLjci1ohMhz+85X1BeL1N9X3a05ZRqVRfiBw@mail.gmail.com>
References: <CAGzF1ueX1yz0mDBnhNDirQr2cQYyRVPWhjEoak86=Dv1oz4FUg@mail.gmail.com>
 <CABL7CQiLpYXiGmiBT5AOdAPK3Y6_h40L6529dbwdnTypvFpBaQ@mail.gmail.com>
 <CAGzF1ueWGbSTm1MK-Z_Kso9LvRRC2Viqgbrm=rhg32wDs9=3Mg@mail.gmail.com>
 <CAGzF1ueu1z7Mr3YLjci1ohMhz+85X1BeL1N9X3a05ZRqVRfiBw@mail.gmail.com>
Message-ID: <CAGu2niU1HMckaTyAKLaNj5-bC9L-uv3v2Ain9sBhv=YQn9C=PQ@mail.gmail.com>

>
>  Is there text that I can add to the commit message (or some other method)
> so that only the CircleCI jobs are run?
>

As of 22 days ago you can use [skip azp] (among other options) in your
commit message to skip the Azure pipelines run:

https://github.com/scipy/scipy/pull/13811

Feel free in your PR to add [skip azurepipelines] to the list of possible
skip messages that's looked for.

Eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210428/131b6cef/attachment.html>

From ilhanpolat at gmail.com  Thu Apr 29 01:33:21 2021
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Thu, 29 Apr 2021 07:33:21 +0200
Subject: [SciPy-Dev] GitHub actions restrictions
In-Reply-To: <CAAbtOZd-QT+Md0zpiky3F=Z-VCNLFmMX5snBx337bKrwJ1KBEQ@mail.gmail.com>
References: <CAAbtOZd-QT+Md0zpiky3F=Z-VCNLFmMX5snBx337bKrwJ1KBEQ@mail.gmail.com>
Message-ID: <CAEBuzr8259sbbpiRHFOqarvOu0hGesqCpZCudh1-KsAmg26aTw@mail.gmail.com>

There are also a lot of complaints from OS maintainers to GitHub in the
private OSS Feedback Group. It is indeed tiring for large and active
projects to go through and click every PR. TravisCI took the path to kick
out OS to combat and GH is doing this basically pushing the burden on the
OS maintainers. A proposal was given to restrict this to cases in which
only if the pull request touches sensitive files .github or .sh-alike
however GH has not commented yet. I am not sure if that is going to be
sufficient to block malicious code executions though.


On Wed, Apr 28, 2021 at 11:45 PM Andrew Nelson <andyfaff at gmail.com> wrote:

>
> https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks
>
> From the GH website:
> When a first-time contributor submits a pull request to a public
> repository, a maintainer with write access must approve any workflow runs.
>
> The purpose behind this is so that things like mining for Bitcoin is
> prevented.
>
> A.
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210429/1ddef66f/attachment.html>

From ralf.gommers at gmail.com  Thu Apr 29 03:53:56 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Thu, 29 Apr 2021 09:53:56 +0200
Subject: [SciPy-Dev] GitHub actions restrictions
In-Reply-To: <CAEBuzr8259sbbpiRHFOqarvOu0hGesqCpZCudh1-KsAmg26aTw@mail.gmail.com>
References: <CAAbtOZd-QT+Md0zpiky3F=Z-VCNLFmMX5snBx337bKrwJ1KBEQ@mail.gmail.com>
 <CAEBuzr8259sbbpiRHFOqarvOu0hGesqCpZCudh1-KsAmg26aTw@mail.gmail.com>
Message-ID: <CABL7CQh6XmT63-h6R3j8sunH1498a4aXAAdtWXwD_aSuEmubgQ@mail.gmail.com>

On Thu, Apr 29, 2021 at 7:33 AM Ilhan Polat <ilhanpolat at gmail.com> wrote:

> There are also a lot of complaints from OS maintainers to GitHub in the
> private OSS Feedback Group. It is indeed tiring for large and active
> projects to go through and click every PR. TravisCI took the path to kick
> out OS to combat and GH is doing this basically pushing the burden on the
> OS maintainers. A proposal was given to restrict this to cases in which
> only if the pull request touches sensitive files .github or .sh-alike
> however GH has not commented yet. I am not sure if that is going to be
> sufficient to block malicious code executions though.
>

I don't think it's too bad. What GitHub does now seems sensible, and I
think it's a very small price to pay for free CI. TravisCI pulling support
almost completely was a very different story, that caused months of pain.

Cheers,
Ralf


>
>
>
> On Wed, Apr 28, 2021 at 11:45 PM Andrew Nelson <andyfaff at gmail.com> wrote:
>
>>
>> https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks
>>
>> From the GH website:
>> When a first-time contributor submits a pull request to a public
>> repository, a maintainer with write access must approve any workflow runs.
>>
>> The purpose behind this is so that things like mining for Bitcoin is
>> prevented.
>>
>> A.
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210429/aa23255a/attachment.html>

From roy.pamphile at gmail.com  Thu Apr 29 04:04:58 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Thu, 29 Apr 2021 10:04:58 +0200
Subject: [SciPy-Dev] GitHub actions restrictions
In-Reply-To: <CABL7CQh6XmT63-h6R3j8sunH1498a4aXAAdtWXwD_aSuEmubgQ@mail.gmail.com>
References: <CAAbtOZd-QT+Md0zpiky3F=Z-VCNLFmMX5snBx337bKrwJ1KBEQ@mail.gmail.com>
 <CAEBuzr8259sbbpiRHFOqarvOu0hGesqCpZCudh1-KsAmg26aTw@mail.gmail.com>
 <CABL7CQh6XmT63-h6R3j8sunH1498a4aXAAdtWXwD_aSuEmubgQ@mail.gmail.com>
Message-ID: <D943D3CE-0FFC-4D51-B769-C0D2F0896CCC@gmail.com>

Hi,

I saw the button to allow running workflows. But I did not really get it as the other CI ran. If I had a complaint, it would be more in terms of equal integration/workflow across platforms.
Like having to click on 10 buttons just to see logs for Azure, less for CircleCI and none for GitHub actions.

I don?t think it?s a problem to allow a first-time contributor to use the CI as in the end, a maintainer will have to check the issue for reviews.
But I would prefer to have a global control acting on all CI.

And in general I would personally prefer to trigger manually the CI on feature branches.

Cheers,
Pamphile


> On 29.04.2021, at 09:53, Ralf Gommers <ralf.gommers at gmail.com> wrote:
> 
> 
> 
> On Thu, Apr 29, 2021 at 7:33 AM Ilhan Polat <ilhanpolat at gmail.com <mailto:ilhanpolat at gmail.com>> wrote:
> There are also a lot of complaints from OS maintainers to GitHub in the private OSS Feedback Group. It is indeed tiring for large and active projects to go through and click every PR. TravisCI took the path to kick out OS to combat and GH is doing this basically pushing the burden on the OS maintainers. A proposal was given to restrict this to cases in which only if the pull request touches sensitive files .github or .sh-alike however GH has not commented yet. I am not sure if that is going to be sufficient to block malicious code executions though. 
> 
> I don't think it's too bad. What GitHub does now seems sensible, and I think it's a very small price to pay for free CI. TravisCI pulling support almost completely was a very different story, that caused months of pain.
> 
> Cheers,
> Ralf
>  
> 
> 
> 
> On Wed, Apr 28, 2021 at 11:45 PM Andrew Nelson <andyfaff at gmail.com <mailto:andyfaff at gmail.com>> wrote:
> https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks <https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks>
> 
> From the GH website: 
> When a first-time contributor submits a pull request to a public repository, a maintainer with write access must approve any workflow runs.
> 
> The purpose behind this is so that things like mining for Bitcoin is prevented.
> 
> A.
> 
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org <mailto:SciPy-Dev at python.org>
> https://mail.python.org/mailman/listinfo/scipy-dev <https://mail.python.org/mailman/listinfo/scipy-dev>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org <mailto:SciPy-Dev at python.org>
> https://mail.python.org/mailman/listinfo/scipy-dev <https://mail.python.org/mailman/listinfo/scipy-dev>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org <mailto:SciPy-Dev at python.org>
> https://mail.python.org/mailman/listinfo/scipy-dev <https://mail.python.org/mailman/listinfo/scipy-dev>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210429/3adbd7d9/attachment-0001.html>

From spanda3 at jhu.edu  Thu Apr 29 07:36:29 2021
From: spanda3 at jhu.edu (Sambit Panda)
Date: Thu, 29 Apr 2021 11:36:29 +0000
Subject: [SciPy-Dev] ENH: Add Distance Correlation to scipy.stats
References: <96428ef5-f632-4294-9d15-5c57d98b578d@Spark>
Message-ID: <f216fb58-8741-4744-80da-b1365db4e8c9@Spark>

Hi all,

I hope that everyone was is doing well. I just wanted feedback on my issue supporting Distance Correlation within SciPy: https://github.com/scipy/scipy/issues/13728https://github.com/scipy/scipy/issues/13728. The core motivation for inclusion is mentioned in the issue:


> Multivariate independence testing is a desirable problem to solve in many fields such as connectomics. Looking through the scipy.stats module, there are very few multivariate tests. In fact, there is 1, multiscale_graphcorr<https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.multiscale_graphcorr.html>, which I had added a few months ago. Distance correlation (Dcorr) is another highly cited independence test that operates very well on multivariate data [1]. In fact, due to the fact the distance and kernel methods are equivalent [2], and that independence and two-sample testing is equivalent [3], we would be able to condense the implementations of Dcorr, Hilbert-Schmidt Independence Criterion (Hsic), Maximal Mean Discrepancy (MMD), and Energy into a single test. Our implementation also includes a number of optimizations, including a test statistic implementation that operates in n log n time instead of n^2 time when the data is univariate and the metric used is euclidean [4], and a chi-square approximation to the p-value that skips the computationally-intense permutation test [5]. Adding this provides another accurate multivariate test to SciPy, and makes the scipy.stats module as a whole far better.
>  1. https://projecteuclid.org/journals/annals-of-statistics/volume-35/issue-6/Measuring-and-testing-dependence-by-correlation-of-distances/10.1214/009053607000000505.full
> 2. https://link.springer.com/article/10.1007/s10182-020-00378-1
> 3. https://arxiv.org/abs/1910.08883
> 4. https://www.sciencedirect.com/science/article/abs/pii/S0167947319300313
> 5. https://arxiv.org/abs/1912.12150

As for modifications to the SciPy source code, they should be fairly minimal. I had received a comment that my idea was similar to this issue: https://github.com/scipy/scipy/issues/10680https://github.com/scipy/scipy/issues/10680, but the key difference is that I don?t propose adding the original package of the source code: https://github.com/neurodata/hyppohttps://github.com/neurodata/hyppo as a dependency. When I merged scipy.stats.multiscale_graphcorr, I added a number of helper functions to help me compute the test statistic. The majority of the modifications I propose will likely be to those helper functions, and a new function that computes the test statistic and p-value in a similar structure to those tests in scipy.stats. I mention what needs to be ported and locations where I need to make changes in the issue. It is also worth mentioning that similar Cython files exist in an old repository, in a less refined state, i.e. https://github.com/neurodata/mgcpy-old/blob/master/mgcpy/independence_tests/utils/distance_transform.pyxhttps://github.com/neurodata/mgcpy-old/blob/master/mgcpy/independence_tests/utils/distance_transform.pyx.

Licensing was also mentioned as a potential problem, but I don?t think that it should be. I created the hyppo package and give permission for inclusion within SciPy. I also believe that Apache License v2.0 is a fairly permissive and common open source license, right?

Sorry for the long email,

Sambit Panda
BME Ph.D. Student
NeuroData<https://neurodata.io/> @ Johns Hopkins
Website<https://sampan.me/>?GitHub<https://github.com/sampan501>?Scholar<https://scholar.google.com/citations?user=-V3CmPoAAAAJ&hl=en>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210429/243c1969/attachment-0001.html>

From roy.pamphile at gmail.com  Thu Apr 29 07:57:05 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Thu, 29 Apr 2021 13:57:05 +0200
Subject: [SciPy-Dev] ENH: Add Distance Correlation to scipy.stats
In-Reply-To: <f216fb58-8741-4744-80da-b1365db4e8c9@Spark>
References: <96428ef5-f632-4294-9d15-5c57d98b578d@Spark>
 <f216fb58-8741-4744-80da-b1365db4e8c9@Spark>
Message-ID: <961353F4-B7EE-43D5-BB4A-20D1DF8FE7DC@gmail.com>

Hi,

Thank you for following up with this email.

> Licensing was also mentioned as a potential problem, but I don?t think that it should be. I created the hyppo package and give permission for inclusion within SciPy. I also believe that Apache License v2.0 is a fairly permissive and common open source license, right?

Apache 2 is not compatible with 3 clause BSD as stated in our documentation (https://scipy.github.io/devdocs/dev/core-dev/index.html#licensing <https://scipy.github.io/devdocs/dev/core-dev/index.html#licensing>). That?s why I asked you about relicensing.
Since you agree and seem to be the copyright owner, there is no problem.

Cheers,
Pamphile


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210429/9233d7ef/attachment.html>

From warren.weckesser at gmail.com  Thu Apr 29 10:28:27 2021
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Thu, 29 Apr 2021 10:28:27 -0400
Subject: [SciPy-Dev] Can a PR be configured to run only the build of the
 docs?
In-Reply-To: <CAGu2niU1HMckaTyAKLaNj5-bC9L-uv3v2Ain9sBhv=YQn9C=PQ@mail.gmail.com>
References: <CAGzF1ueX1yz0mDBnhNDirQr2cQYyRVPWhjEoak86=Dv1oz4FUg@mail.gmail.com>
 <CABL7CQiLpYXiGmiBT5AOdAPK3Y6_h40L6529dbwdnTypvFpBaQ@mail.gmail.com>
 <CAGzF1ueWGbSTm1MK-Z_Kso9LvRRC2Viqgbrm=rhg32wDs9=3Mg@mail.gmail.com>
 <CAGzF1ueu1z7Mr3YLjci1ohMhz+85X1BeL1N9X3a05ZRqVRfiBw@mail.gmail.com>
 <CAGu2niU1HMckaTyAKLaNj5-bC9L-uv3v2Ain9sBhv=YQn9C=PQ@mail.gmail.com>
Message-ID: <CAGzF1ufa+nGuRrc6XrNDma5kRWThZckD6VD9uocQerEL6ycaVQ@mail.gmail.com>

On 4/28/21, Eric Larson <larson.eric.d at gmail.com> wrote:
>>
>>  Is there text that I can add to the commit message (or some other
>> method)
>> so that only the CircleCI jobs are run?
>>
>
> As of 22 days ago you can use [skip azp] (among other options) in your
> commit message to skip the Azure pipelines run:
>
> https://github.com/scipy/scipy/pull/13811
>

Thanks Eric, I'll use this when I update the PR.

Warren


> Feel free in your PR to add [skip azurepipelines] to the list of possible
> skip messages that's looked for.
>
> Eric
>

From roy.pamphile at gmail.com  Fri Apr 30 05:48:48 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Fri, 30 Apr 2021 11:48:48 +0200
Subject: [SciPy-Dev] SciPy Community Meetups are back!
Message-ID: <1605BE4D-E642-4CD7-848A-5FA3BBE56982@gmail.com>

Hi everyone,

I would like to propose regular SciPy Community meetings again!

https://doodle.com/poll/fd62d6755texie5s <https://doodle.com/poll/fd62d6755texie5s> 

For this first meeting, it would be good if as many people as possible could join as we would decide things like frequency of such meeting.

Everyone is invited and encouraged to
join in and edit the work-in-progress meeting topics and notes at:

https://hackmd.io/@tupui/scipy-meetup/edit <https://hackmd.io/@tupui/scipy-meetup/edit> 

Cheers,
Pamphile

PS. share the news: https://twitter.com/PamphileRoy/status/1388067651721830401?s=20 <https://twitter.com/PamphileRoy/status/1388067651721830401?s=20> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210430/f2ced4e1/attachment.html>

From rlucente at pipeline.com  Fri Apr 30 07:47:26 2021
From: rlucente at pipeline.com (rlucente at pipeline.com)
Date: Fri, 30 Apr 2021 07:47:26 -0400
Subject: [SciPy-Dev] SciPy Community Meetups are back!
In-Reply-To: <1605BE4D-E642-4CD7-848A-5FA3BBE56982@gmail.com>
References: <1605BE4D-E642-4CD7-848A-5FA3BBE56982@gmail.com>
Message-ID: <053701d73db6$97f00230$c7d00690$@pipeline.com>

I went to the links to sign up / show interest but I am embarrassed that I
couldn't find a place to do that

 
Any specific suggestions would be appreciated

 
From: SciPy-Dev <scipy-dev-bounces+rlucente=pipeline.com at python.org> On
Behalf Of Pamphile Roy
Sent: Friday, April 30, 2021 5:49 AM
To: SciPy Developers List <scipy-dev at python.org>
Subject: [SciPy-Dev] SciPy Community Meetups are back!

 
Hi everyone,

 
I would like to propose regular SciPy Community meetings again!

 
https://doodle.com/poll/fd62d6755texie5s 

 
For this first meeting, it would be good if as many people as possible could
join as we would decide things like frequency of such meeting.

 
Everyone is invited and encouraged to

join in and edit the work-in-progress meeting topics and notes at:


https://hackmd.io/@tupui/scipy-meetup/edit 

 
Cheers,

Pamphile

 
PS. share the news:
https://twitter.com/PamphileRoy/status/1388067651721830401?s=20 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210430/07291777/attachment.html>

From roy.pamphile at gmail.com  Fri Apr 30 10:28:35 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Fri, 30 Apr 2021 16:28:35 +0200
Subject: [SciPy-Dev] SciPy Community Meetups are back!
In-Reply-To: <053701d73db6$97f00230$c7d00690$@pipeline.com>
References: <1605BE4D-E642-4CD7-848A-5FA3BBE56982@gmail.com>
 <053701d73db6$97f00230$c7d00690$@pipeline.com>
Message-ID: <7A406548-B445-4615-A657-3FD7639DBB42@gmail.com>

Hi,

If you are talking about HackMD, you need to click on sign-in (top right end corner). You will see options to sign in with various options like using your GitHub profile,
and at the bottom of the popup there is a sign up.

For Doodle there should be nothing to do. Just enter your name and vote.

Thanks for your interest

Cheers,
Pamphile

> On 30.04.2021, at 13:47, <rlucente at pipeline.com> <rlucente at pipeline.com> wrote:
> 
> I went to the links to sign up / show interest but I am embarrassed that I couldn?t find a place to do that
>  
> Any specific suggestions would be appreciated
>  
> From: SciPy-Dev <scipy-dev-bounces+rlucente=pipeline.com at python.org <mailto:scipy-dev-bounces+rlucente=pipeline.com at python.org>> On Behalf Of Pamphile Roy
> Sent: Friday, April 30, 2021 5:49 AM
> To: SciPy Developers List <scipy-dev at python.org <mailto:scipy-dev at python.org>>
> Subject: [SciPy-Dev] SciPy Community Meetups are back!
>  
> Hi everyone,
>  
> I would like to propose regular SciPy Community meetings again!
>  
> https://doodle.com/poll/fd62d6755texie5s <https://doodle.com/poll/fd62d6755texie5s> 
>  
> For this first meeting, it would be good if as many people as possible could join as we would decide things like frequency of such meeting.
>  
> Everyone is invited and encouraged to
> join in and edit the work-in-progress meeting topics and notes at:
> 
> https://hackmd.io/@tupui/scipy-meetup/edit <https://hackmd.io/@tupui/scipy-meetup/edit> 
>  
> Cheers,
> Pamphile
>  
> PS. share the news: https://twitter.com/PamphileRoy/status/1388067651721830401?s=20 <https://twitter.com/PamphileRoy/status/1388067651721830401?s=20>_______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org <mailto:SciPy-Dev at python.org>
> https://mail.python.org/mailman/listinfo/scipy-dev <https://mail.python.org/mailman/listinfo/scipy-dev>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scipy-dev/attachments/20210430/cf6be193/attachment-0001.html>