[SciPy-Dev] SciPy-Dev Digest, Vol 191, Issue 15

rlucas7 at vt.edu rlucas7 at vt.edu
Wed Sep 18 14:36:50 EDT 2019


 
> On Sep 18, 2019, at 9:45 AM, scipy-dev-request at python.org wrote:
> 
> Today's Topics:
> 
>   1. Re: improvement to binned statistic (Ralf Gommers)
>   2. Adding alpha complexes/filtrations to scipy.spatial?
>      (Hamilton, Wesley)
>   3. Re: Improvement to regular grid interpolation (Simon S. Clift)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Wed, 18 Sep 2019 15:02:17 +0200
> From: Ralf Gommers <ralf.gommers at gmail.com>
> To: SciPy Developers List <scipy-dev at python.org>
> Subject: Re: [SciPy-Dev] improvement to binned statistic
> Message-ID:
>    <CABL7CQhHJ-qJmbNnmJeGYATLKZQZCc6z9EB-RivXxKBUo8pscA at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
> 
> Hi Edouard,
> 
> 
> On Wed, Sep 18, 2019 at 11:29 AM Edouard Goudenhoofdt <egouden at gmail.com>
> wrote:
> 
>> Dear scipy developers,
>> 
>> One could use scipy.stats.binned_statistic_dd for the same sample points
>> but for values available at different times.
>> Currently this involves the computation of the bin numbers every time the
>> function is called.
>> Therefore I would like to add an optional argument "binnumbers" to skip
>> this step when calling the function again.
>> 
> 
> That seems sensible. Could you check that creating the bin numbers really
> takes the majority of the time? There's also a fair amount of input
> validation that shouldn't be skipped even when a new `binnumbers` is passed
> in. If that is the case, sending a PR with a benchmark would be very
> welcome.
> 
> Cheers,
> Ralf
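
One rough way to look at Ralf's timing question is to time the bin-number step separately from the per-call reduction. The sketch below is a plain NumPy stand-in, not SciPy's internal code path, and the array sizes, variable names and uniform edges are all made up for illustration. A real benchmark for a PR would time binned_statistic_dd() itself.

```python
import time

import numpy as np

rng = np.random.default_rng(0)
n_points, n_times, n_bins = 200_000, 5, 50   # illustrative sizes only

sample = rng.random((n_points, 2))           # fixed sample points in [0, 1)^2
values = rng.random((n_times, n_points))     # one row of values per time step
edges = np.linspace(0.0, 1.0, n_bins + 1)    # same edges in both dimensions

# Step 1: bin numbers. This is the work that would be reused across calls.
t0 = time.perf_counter()
ix = np.digitize(sample[:, 0], edges) - 1    # 0-based bin index along x
iy = np.digitize(sample[:, 1], edges) - 1    # 0-based bin index along y
flat = np.ravel_multi_index((ix, iy), (n_bins, n_bins))
t_bin = time.perf_counter() - t0

# Step 2: per-time sums that reuse the cached bin numbers.
t0 = time.perf_counter()
sums = np.stack(
    [np.bincount(flat, weights=v, minlength=n_bins * n_bins) for v in values]
).reshape(n_times, n_bins, n_bins)
t_reduce = time.perf_counter() - t0

print(f"bin numbers: {t_bin:.4f} s, {n_times} reductions: {t_reduce:.4f} s")
```

If the first timing does not dominate for realistic sizes, the extra argument probably buys little.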

IIUC, Edouard, what you’d like to do is run binned_statistic_dd() on some input data and then reuse the bin edges calculated in that first call, either on a new input dataset or on the same data (perhaps to calculate a different statistic?).

AFAIK the binned_statistic_dd() function isn’t able to take bin edges as an argument. If you want multiple statistics for the same data, I think you can achieve that with a custom callable that returns multiple statistics rather than a single scalar, but I haven’t tried this, so you should confirm that the approach works.
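
For the "same sample points, values at different times" case in the original request, another option that needs no new argument may be to pass all value arrays in one call, since the documented `values` parameter accepts a list of (N,) arrays and computes the statistic on each independently. A minimal sketch with made-up data (the points, values and variable names are only for illustration), which only helps when all time steps are available up front:

```python
import numpy as np
from scipy import stats

# Five fixed sample points in the unit square (illustrative data).
x = np.array([0.25, 0.75, 0.25, 0.75, 0.25])
y = np.array([0.25, 0.25, 0.75, 0.75, 0.25])
sample = np.column_stack([x, y])             # shape (N, D) = (5, 2)

# One row of values per time step, all measured at the same sample points.
values_by_time = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],               # time step 0
    [10.0, 20.0, 30.0, 40.0, 50.0],          # time step 1
])

# Single call: the sample is binned once, each row of values is reduced.
result = stats.binned_statistic_dd(
    sample,
    values_by_time,
    statistic="sum",
    bins=2,
    range=[(0.0, 1.0), (0.0, 1.0)],
)

stat = result.statistic                      # shape (n_times, nx, ny)
print(stat.shape)
print(stat[0])
```

This does not cover values that arrive later, which is where reusing the bin numbers would still be needed.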

If you want to take that up I’m happy to review the PR. 

If not, and others agree this is useful and should be implemented, it seems reasonable to do. I can implement it if you don’t have time or are otherwise unable to open a PR.

Let me know either way. 

-Lucas Roberts

