[SciPy-Dev] Bootstrap confidence limits code

Thu Aug 9 15:32:28 EDT 2012

Replying to Skipper's message to get the Statsmodels folks...

On Wed, Aug 8, 2012 at 11:57 AM, Skipper Seabold <jsseabold at gmail.com> wrote:
> Hi,
>
> On Wed, Aug 8, 2012 at 2:38 PM, Constantine Evans <cevans at evanslabs.org>
> wrote:
>>
>> Hello everyone,
>>
>> A few years ago I implemented a scikit for bootstrap confidence limits
>> (https://github.com/cgevans/scikits-bootstrap). I didn’t think much
>> about it after that until recently, when I realized that some people
>> are actually using it, and that there’s apparently been some talk
>> about implementing this functionality in either scipy.stats or
>> statsmodels (I should thank Randal Olson for discussing this and
>> bringing it to my attention).
>>
>> As such I’ve rewritten most of the code, and written up some
>> docstrings. The current code can do confidence intervals with basic
>> percentile interval, bias-corrected accelerated, and approximate
>> bootstrap confidence methods, and can also provide bootstrap and
>> jackknife indexes. Most of it is implemented from the descriptions in
>> Efron and Tibshirani’s Introduction to the Bootstrap, but the ABC code
>> at the moment is a port from the modified-BSD-licensed bootstrap
>> package for R (not the boot package) as I’m not entirely confident in
>> my understanding of the method.

I can't comment on the ABC method, but your BCA method appears to be
consistent with my own implementation.

>> And so, I have a few questions for everyone:
>>
>> * Is there any interest in including this sort of code in either
>> scipy.stats or statsmodels? If so, where do people think would be the
>> better place? The code is relatively small; at the moment it is less
>> than 200 lines, with docstrings probably making up 100 of those lines.
>
>
> I think it would be great to have this in statsmodels. I filed an
> enhancement ticket about it this morning (also brought to my attention by
> Randy's blog post).
>
> https://github.com/statsmodels/statsmodels/issues/420
>

As a user, I would also love to see this in statsmodels.

>>
>> * Also, if so, what would need to be changed, added, and improved
>> beyond what is mentioned in the Contributing to Scipy part of the
>> reference guide? I’m never a fan of my own code, and imagine quite a
>> bit would need to be fixed; I know tests will need to be added too.

I can only speak to the BCA method, but I propose the following when
you compute the acceleration:
https://gist.github.com/3307341

Everyone's data is different, and probably 99.99% of the time SCD
won't turn out to be 0 and raise a ZeroDivisionError, but it happened
to me and that's how I fixed it. Just a thought.
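
For anyone without the gist to hand, the kind of guard meant here can be
sketched roughly as follows. This is not the gist's code: it assumes the
usual jackknife formula for the acceleration from Efron and Tibshirani,
and the names jackknife_values and bca_acceleration, as well as the
choice to fall back to an acceleration of 0 when the denominator
vanishes, are assumptions made for illustration.

```python
from statistics import mean, median


def jackknife_values(data, statistic):
    """Leave-one-out values of `statistic` (hypothetical helper)."""
    return [statistic(data[:i] + data[i + 1:]) for i in range(len(data))]


def bca_acceleration(data, statistic=mean):
    """Jackknife estimate of the BCa acceleration constant.

    Assumed formula: a = sum(d**3) / (6 * sum(d**2)**1.5), where d is the
    mean of the jackknife values minus each jackknife value.
    """
    jack = jackknife_values(list(data), statistic)
    center = mean(jack)
    diffs = [center - value for value in jack]
    numerator = sum(d ** 3 for d in diffs)
    denominator = 6.0 * sum(d ** 2 for d in diffs) ** 1.5
    # Assumption: when every jackknife value is identical the denominator
    # is 0, so return 0.0 instead of letting the division raise
    # ZeroDivisionError.
    if denominator == 0.0:
        return 0.0
    return numerator / denominator


# The median of this sample is unchanged by dropping any single point,
# so the unguarded division would fail here.
flat = bca_acceleration([1, 2, 2, 2, 3], median)
skewed = bca_acceleration([1.0, 2.0, 10.0])
```

With an acceleration of 0 the BCa interval reduces to the plain
bias-corrected interval, which seems a reasonable fallback, though
raising a more descriptive error would be another defensible choice.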

Cheers,
-paul