[SciPy-User] incomplete beta function B(x; k+1, 0) ?
josef.pktd at gmail.com
Sun Jun 27 12:51:05 EDT 2010
On Sat, Jun 26, 2010 at 6:56 PM, Pauli Virtanen <pav at iki.fi> wrote:
> Sat, 26 Jun 2010 17:59:09 -0400, josef.pktd wrote:
>> Is there a incomplete beta function with a zero argument B(x; k+1, 0)
>> available in scipy?
>>
>> >>> special.betainc(np.arange(5)+1, 0, 0.5)
>> array([ NaN, NaN, NaN, NaN, NaN])
>
> >>> print special.betainc.__doc__
> betainc(x1, x2, x3[, out])
> y=betainc(a,b,x) returns the incomplete beta integral of the
> arguments, evaluated from zero to x: gamma(a+b) / (gamma(a)*gamma(b))
> * integral(t**(a-1) (1-t)**(b-1), t=0..x).
>
> So the function in Scipy has an additional normalization prefactor. The
> prefactor however seems to be zero for b=0, so this function in Scipy
> probably can't easily be used for finding the value of the integral at
> b=0.
>
> But if you just set b=1e-99 (or anything smaller than the machine
> epsilon) and divide the prefactor away, you should end up with the result
> you are looking for.
>
> At b=0 the integral is logarithmically divergent towards x->1, and a
> finite b > 0 cuts this off around x ~ 1 - exp(-1/b) --- so it shouldn't
> matter. The rest of the integrand should also saturate at b ~ machine
> epsilon.
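[Editor's note: the workaround described above can be sketched as follows. This is a sketch under the thread's own assumption that scipy's betainc stays accurate for b near machine epsilon (which, per the discontinuity shown below, breaks down for x very close to 1); the helper name `incbeta_b0` is ours, not scipy's.]

```python
import numpy as np
from scipy import special

def incbeta_b0(a, x, b=1e-16):
    """Approximate the unnormalized incomplete beta B(x; a, 0).

    scipy's betainc(a, b, x) is the *regularized* integral, i.e. it
    carries the prefactor gamma(a+b) / (gamma(a)*gamma(b)).  For a
    tiny b > 0 (here b ~ machine epsilon, as suggested in the thread)
    we divide that prefactor away to recover the plain integral of
    t**(a-1) * (1-t)**(b-1) from 0 to x.
    """
    prefactor = special.gamma(a + b) / (special.gamma(a) * special.gamma(b))
    return special.betainc(a, b, x) / prefactor

# Sanity check against the closed form B(x; 1, 0) = -log(1 - x):
# incbeta_b0(1.0, 0.5) should be close to np.log(2).
```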
>
> But apparently, there's something fishy in the betainc algorithm for x
> close to 1 and b close to 0, for example this discontinuity:
>
> >>> sc.betainc(1, 1.000002e-6, 1 - 1e-6)
> 1.3815442755027441e-05
> >>> sc.betainc(1, 1.000001e-6, 1 - 1e-6)
> 1.2397031262149499e-05
>
> So while things are OK in principle, you probably shouldn't trust the
> results beyond x > 1 - 1e-3.
Thanks, it works this way, including the low precision for x close to 1:
>>> from scipy.special import gamma
>>> a = np.arange(20)
>>> b = 1.000002e-16
>>> x = gamma(a+b) / (gamma(a)*gamma(b))   # prefactor, ~b for a > 0
>>> p=0.5; (1 + special.betainc(np.arange(20)+1, 1.000002e-16, p)/x/np.log(1-p))[1:] - stats.logser.cdf(np.arange(1,20),p)
array([ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
-1.11022302e-16, -1.11022302e-16, -1.11022302e-16,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
1.11022302e-16, 1.11022302e-16, 1.11022302e-16,
1.11022302e-16])
>>> p=1-1e-6; (1 + special.betainc(np.arange(20)+1, 1.000002e-16, p)/x/np.log(1-p))[1:] - stats.logser.cdf(np.arange(1,20),p)
array([ 0.10246633, 0.10226642, 0.10206728, 0.10186891, 0.1016713 ,
0.10147445, 0.10127835, 0.101083 , 0.10088839, 0.10069451,
0.10050137, 0.10030896, 0.10011726, 0.09992629, 0.09973603,
0.09954648, 0.09935764, 0.09916949, 0.09898204])
>>> p=1-1e-3; (1 + special.betainc(np.arange(20)+1, 1.000002e-16, p)/x/np.log(1-p))[1:] - stats.logser.cdf(np.arange(1,20),p)
array([ 1.16573418e-14, -1.38777878e-16, -6.77236045e-15,
9.88098492e-15, 3.65263375e-14, -3.55271368e-15,
-1.98729921e-14, 1.66533454e-16, 3.55271368e-15,
1.42663659e-14, 7.10542736e-15, -1.14908083e-14,
2.05946371e-14, 1.94289029e-15, 1.62647673e-14,
1.17683641e-14, -2.16493490e-15, 2.55351296e-15,
1.35447209e-14])
>>> p=1-1e-4; (1 + special.betainc(np.arange(20)+1, 1.000002e-16, p)/x/np.log(1-p))[1:] - stats.logser.cdf(np.arange(1,20),p)
array([ 3.78329849e-06, 3.70895520e-06, 3.63619061e-06,
3.56496856e-06, 3.49525364e-06, 3.42701179e-06,
3.36020887e-06, 3.29481244e-06, 3.23079046e-06,
3.16811184e-06, 3.10674608e-06, 3.04666343e-06,
2.98783512e-06, 2.93023291e-06, 2.87382897e-06,
2.81859665e-06, 2.76450961e-06, 2.71154229e-06,
2.65966984e-06])
So, I guess this is not really worth it, since the generic calculation
is just a sum of relatively simple terms; only for p close to 1 does
the sum have a larger number of terms.
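[Editor's note: the "sum of relatively simple terms" Josef refers to is the log-series CDF written as a direct finite sum, P(X <= k) = -(1/log(1-p)) * sum_{j=1..k} p**j / j. A minimal sketch, with `logser_cdf_sum` as our own illustrative name:]

```python
import numpy as np
from scipy import stats

def logser_cdf_sum(k, p):
    """Log-series CDF via direct summation of the pmf terms.

    pmf(j) = -p**j / (j * log(1-p)), so the CDF is a finite sum of
    k simple terms -- many terms are needed only when p is close to 1.
    """
    j = np.arange(1, k + 1)
    # -log1p(-p) == -log(1-p) > 0, computed accurately for p near 1
    return np.sum(p ** j / j) / -np.log1p(-p)
```

For p = 0.5 this agrees with `stats.logser.cdf` to machine precision, without going through betainc at all.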
Josef
>
> --
> Pauli Virtanen
>