[SciPy-User] NumPy Binomial BTPE method Problem

Josh Lawrence josh.k.lawrence at gmail.com
Wed Oct 3 16:05:55 EDT 2012


Also, the for loops should be i=m+1 and i=y+1 for the left and right
tails, respectively. Again, I do'nt think this tangibly changes
things, but the algorithm shows that you set i=m (or i=y), and the
first step of the loop in both cases is i=i+1. Here's a link to the
paper if you have access to ACM.

http://dl.acm.org/citation.cfm?id=42381 .

So I think it's just the two changes. I have implemented those and get
very similar results from doing a histogram.

On Wed, Oct 3, 2012 at 2:59 PM,  <josef.pktd at gmail.com> wrote:
> On Wed, Oct 3, 2012 at 3:07 PM,  <josef.pktd at gmail.com> wrote:
>> On Wed, Oct 3, 2012 at 2:42 PM, Josh Lawrence <josh.k.lawrence at gmail.com> wrote:
>>> Hey all,
>>>
>>> I received access to the paper and it seems it was originally based
>>> purely on the paper written by Kachitvichyanukul in 1988. I still
>>> think there's a whoopsies with the if ... else if ... else, block
>>> though.
>>
>> the c code "else" looks strange to me,
>> however, checking a few cases with large p*n for a large sample (1
>> million draws), I don't see any difference of the frequency count to
>> the theoretical distribution from scipy.binom.
>
>
> I'm pretty sure you are right.
> (If my reading as non c programmer is correct)
> The else block means that Step 50 is never used, instead it uses Step
> 52, which uses a different approximation that is intended for the
> tails.
> If Step 52 is relatively close to the result of Step 50, then it will
> not be very visible in the final results.
> >From my reading of the code there should be a small distortion around the mean.
>
> Josef
>
>>
>> (but with all the goto's I'm not sure if I really trigger that path.)
>>
>> Josef
>>
>>>
>>> On Wed, Oct 3, 2012 at 11:54 AM, Josh Lawrence
>>> <josh.k.lawrence at gmail.com> wrote:
>>>> Hello all,
>>>>
>>>> I am implementing a binomial random variable in MATLAB. The default
>>>> method in the statistics toolbox is extremely slow for large
>>>> population/trial size. I am needing to do trials for n as large as
>>>> 2**28. I found in NumPy some code that implements a binomial random
>>>> draw in numpy/random/mtrand/distributions.c. I was trying to convert
>>>> the code to MATLAB and the BTPE method seems to have an error in lines
>>>> 337-341 of distributions.c. The if ... else if ... else statement I
>>>> think is incorrect. I think it should be an if ... else ... statement
>>>> followed by the contents of the original else which starts on line
>>>> 337.
>>>>
>>>> The if ... else if ... else block is as follows:
>>>>
>>>> #### begin code snippet ####
>>>> if (m < y)
>>>> {
>>>>     for (i=m; i<=y; i++)
>>>>     {
>>>>         F *= (a/i - s);
>>>>     }
>>>> }
>>>> else if (m > y)
>>>> {
>>>>     for (i=y; i<=m; i++)
>>>>     {
>>>>         F /= (a/i - s);
>>>>     }
>>>> }
>>>> else
>>>> {
>>>>     if (v > F) goto Step10;
>>>>     goto Step60;
>>>> }
>>>> #### end code snippet ####
>>>>
>>>> From what I can tell, the variable F is only used in the comparison
>>>> within the else{} statment (i.e. the if(v > F) statement) and nowhere
>>>> else within the scope of the function.
>>>>
>>>> I also found a fortran implementation here:
>>>> http://wstein.org/home/wstein/www/home/mhansen/spkgs_in_progress/octave-3.2.4/src/libcruft/ranlib/ignbin.f
>>>> and it appears this is from where the code was originally adapted as
>>>> the variable names are the same.
>>>>
>>>> My parsing of fortran GOTOs is a bit rusty, but I think the contents
>>>> of the else block in above snippet should be not be conditional.
>>>>
>>>> I don't understand the underlying algorithm very well and don't have
>>>> access the the BTPE paper, so I can't comment on the validity of the
>>>> fortran code. There just seems to be an error in logic in the above
>>>> code. So please have someone who understands it look at it. It appears
>>>> Robert Kern wrote the function a decent portion of the file at some
>>>> point in the past.
>>>>
>>>> I hope this helps.
>>>>
>>>> Cheers,
>>>>
>>>> --
>>>> Josh Lawrence
>>>>
>>>>
>>>> P.S. I apologize if my email is inconvenient, but I could not figure
>>>> out how to tell gmail to set the reply-to field to be
>>>> scipy-user at scipy.org.
>>>
>>>
>>>
>>> --
>>> Josh Lawrence
>>> _______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user



-- 
Josh Lawrence



More information about the SciPy-User mailing list