[SciPy-User] vectorized cumulative integration?

Skipper Seabold jsseabold at gmail.com
Thu May 12 16:31:49 EDT 2011


On Thu, May 12, 2011 at 1:47 PM,  <josef.pktd at gmail.com> wrote:
> On Thu, May 12, 2011 at 12:30 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
>> I have a pdf that I want to integrate from -np.inf to each point in
>> the support to get the cdf. Right now, I can use list comprehension to
>> do something like
>>
>> cdf = [integrate.quad(pdf, -np.inf, end, args=(some_data_array,))[0]
>> for end in support]
>>
>> But this takes a few seconds. Alternatively, I can do
>>
>> cdf = integrate.cumtrapz(pdf_estimate, support)
>> lower_tail = integrate.quad(pdf, -np.inf, support[0],
>> args=(some_data_array,))[0]
>> cdf = np.r_[lower_tail, cdf + lower_tail]
>>
>> The latter seems like it might be a crude approximation, though I'm
>> not sure. The former is too slow. Any other ideas? I tried to
>> vectorize the former approach like the generic cdf in
>> stats.distributions, but since my pdf takes an array argument I had
>> some trouble and it would take some ugly workarounds I think.
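To make the two approaches concrete, here is a minimal, self-contained sketch. Since the original pdf and `some_data_array` are not shown, it uses `scipy.stats.norm.pdf` as a hypothetical stand-in density; note that `integrate.quad` returns a `(value, abserr)` tuple, so element `[0]` is taken, and that `cumtrapz` is spelled `cumulative_trapezoid` in current SciPy:

```python
import numpy as np
from scipy import integrate, stats

# Hypothetical stand-in for the pdf in the question (the real one takes
# some_data_array as an extra argument).
pdf = stats.norm.pdf
support = np.linspace(-3, 3, 201)

# Approach 1: one quad per point, integrating from -inf every time.
# quad returns (value, abserr), hence the [0].
cdf_quad = np.array([integrate.quad(pdf, -np.inf, end)[0]
                     for end in support])

# Approach 2: cumulative trapezoid over the evaluated pdf, plus a single
# quad for the tail below support[0].  cumulative_trapezoid returns one
# fewer element than its input, so prepend the lower tail.
pdf_vals = pdf(support)
lower_tail = integrate.quad(pdf, -np.inf, support[0])[0]
cdf_trapz = np.r_[lower_tail,
                  lower_tail + integrate.cumulative_trapezoid(pdf_vals, support)]
```

On a reasonably fine grid the trapezoid result agrees with the per-point quad result to several decimal places, which suggests the "crude approximation" worry is mostly about grid resolution.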
>
> I don't have a solution, but you could try piecewise quad, then at
> least you don't have to integrate the full range each time
>
> probs = [integrate.quad(pdf, end - delta, end, args=(some_data_array,))[0]
> for end in support[1:]]
> cdf = np.cumsum(probs)
>
> or better, quad the intervals support[i-1] to support[i] for i in
> range(1, len(support)) (or use a proper pair iterator)
>
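A sketch of that piecewise-quad suggestion, again using `scipy.stats.norm.pdf` as a hypothetical stand-in for the actual pdf: each quad call only covers one short interval between adjacent grid points, so nothing is integrated from -inf more than once, and the cumulative sum recovers the cdf.

```python
import numpy as np
from scipy import integrate, stats

# Stand-in pdf (assumption; the original takes a data array argument).
pdf = stats.norm.pdf
support = np.linspace(-3, 3, 201)

# One quad for the tail below the grid, then one quad per adjacent pair
# of grid points (a "proper pair iterator" via zip).
lower_tail = integrate.quad(pdf, -np.inf, support[0])[0]
probs = [integrate.quad(pdf, a, b)[0]
         for a, b in zip(support[:-1], support[1:])]

# Prepend 0 so the cdf has one value per support point.
cdf = lower_tail + np.r_[0.0, np.cumsum(probs)]
```

Because each interval is integrated adaptively by quad, this keeps quad's accuracy while avoiding the repeated full-range integrations that made the list-comprehension version slow.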

Ah right, this one is much faster for a first shot. Thanks,

Skipper


