[SciPy-User] fast small matrix multiplication with cython?
Skipper Seabold
jsseabold at gmail.com
Wed Dec 8 23:15:55 EST 2010
On Wed, Dec 8, 2010 at 8:08 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
> On Tue, Dec 7, 2010 at 7:37 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>>
>>
>> On Tue, Dec 7, 2010 at 5:09 PM, <josef.pktd at gmail.com> wrote:
>>>
>>> On Tue, Dec 7, 2010 at 6:45 PM, Charles R Harris
>>> <charlesr.harris at gmail.com> wrote:
>>> >
>>> >
>>> > On Tue, Dec 7, 2010 at 10:39 AM, Skipper Seabold <jsseabold at gmail.com>
>>> > wrote:
>>> >>
>>> >> On Tue, Dec 7, 2010 at 12:17 PM, Charles R Harris
>>> >> <charlesr.harris at gmail.com> wrote:
>>> >> >
>>> >> >
>>> >> > On Tue, Dec 7, 2010 at 10:05 AM, <josef.pktd at gmail.com> wrote:
>>> >>
>>> >> <snip>
>>> >>
>>> >> >> It's still a linear filter, non-linear optimization comes in because
>>> >> >> the exact loglikelihood function for ARMA is non-linear in the
>>> >> >> coefficients.
>>> >> >> (There might be a way to calculate the derivative in the same loop,
>>> >> >> but that's a different issue.)
>>> >> >>
>>> >> >
>>> >> > The unscented Kalman filter is a better way to estimate the
>>> >> > covariance
>>> >> > of a
>>> >> > non-linear process, think of it as a better integrator. If the
>>> >> > propagation
>>> >> > is easy to compute, which seems to be the case here, it will probably
>>> >> > save
>>> >> > you some time. You might even be able to use the basic idea and skip
>>> >> > the
>>> >> > Kalman part altogether.
>>> >> >
>>> >> > My general aim here is to optimize the algorithm first before getting
>>> >> > caught
>>> >> > up in the details of matrix multiplication in c. Premature
>>> >> > optimization
>>> >> > and
>>> >> > all that.
>>> >> >
>>> >>
>>> >> Hmm, I haven't seen this mentioned much in what I've been reading or
>>> >> in the documentation for existing ARMA software, so I never thought
>>> >> much about it. I will have a closer look. Well, Google turns up this
>>> >> thread...
>>> >>
>>> >
>>> > I've started reading up a bit on what you are doing and the application
>>> > doesn't use extended Kalman filters, so the suggestion to use unscented
>>> > Kalman filters is irrelevant. Sorry about that ;) I'm still wading
>>> > through
>>> > the various statistical notation thickets to see if there might be a
>>> > better
>>> > form to use for the problem but I don't see one at the moment.
>>>
>>> There are faster ways to get the likelihood for a simple ARMA process
>>> than using a Kalman Filter. I think the main advantage, and the reason
>>> for the popularity of the Kalman Filter here, is that it is easier to
>>> extend. So using too many tricks that are specific to the simple ARMA
>>> case might take away much of the advantage of getting a fast Kalman Filter.
>>>
>>> I didn't read much of the details for the Kalman Filter for this, but
>>> that was my conclusion from the non-Kalman Filter literature.
>>>
>
> That's the idea. The advantages of the KF are that it's inherently
> structural and it's *very* general. The ARMA case was just a jumping
> off point, but has also proved to be a sticking point. I'd like to
> have a fast and general linear Gaussian KF available for larger state
> space models, as it's the baseline workhorse for estimating linearized
> large macroeconomic models at the moment.
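For what it's worth, the common-form filter for this case is fairly short. Here is a minimal sketch of the Gaussian ARMA loglikelihood via the Kalman filter in Harvey's state-space form; the function name, the stationary initialization via the vec trick, and the absence of diffuse initialization or numerical safeguards are all my own simplifications, not the kalmanf.py implementation:

```python
import numpy as np

def arma_loglike(y, phi, theta, sigma2=1.0):
    """Gaussian loglikelihood of an ARMA(p, q) series via the Kalman filter.

    Harvey's state-space form with state dimension m = max(p, q + 1):
        y_t = Z a_t,    a_{t+1} = T a_t + R eta_t,   eta_t ~ N(0, sigma2).
    A sketch only -- no diffuse initialization or numerical safeguards.
    """
    p, q = len(phi), len(theta)
    m = max(p, q + 1)
    T = np.zeros((m, m))
    T[:p, 0] = phi                 # AR coefficients in the first column
    T[:-1, 1:] = np.eye(m - 1)     # companion-shift structure
    R = np.zeros(m)
    R[0] = 1.0
    R[1:q + 1] = theta             # MA coefficients
    Z = np.zeros(m)
    Z[0] = 1.0
    Q = sigma2 * np.outer(R, R)

    # stationary initialization: solve vec(P) = (I - T kron T)^{-1} vec(Q)
    P = np.linalg.solve(np.eye(m * m) - np.kron(T, T), Q.ravel()).reshape(m, m)
    a = np.zeros(m)

    loglike = 0.0
    for yt in y:
        v = yt - a[0]              # innovation (Z a is the first state element)
        F = P[0, 0]                # innovation variance (no measurement noise)
        K = T @ P[:, 0] / F        # Kalman gain
        loglike += -0.5 * (np.log(2 * np.pi * F) + v * v / F)
        a = T @ a + K * v
        L = T - np.outer(K, Z)
        P = T @ P @ L.T + Q        # the O(m^3) step the fast recursions skip
    return loglike
```

For white noise (p = q = 0) this reduces to the iid normal loglikelihood, which is an easy sanity check.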
>
>>
>> Well, there are five forms of the standard Kalman filter that I am somewhat
>> familiar with, and some are better suited to certain applications than others.
>> But at this point I don't see any reason not to use the common
>> form for the ARMA case. It would be interesting to see some profiling, since
>> the matrix inversions are likely to dominate as the number of variables goes
>> up.
>>
>
> If interested, I am using Durbin and Koopman's "Time Series Analysis
> by State Space Methods" and Andrew Harvey's "Forecasting, Structural
> Time Series Models, and the Kalman Filter" as my main references for
> this. The former is nice and concise but has a lot of details,
> suggestions, and use cases.
>
> I have looked some more, and it does seem that the filter converges to
> its steady state after maybe 2% of the iterations, depending on the
> properties of the series. So for the ARMA case I can switch to the
> fast recursions, updating only the state (not quite sure of the time
> savings yet), but I am moving away from my goal of a fast and general
> KF implementation...
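The switch to the fast recursions can be sketched as follows: once the innovation variance F (and hence the gain K) stops changing, freeze both and update only the state, skipping the covariance update entirely. The function name and the convergence test on F alone are illustrative; a careful filter would check the whole P matrix:

```python
import numpy as np

def arma_loglike_fast(y, T, Z, Q, a0, P0, tol=1e-8):
    """Kalman filter loglikelihood with a switch to steady-state recursions.

    Once F stops changing (to within `tol`), K and F are frozen and only
    the state `a` is updated, avoiding the O(m^3) covariance update.
    A sketch with illustrative names, assuming a univariate observation.
    """
    a, P = a0.copy(), P0.copy()
    loglike = 0.0
    F_prev = np.inf
    converged = False
    for yt in y:
        if not converged:
            F = P[0, 0]                    # innovation variance
            K = T @ P[:, 0] / F            # gain
            L = T - np.outer(K, Z)
            P = T @ P @ L.T + Q            # full covariance update
            if abs(F - F_prev) < tol:
                converged = True           # freeze K and F from here on
            F_prev = F
        v = yt - a[0]
        loglike += -0.5 * (np.log(2 * np.pi * F) + v * v / F)
        a = T @ a + K * v                  # state-only ("fast") recursion
    return loglike
```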
>
> About the matrix inversions: the ARMA model right now is only
> univariate, so there is no real inverting of matrices. For larger,
> multivariate cases, Durbin and Koopman suggest splitting the problem
> into a series of univariate updates in order to avoid inversion.
> In the book they provide some benchmarks on computational efficiency,
> in terms of the number of multiplications needed, based on their
> experience writing http://www.ssfpack.com/index.html.
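That univariate treatment amounts to processing the observation vector one element at a time, so each step divides by a scalar F_i instead of inverting an F matrix. A sketch of one time step, with illustrative names, assuming the measurement-noise covariance is diagonal (for correlated noise, Durbin and Koopman transform the observations first):

```python
import numpy as np

def sequential_update(a, P, y_t, Z, H_diag):
    """One multivariate Kalman measurement update done element by element
    (Durbin & Koopman's univariate treatment).  Assumes the measurement
    noise covariance is diagonal, so every step is scalar arithmetic and
    no matrix inverse is needed.  Illustrative sketch only."""
    loglike = 0.0
    for i in range(len(y_t)):
        Zi = Z[i]                       # i-th row of the design matrix
        v = y_t[i] - Zi @ a             # scalar innovation
        F = Zi @ P @ Zi + H_diag[i]     # scalar innovation variance
        K = P @ Zi / F                  # gain (a vector)
        a = a + K * v
        P = P - np.outer(K, Zi @ P)     # rank-one downdate, no inversion
        loglike += -0.5 * (np.log(2 * np.pi * F) + v * v / F)
    return a, P, loglike
```

With a diagonal H this gives the same posterior as the joint update with F inverted as a matrix, which is where the multiplication-count savings come from.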
>
> Skipper
>
It looks like I don't save too much time with just Python/scipy
optimizations. Apparently ~75% of the time is spent in l-bfgs-b,
judging by its user time output and the profiler's CPU time output(?).
Non-Cython versions:
Brief and rough profiling on my laptop for ARMA(2,2) with 1000
observations. Optimization uses fmin_l_bfgs_b with m = 12 and iprint
= 0.
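For reference, timings like those below can be produced by wrapping the optimizer call in cProfile. The objective here is a stand-in quadratic, since kalmanf.py's loglike is not shown; the original runs also passed iprint = 0, omitted here:

```python
import cProfile
import pstats
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

# Stand-in objective: in the original runs this was the negative
# ARMA loglikelihood from kalmanf.py.
def neg_loglike(params):
    return np.sum((params - np.array([0.5, -0.3])) ** 2)

profiler = cProfile.Profile()
profiler.enable()
xopt, fopt, info = fmin_l_bfgs_b(neg_loglike, x0=np.zeros(2),
                                 approx_grad=True, m=12)
profiler.disable()

# same ordering as the listings below: by internal (tottime) time
pstats.Stats(profiler).sort_stats("tottime").print_stats(10)
```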
Full Kalman Filter, starting parameters found via iterations in Python
-----------------------------------------------------------------------------------------------------
1696041 function calls (1695957 primitive calls) in 7.622 CPU seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
114 3.226 0.028 5.123 0.045 kalmanf.py:504(loglike)
1368384 1.872 0.000 1.872 0.000 {numpy.core._dotblas.dot}
102 1.736 0.017 2.395 0.023 arima.py:196(loglike_css)
203694 0.622 0.000 0.622 0.000 {sum}
90 0.023 0.000 0.023 0.000 {numpy.linalg.lapack_lite.dgesdd}
218 0.020 0.000 0.024 0.000 arima.py:117(_transparams)
46 0.015 0.000 0.016 0.000 function_base.py:494(asarray_chkfinite)
1208 0.013 0.000 0.013 0.000 {numpy.core.multiarray.array}
102163 0.012 0.000 0.012 0.000 {method 'append' of 'list' objects}
46 0.010 0.000 0.028 0.001 decomp_svd.py:12(svd)
<snip>
Full Kalman Filter, starting parameters found with scipy.signal.lfilter
--------------------------------------------------------------------------------------------------
1249493 function calls (1249409 primitive calls) in 4.596 CPU seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
102 2.862 0.028 4.437 0.043 kalmanf.py:504(loglike)
1224360 1.556 0.000 1.556 0.000 {numpy.core._dotblas.dot}
270 0.029 0.000 0.029 0.000 {sum}
90 0.025 0.000 0.025 0.000 {numpy.linalg.lapack_lite.dgesdd}
194 0.018 0.000 0.021 0.000 arima.py:117(_transparams)
46 0.016 0.000 0.017 0.000 function_base.py:494(asarray_chkfinite)
46 0.011 0.000 0.029 0.001 decomp_svd.py:12(svd)
<snip>
Kalman Filter with fast recursions, starting parameters with lfilter
---------------------------------------------------------------------------------------------
1097454 function calls (1097370 primitive calls) in 4.465 CPU seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
90 2.860 0.032 4.305 0.048 kalmanf.py:504(loglike)
1073757 1.431 0.000 1.431 0.000 {numpy.core._dotblas.dot}
270 0.029 0.000 0.029 0.000 {sum}
90 0.025 0.000 0.025 0.000 {numpy.linalg.lapack_lite.dgesdd}
182 0.016 0.000 0.019 0.000 arima.py:117(_transparams)
46 0.016 0.000 0.018 0.000 function_base.py:494(asarray_chkfinite)
46 0.011 0.000 0.030 0.001 decomp_svd.py:12(svd)
<snip>
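On the thread's subject line: the profiles above show over a million calls to numpy's dot on tiny matrices, and for matrices that small the per-call overhead swamps the arithmetic, which is the case for moving the loop to Cython. A rough way to see it (timings are machine-dependent, so no numbers claimed; the unrolled version here is only meant to isolate the call overhead):

```python
import timeit
import numpy as np

A = np.random.rand(2, 2)
B = np.random.rand(2, 2)

def dot_unrolled(a, b):
    # hand-unrolled 2x2 matrix multiply, returned as nested tuples
    return ((a[0, 0] * b[0, 0] + a[0, 1] * b[1, 0],
             a[0, 0] * b[0, 1] + a[0, 1] * b[1, 1]),
            (a[1, 0] * b[0, 0] + a[1, 1] * b[1, 0],
             a[1, 0] * b[0, 1] + a[1, 1] * b[1, 1]))

t_dot = timeit.timeit(lambda: np.dot(A, B), number=100000)
t_unrolled = timeit.timeit(lambda: dot_unrolled(A, B), number=100000)
print(f"np.dot: {t_dot:.3f}s   unrolled: {t_unrolled:.3f}s")
```

In Cython with typed buffers the unrolled version compiles down to plain C arithmetic, which is where the real win would come from.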
Skipper