[SciPy-user] Performance problem/suggestion for scikits.timeseries.convert

Abiel X Reinhart abiel.x.reinhart at jpmchase.com
Tue Apr 7 12:17:29 EDT 2009


I have recently begun working with the useful scikits.timeseries package, and noticed some performance issues in the ts.convert() function. For example, when converting 1000 monthly values to a quarterly frequency using the ma.mean() function, it took me about 0.6 seconds. This isn't that bad, but it definitely can become an issue when working with many series or longer timespans.

After looking at the scikits.timeseries source code, I found that essentially all of the delay was coming from the ma.apply_along_axis() call inside the _convert1d() function. I am not that familiar with the numpy functions, but it seems that ma.apply_along_axis() can be rather slow. For instance, consider the following code:

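import numpy as np
import numpy.ma as ma

a = np.arange(300000).reshape(30000,10)
b = ma.mean(a,-1)                        # one vectorized reduction over the last axis
c = ma.apply_along_axis(ma.mean, -1, a)  # calls ma.mean once per row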

In this example, b and c hold the same values, but b is generated much more quickly. My system was always able to generate b in less than 0.02 seconds, but took about 4.3 seconds to generate c.
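The comparison above can be reproduced with a small timing script. This is only a sketch: the array is ten times smaller than in the example so that the slow path finishes quickly, the helper names direct() and along_axis() are made up for illustration, and the absolute timings will differ from machine to machine.

```python
import timeit

import numpy as np
import numpy.ma as ma

# Smaller than the 30000x10 array above, so the slow path stays short.
a = np.arange(30000).reshape(3000, 10)

def direct():
    # One vectorized reduction over the last axis.
    return ma.mean(a, -1)

def along_axis():
    # One Python-level call to ma.mean per row.
    return ma.apply_along_axis(ma.mean, -1, a)

b = direct()
c = along_axis()
same = bool(np.allclose(np.asarray(b), np.asarray(c)))

t_direct = timeit.timeit(direct, number=3)
t_along = timeit.timeit(along_axis, number=3)

print("results equal:", same)
print("direct: %.4f s   apply_along_axis: %.4f s" % (t_direct, t_along))
```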

Perhaps an improvement could be made to the convert() function by recognizing a standard set of built-in numpy functions like ma.mean and applying the method used to generate "b" above, and only using ma.apply_along_axis() for custom functions. Since I imagine most people use standard aggregation functions like ma.mean and ma.sum, this could lead to a big speed improvement. I am building a GUI application, and this would make the difference between an application that reacts essentially instantly and one that hangs slightly in many situations.
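To make the suggestion concrete, here is a rough sketch of the kind of dispatch I have in mind. It is not the scikits.timeseries code: the helper reduce_last_axis() and the whitelist _AXIS_AWARE are hypothetical names, and the set of functions treated as "standard" is just an assumption.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical whitelist: reductions assumed to accept an `axis` argument.
_AXIS_AWARE = (ma.mean, ma.sum, ma.min, ma.max)

def reduce_last_axis(func, data):
    """Reduce `data` along its last axis with `func`.

    Whitelisted functions are called once on the whole array; anything
    else falls back to the slower ma.apply_along_axis().
    """
    if any(func is known for known in _AXIS_AWARE):
        return func(data, axis=-1)
    return ma.apply_along_axis(func, -1, data)

# Four rows of three values, with one value masked.
data = ma.masked_array(np.arange(12.0).reshape(4, 3))
data[0, 1] = ma.masked

fast = reduce_last_axis(ma.mean, data)                         # fast path
slow = reduce_last_axis(lambda x: x.max() - x.min(), data)     # fallback path
```

The identity check keeps the fast path limited to functions that are known to behave the same way with axis=-1, so custom functions still get the per-row treatment they get today.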

Another possible solution is to leave scikits.timeseries unchanged and do something like the following:

Let t be a monthly time series.

t = t.convert(freq="Q")
t = ts.time_series(ma.mean(t,-1), freq="Q", start_date=t.start_date)

The downside of this is that it's more verbose, and many users may not even think of it.
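The same idea can be illustrated with plain numpy.ma, without scikits.timeseries. This sketch assumes the monthly data starts on a quarter boundary and has a length that is a multiple of 3; the reshape only stands in for the grouping that convert() does, and the variable names are made up.

```python
import numpy as np
import numpy.ma as ma

# Twelve monthly values (1.0 .. 12.0) with one missing observation.
monthly = ma.masked_array(np.arange(1.0, 13.0))
monthly[4] = ma.masked

# One row per quarter, one column per month within the quarter.
grouped = monthly.reshape(-1, 3)

# A single vectorized reduction; masked months are ignored.
quarterly = ma.mean(grouped, -1)
print(quarterly)
```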

Thanks very much.

Abiel


