[SciPy-user] Performance problem/suggestion for scikits.timeseries.convert

Pierre GM pgmdevlist at gmail.com
Tue Apr 7 12:38:54 EDT 2009


Abiel,
Thanks a lot for the report.
There is probably room to improve ma.apply_along_axis, but I'm
afraid I won't be able to work on it any time soon. I'll try to find
a substitute for apply_along_axis in the meantime, and we'll keep you
posted.
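
For reference, the gap Abiel describes can be reproduced with plain numpy.ma, with no timeseries involved. This is only a rough sketch: the array is smaller than the one in the report so that it runs quickly, the variable names t_axis and t_apply are just for illustration, and the timings will vary from machine to machine.

```python
import timeit

import numpy as np
import numpy.ma as ma

# Smaller than the array in the original report, so the slow path stays quick.
a = np.arange(30000).reshape(3000, 10)

# Reduction along the last axis in a single call.
b = ma.mean(a, -1)

# Same result, but ma.mean is called once per row from Python.
c = ma.apply_along_axis(ma.mean, -1, a)

# Both approaches give one mean per row.
assert b.shape == c.shape == (3000,)
assert ma.allclose(b, c)

# Rough timings; absolute numbers depend on the machine and numpy version.
t_axis = timeit.timeit(lambda: ma.mean(a, -1), number=3)
t_apply = timeit.timeit(lambda: ma.apply_along_axis(ma.mean, -1, a), number=3)
print("ma.mean(a, -1):        %.4f s" % t_axis)
print("ma.apply_along_axis:   %.4f s" % t_apply)
```

The per-row Python call is what dominates in the second version, which is why passing the axis directly to the reduction is so much cheaper.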

Personally, I rarely use the func parameter of convert, preferring to
work directly with the 2D series that a basic .convert(.., func=None)
outputs. That's quite useful when you have several operations to perform
on the same series (e.g., getting the mean and the standard deviation),
as you convert the series only once.
Note that in your example, you shouldn't have to recreate a series:

qt = t.convert(freq="Q")
mqt = qt.mean(-1)
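
The same "convert once, reduce several times" idea can be sketched with numpy.ma alone. Here a plain reshape into rows of three months is an assumed stand-in for the 2D series that t.convert(freq="Q") returns (the real conversion aligns on calendar quarters and keeps the dates); the names monthly and quarterly_2d are made up for the illustration.

```python
import numpy as np
import numpy.ma as ma

# Twelve monthly values with one missing observation (masked).
monthly = ma.masked_invalid(
    [1.0, 2.0, 3.0, 4.0, np.nan, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
)

# Stand-in for the 2D output of convert: one row per quarter,
# one column per month within the quarter.
quarterly_2d = monthly.reshape(-1, 3)

# Several statistics from the same 2D array, each along the last axis.
# Masked values are skipped, so the second quarter uses only 4.0 and 6.0.
q_mean = quarterly_2d.mean(-1)
q_std = quarterly_2d.std(-1)
q_max = quarterly_2d.max(-1)

print(q_mean)
print(q_std)
print(q_max)
```

Each statistic costs one vectorized call on the already-converted data, rather than one full conversion per statistic.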




Mmh. We really need to get these "Examples/FAQ" sections in the doc...

Cheers and thx again
P.



On Apr 7, 2009, at 12:17 PM, Abiel X Reinhart wrote:

> I have recently begun working with the useful scikits.timeseries  
> package, and noticed some performance issues in the ts.convert()  
> function. For example, when converting 1000 monthly values to a  
> quarterly frequency using the ma.mean() function, it took me about  
> 0.6 seconds. This isn't that bad, but it definitely can become an  
> issue when working with many series or longer timespans.
>
> After looking at the scikits.timeseries source code, I found that
> essentially all the delay was coming from the ma.apply_along_axis()
> call inside the _convert1d() function. I am not that familiar with the
> numpy functions, but it seems that ma.apply_along_axis can be
> rather slow. For instance, consider the following code:
>
> a = np.arange(300000).reshape(30000,10)
> b = ma.mean(a,-1)
> c = ma.apply_along_axis(ma.mean, -1, a)
>
> In this example, b and c are equal, but b is generated much more
> quickly. My system was always able to generate b in less than 0.02
> seconds, but took about 4.3 seconds to generate c.
>
> Perhaps an improvement could be made to the convert() function by  
> recognizing a standard set of built-in numpy functions like ma.mean  
> and applying the method used to generate "b" above, and only using  
> ma.apply_along_axis() for custom functions. Since I imagine most  
> people use standard aggregation functions like ma.mean and ma.sum,  
> this could lead to a big speed improvement. I am building a GUI  
> application, and this would make the difference between an  
> application that reacts essentially instantly and one that hangs  
> slightly in many situations.
>
> One other possible solution seems to be to leave scikits.timeseries
> unchanged, and do something like the following:
>
> Let t be a monthly time series.
>
> t = t.convert(freq="Q")
> t = ts.time_series(ma.mean(t,-1), freq="Q", start_date=t.start_date)
>
> The downside of this is that it's more verbose, and many users may
> not even think of it.
>
> Thanks very much.
>
> Abiel
