[SciPy-user] Performance problem/suggestion for scikits.timeseries.convert
Pierre GM
pgmdevlist at gmail.com
Tue Apr 7 12:38:54 EDT 2009
Abiel,
Thanks a lot for the report.
There is probably room to improve ma.apply_along_axis, but I'm
afraid I won't be able to work on it any time soon. In the meantime,
I'll try to find a substitute for apply_along_axis. We'll keep you
posted.
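As a rough illustration of what such a substitute could look like, here is a minimal sketch that calls reductions known to accept an axis argument directly and falls back to ma.apply_along_axis for everything else. The names FAST_REDUCTIONS and reduce_last_axis are hypothetical and are not part of scikits.timeseries or numpy.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical list of reductions that take an `axis` argument themselves,
# so they can work on the whole 2D array in one vectorized call.
FAST_REDUCTIONS = (ma.mean, ma.sum, ma.std, ma.var)


def reduce_last_axis(a, func):
    """Reduce `a` along its last axis with `func` (illustrative helper)."""
    if any(func is f for f in FAST_REDUCTIONS):
        # Fast path: a single call on the whole array.
        return func(a, axis=-1)
    # Slow path: one Python-level call per row, as convert does now.
    return ma.apply_along_axis(func, -1, a)


# 4 rows of 3 values, with one masked entry in the first row.
a = ma.masked_array(np.arange(12.0).reshape(4, 3))
a[0, 0] = ma.masked

fast = reduce_last_axis(a, ma.mean)
slow = ma.apply_along_axis(ma.mean, -1, a)
spread = reduce_last_axis(a, lambda row: row.max() - row.min())
```

Both paths give the same values here. Only the number of Python-level function calls differs, which is where the time reported in the quoted message appears to go.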
Personally, I rarely use the func parameter of convert. I prefer to
work directly with the 2D series that a basic .convert(.., func=None)
returns. That's quite useful when you have several operations to perform
on the same series (e.g., getting the mean and the standard deviation),
as you convert the series only once.
Note that in your example, you shouldn't have to recreate a series:
qt = t.convert(freq="Q")
mqt = qt.mean(-1)
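To make the "convert once, reduce several times" idea concrete without requiring scikits.timeseries, here is a small sketch in plain numpy.ma. The reshape is only a stand-in for the 2D output of convert: it assumes the monthly data starts on a quarter boundary and covers whole quarters, which the real convert does not require.

```python
import numpy as np
import numpy.ma as ma

# Twelve monthly values, with one missing observation (May).
monthly = ma.masked_array(np.arange(1.0, 13.0))
monthly[4] = ma.masked

# Stand-in for t.convert(freq="Q") with func=None:
# one row per quarter, one column per month in the quarter.
by_quarter = monthly.reshape(-1, 3)

# Several statistics from the same 2D array, each in one vectorized call.
q_mean = by_quarter.mean(axis=-1)
q_std = by_quarter.std(axis=-1)
q_count = by_quarter.count(axis=-1)  # valid months per quarter
```

The masked value is simply skipped, so the second quarter's mean and standard deviation are computed from its two valid months.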
Mmh. We really need to get these "Examples/FAQ" sections in the doc...
Cheers and thx again
P.
On Apr 7, 2009, at 12:17 PM, Abiel X Reinhart wrote:
> I have recently begun working with the useful scikits.timeseries
> package, and noticed some performance issues in the ts.convert()
> function. For example, when converting 1000 monthly values to a
> quarterly frequency using the ma.mean() function, it took me about
> 0.6 seconds. This isn't that bad, but it definitely can become an
> issue when working with many series or longer timespans.
>
> After looking at the scikits.timeseries source code, I found
> essentially all the delay was coming from the ma.apply_along_axis()
> call inside _convert1d() function. I am not that familiar with the
> numpy functions, but it seems that ma.apply_along_axis can be
> rather slow. For instance, consider the following code:
>
> import numpy as np
> import numpy.ma as ma
>
> a = np.arange(300000).reshape(30000,10)
> b = ma.mean(a,-1)
> c = ma.apply_along_axis(ma.mean, -1, a)
>
> In this example, b and c hold the same values, but b is generated
> much more quickly. My system always generated b in less than 0.02
> seconds, but took about 4.3 seconds to generate c.
>
> Perhaps an improvement could be made to the convert() function by
> recognizing a standard set of built-in numpy functions like ma.mean
> and applying the method used to generate "b" above, and only using
> ma.apply_along_axis() for custom functions. Since I imagine most
> people use standard aggregation functions like ma.mean and ma.sum,
> this could lead to a big speed improvement. I am building a GUI
> application, and this would make the difference between an
> application that reacts essentially instantly and one that hangs
> slightly in many situations.
>
> Another possible solution is to leave scikits.timeseries
> unchanged and do something like the following:
>
> Let t be a monthly time series.
>
> t = t.convert(freq="Q")
> t = ts.time_series(ma.mean(t,-1), freq="Q", start_date=t.start_date)
>
> The downside is that it's more verbose, and many users may
> not even think of it.
>
> Thanks very much.
>
> Abiel