[SciPy-Dev] scipy.integrate optimisation for pandas.TimeSeries

Dražen Lučanin drazen.lucanin at gmail.com
Sat May 4 07:33:20 EDT 2013


Hi all,

I wrote a GSoC project proposal. Unfortunately I didn't manage to go through
a feedback loop and improve it based on your comments, as I had some trouble
registering for the mailing list earlier. It is up on Melange as
"SciPy: Improving Numerical Integration of Time Series" - most likely under
this link:

https://google-melange.appspot.com/gsoc/proposal/review/google/gsoc2013/kermit666/2#

My main motivation is that the current way to integrate a time series in
Python (due to Pandas using nanoseconds as its underlying representation [1]):

    integrate.simps(ts, ts.index.astype(np.int64) / 10**9)

executes with a big overhead (every index element first has to be divided to
get a representation where one integer unit equals one second) and feels
somewhat unpythonic. This notebook illustrates the performance overhead that's
troubling me:

http://nbviewer.ipython.org/5512857
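
For reference, here is a minimal, self-contained sketch of that approach (the
series is synthetic and the variable names are just illustrative):

    import numpy as np
    import pandas as pd
    from scipy import integrate

    # One day of hourly measurements indexed by timestamps.
    index = pd.date_range("2013-05-01", periods=24, freq="H")
    ts = pd.Series(np.random.rand(24), index=index)

    # The DatetimeIndex is stored as int64 nanoseconds, so dividing by
    # 10**9 turns it into seconds, which simps() accepts as the sample
    # positions along the x axis.
    seconds = ts.index.astype(np.int64) / 10**9
    area = integrate.simps(ts.values, seconds)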

I would like to explore ways for scipy to rely on basic timestamp arithmetic
directly (handled dynamically, without introducing any new dependencies),
instead of forcing the user to transform the whole data domain.
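
To make the idea more concrete, here is a rough sketch of the kind of dynamic
handling I have in mind - a hypothetical wrapper, not an existing scipy
function: if the sample positions look like datetime64 values, they get
converted to seconds internally before calling the existing routine.

    import numpy as np
    from scipy import integrate

    def simps_time_aware(y, x):
        """Hypothetical helper: integrate.simps that accepts datetime64 x."""
        x = np.asarray(x)
        if np.issubdtype(x.dtype, np.datetime64):
            # datetime64[ns] values are int64 nanoseconds under the hood,
            # so dividing by 10**9 gives positions in seconds.
            x = x.astype("datetime64[ns]").astype(np.int64) / 10**9
        return integrate.simps(np.asarray(y), x)

Used as simps_time_aware(ts, ts.index), this gives the same result as the
manual conversion above, just without the user having to know about the
nanosecond representation.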

If there is any time left after this, the usability of scipy.integrate for
time series integration could be further improved by adding some new
features to Pandas too [2].

Is there perhaps anyone willing to mentor such work?

Regards,
Dražen Lučanin

[1]:
http://stackoverflow.com/questions/15203623/convert-pandas-datetimeindex-to-unix-time
[2]: https://github.com/pydata/pandas/issues/2704