[SciPy-User] Status of TimeSeries SciKit

Andreas lists at hilboll.de
Wed Jul 27 12:28:35 EDT 2011


While we're at it:

> Frequency conversion flexibility:
>     - when going from a higher frequency to lower frequency (eg. daily to
>       monthly), the timeseries module adds an extra dimension and groups the
>       points so you still have all the original data rather than discarding
>       data
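
For anyone who hasn't seen this behaviour: below is a rough sketch, in
plain numpy, of what I understand the grouping to amount to. It is not
scikits.timeseries code. The helper name group_daily_by_month is made up
for illustration, and it assumes a non-empty daily series without gaps.
Each month becomes one row of 31 slots, and unused slots stay masked.

```python
import datetime as dt

import numpy as np
import numpy.ma as ma


def group_daily_by_month(start, values):
    """Group a gap-free daily series into a (n_months, 31) masked array.

    start  : datetime.date of the first observation
    values : non-empty 1-d sequence, one value per consecutive day
    (Hypothetical helper, not part of scikits.timeseries.)
    """
    values = np.asarray(values, dtype=float)
    dates = [start + dt.timedelta(days=i) for i in range(len(values))]
    # Row index: number of months elapsed since the first month.
    month_idx = np.array(
        [(d.year - start.year) * 12 + (d.month - start.month) for d in dates]
    )
    # Column index: day of month, zero-based.
    day_idx = np.array([d.day - 1 for d in dates])
    # Start fully masked; assigning a value unmasks that slot.
    grouped = ma.masked_all((month_idx[-1] + 1, 31))
    grouped[month_idx, day_idx] = values
    return grouped
```

Since no data is thrown away, a monthly mean is then just
grouped.mean(axis=1), and any other reduction works the same way.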

I'm using scikits.timeseries for analysis of atmospheric measurements.
I've always wanted several things, and now that discussion is under way,
maybe it's a good time to point them out:

* When plotting a series, have the option of placing each value at the
center of its period. What I mean is, when I have monthly data and plot
one year, each value should be drawn at the middle of the corresponding
month, e.g. Jan 16, etc. Otherwise, it's not obvious to the reader
whether the value drawn at July 1 is actually that for June or that for
July.

* Have full support for n-dimensional series. When I have an n-d array of
data values for each point in time (n>0), many things don't work. The
biggest problem here is that pickling *seems* to work (a file is
created), but when I load the file again, the entries in the array are
somehow scrambled (as if transposed).

* Enable rolling means for sparse data. For example, if I have irregular
(in time) measurements, say, every one to six days, I would still like
to be able to calculate a rolling n-day average. Missing values should
be ignored (in numpy terms: timeslice.compressed().mean()).
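
To make the rolling n-day average wish concrete, here is a minimal
sketch in plain numpy. It is only an illustration, not an existing
scikits.timeseries or pandas function. The name rolling_nday_mean and
the choice of a trailing window (t - window, t] are my assumptions, and
the observation times are taken to be sorted and given in days.

```python
import numpy as np
import numpy.ma as ma


def rolling_nday_mean(days, values, window):
    """Trailing rolling mean over `window` days for irregular data.

    days   : sorted 1-d array of observation times, in days
    values : 1-d (masked) array of measurements, same length as days
    Returns a masked array; an entry stays masked when its window
    contains no valid data. (Hypothetical helper for illustration.)
    """
    days = np.asarray(days, dtype=float)
    # Treat NaNs as missing, in addition to any existing mask.
    values = ma.masked_invalid(ma.asarray(values, dtype=float))
    out = ma.masked_all(len(days))
    # For each time t, index of the first observation later than t - window.
    starts = np.searchsorted(days, days - window, side="right")
    for i, start in enumerate(starts):
        # Drop masked entries before averaging.
        timeslice = values[start:i + 1].compressed()
        if timeslice.size:
            out[i] = timeslice.mean()
    return out


days = [0, 1, 4, 6, 11]
values = ma.masked_values([1.0, 2.0, -999.0, 4.0, 5.0], -999.0)
result = rolling_nday_mean(days, values, window=5)
```

The window is defined in time rather than in number of points, so it
does not matter how many measurements happen to fall inside it.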

I don't know if any of this is already implemented in pandas, as I've
never used it. But perhaps someone would be interested in working on
these issues ...

Cheers,
Andreas.


