[SciPy-User] Status of TimeSeries SciKit

Wed Jul 27 13:31:24 EDT 2011

On Wed, Jul 27, 2011 at 1:27 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
> On Wed, Jul 27, 2011 at 10:16 AM, Wes McKinney <wesmckinn at gmail.com> wrote:
>> On Wed, Jul 27, 2011 at 12:28 PM, Andreas <lists at hilboll.de> wrote:
>
>>> * Enable rolling means for sparse data. For example, if I have irregular
>>> (in time) measurements, say, every one to six days, I would still like
>>> to be able to calculate a rolling n-day-average. Missing values should
>>> be ignored (speaking numpy: timeslice.compressed().mean())
>>
>> Either pandas or bottleneck will do this for you, so you can say something like:
>>
>> rolling_mean(ts, window=50, min_periods=5)
>>
>> and any sample with at least 5 data points in the window will compute
>> a value; missing (NaN) data will be excluded. Bottleneck has move_mean
>> and move_nanmean, which will outperform pandas.rolling_mean a little
>> bit since the Cython code is more specialized.
>
> Another use case is when your data is irregularly spaced in time but
> you still want a moving min/mean/median/whatever over a fixed time
> window instead of a fixed number of data points. That might be
> Andreas's use case.

True. In pandas parlance I think what you would do is:

rolling_mean(ts.valid(), window).reindex(ts.index, method='ffill')
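
For anyone reading this in the archive with a recent pandas: the top-level
rolling_mean function and Series.valid() have since been removed, so the line
above will not run as written. Below is a rough sketch of what I believe the
equivalent looks like with the Series.rolling API, plus a window defined by
elapsed time rather than by number of points, which is closer to the use case
Keith describes. The sample data, the window sizes, and the variable names are
made up for illustration and assume a sorted DatetimeIndex.

```python
import numpy as np
import pandas as pd

# Irregularly spaced observations with one missing value (made-up data).
idx = pd.to_datetime(["2011-07-01", "2011-07-02", "2011-07-05",
                      "2011-07-06", "2011-07-11"])
ts = pd.Series([1.0, np.nan, 3.0, 5.0, 7.0], index=idx)

# Count-based window: drop the NaNs, average each pair of consecutive
# valid points, then align back to the original index. method="ffill"
# carries the last computed value forward onto dates that were dropped.
count_based = (
    ts.dropna()
      .rolling(window=2)
      .mean()
      .reindex(ts.index, method="ffill")
)

# Time-based window: average whatever valid observations fall within the
# trailing 3 calendar days of each timestamp. NaNs are skipped, and
# min_periods=1 means a single valid point is enough to produce a value.
time_based = ts.rolling("3D", min_periods=1).mean()

print(count_based)
print(time_based)
```

The two results differ where the gaps are large: the count-based version
always averages two valid points no matter how far apart they are, while the
time-based version only uses points that are actually inside the 3-day span.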