[SciPy-User] scikits.timeseries question

Pierre GM pgmdevlist at gmail.com
Mon Nov 30 20:39:39 EST 2009


On Nov 30, 2009, at 8:16 PM, Christopher Barker wrote:

> nope -- not duplicated, but maybe there are missing ones. The point is 
> that I have an array of "days since", and I want array of 
> timeseries.dates (which is a DateArray, yes?)

Got it. Duplicated and/or missing dates correspond to the same problem: you can't assume that your dates are regularly spaced, so you can't use start_date and length.

>> np.array(...) + sd gives you a ndarray of Date objects (so its dtype
>> is np.object), and you use that as the input of date_array. The
>> frequency should be recognized properly.
> 
> OK -- though it seems I SHOULD be able to go straight to a DateArray, 
> and I'm still confused about what this means:

Well, that depends on the type of the starting date, actually. If it's a Date, adding an ndarray to it will give you an ndarray of Date objects. If it's a DateArray of length 1, it'll give you a DateArray. (Note to self: we could probably be a bit more consistent on this one...)


>>> In [43]: da = ts.date_array((1,2,3,4), start_date=sd)
>> 
>> Check the doc for date_array: the first argument can be
>>        * an existing :class:`DateArray` object;
>>        * a sequence of :class:`Date` objects with the same frequency;
>>        * a sequence of :class:`datetime.datetime` objects;
>>        * a sequence of dates in string format;
>>        * a sequence of integers corresponding to the representation of 
>>          :class:`Date` objects.
> 
> That's what I have: a sequence of integers corresponding to the 
> representation of the Date objects (doesn't it represent them as "units 
> since start date" where units is the "freq")?

No, not exactly: the representation of a Date object is relative to an absolute built-in reference (Day #1 being 01/01/01). (Likewise, numpy.datetime64 uses the standard 1970/01/01.)
We can't have a variable reference as it would get far too messy too quickly. Instead, you have to use the trick start_date + ndarray of integers to get what you want.
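
To make the trick concrete, here is a rough analogue in plain Python (standard library only, not scikits.timeseries itself). The starting date and the "days since" offsets are made-up values for illustration; the point is only that each offset is applied to one fixed starting date, so gaps in the offsets are no problem.

```python
from datetime import date, timedelta

# Hypothetical inputs: a fixed starting date and irregular "days since" offsets.
start = date(2001, 1, 1)
days_since = [1, 2, 3, 4, 7]  # note the gap: offsets 5 and 6 are missing

# Analogue of `start_date + ndarray of integers`:
# shift the same starting date by each offset in turn.
dates = [start + timedelta(days=n) for n in days_since]

for d in dates:
    print(d.isoformat())
```

With scikits.timeseries the same idea would be written as start_date + ndarray, and the result passed to date_array as described above.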

> If that's not what it means, then what does it mean?

If you have a 'A' frequency, that'd be a sequence like 2001, 2002, ...
For a 'M' frequency, that'd be 24001 (for 2001/01), 24002 (for 2001/02)...
For a 'D' frequency, that'd be 730486, 730487... for 2001/01/01, 2001/01/02...
In other words, the number of units since the absolute reference.
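
The numbers above can be reproduced with a small sketch in plain Python. The three helper functions are hypothetical names written for this illustration, not part of scikits.timeseries, and the formulas are simply the ones that match the values quoted here (year number for 'A', months counted from 0001/01 for 'M', days counted from 0001/01/01 for 'D').

```python
from datetime import date

def annual_repr(d):
    """Integer for an 'A' frequency: simply the year."""
    return d.year

def monthly_repr(d):
    """Integer for an 'M' frequency: months counted from 0001/01 as month #1."""
    return (d.year - 1) * 12 + d.month

def daily_repr(d):
    """Integer for a 'D' frequency: days counted from 0001/01/01 as day #1."""
    return d.toordinal()

d = date(2001, 1, 1)
print(annual_repr(d), monthly_repr(d), daily_repr(d))
```

This is why a bare sequence like (1, 2, 3, 4) is read as dates in year 1 rather than as offsets from your start_date.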
> 
> hmm -- I see this:
> 
> Definition:
>        ts_lib.mov_average(data, span, dtype=None)
> Docstring:
>     Calculates the moving average of a series.
> 
>     Parameters
>     ----------
>     data : array-like
>         Input data, as a sequence or (subclass of) ndarray.
>         Masked arrays and TimeSeries objects are also accepted.
>         The input array should be 1D or 2D at most.
>         If the input array is 2D, the function is applied on each
>         column.
> 
> I've got a 3-d array -- darn! Maybe I'll poke into it and see if it can 
> be generalized.


3D? What are your actual variables? Keep in mind that when we talk about dimensions with time series, we zap the time one, so if you have a series of maps, your array is only 2D in our terminology.
If you have a time series of (lat, lon), mov_average will average your lats independently of your lons.
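
As a minimal sketch of what "independently" means, here is a column-wise moving average in plain Python. It is not the mov_average implementation: the function name, the trailing window, and the None placeholders for the first span-1 rows are assumptions chosen for this illustration, and the (lat, lon) values are made up.

```python
def trailing_mean_by_column(rows, span):
    """Trailing moving average along the time axis, one column at a time.

    `rows` is a list of equal-length rows, one row per time step.
    The first span-1 output rows hold None because the window is not full yet.
    """
    n_cols = len(rows[0])
    out = []
    for t in range(len(rows)):
        if t < span - 1:
            out.append([None] * n_cols)
            continue
        window = rows[t - span + 1 : t + 1]
        # Each column is averaged using only its own values.
        out.append([sum(r[c] for r in window) / span for c in range(n_cols)])
    return out

# Hypothetical track: one (lat, lon) pair per time step.
track = [
    [10.0, 100.0],
    [12.0, 104.0],
    [14.0, 102.0],
    [16.0, 106.0],
]
smoothed = trailing_mean_by_column(track, span=2)
for row in smoothed:
    print(row)
```

The lat column never sees the lon values and vice versa; time runs down the rows and is not counted as a dimension.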
