[SciPy-user] creating timeseries for non-conventional custom frequencies

Pierre GM pgmdevlist at gmail.com
Thu Apr 3 20:17:47 EDT 2008


Marco,

Letting users define their own frequency is on our to-do list, but not very 
high on it: the main problem is that it would require some significant C 
hacking, which I'm not prepared to do (and I know that Matt is swamped as 
well). 

However, in most cases you can work around this, because your data don't 
have to be regularly spaced in time. For example:
>>> import numpy
>>> import numpy.ma as ma
>>> import scikits.timeseries as ts
>>> series = ts.time_series(numpy.arange(12), start_date=ts.now('M'))
>>> newseries = series[::2]
Here newseries still has a monthly frequency, but with gaps.

If you have both the dates and the corresponding data, you can create 
your series with time_series(data, dates=dates), regardless of any 
gaps. If you don't have the dates but only the starting point, you can 
create a temporary list of regularly spaced dates and then keep only the 
dates you want. For example, a series with the same every-other-month 
spacing as the newseries of the previous example could be created with:
>>> ts.time_series(numpy.arange(12),
...     dates=ts.date_array(freq='M', start_date=ts.now('M'), length=24)[::2])
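
If it helps to see the "regular grid, then subsample" trick on its own, here 
is a minimal sketch in plain NumPy, with datetime64 standing in for 
date_array. It is not scikits.timeseries code; the start month and the 
variable names are made up for the illustration.

```python
import numpy as np

# Regular monthly grid of 24 dates; the start month is an arbitrary assumption.
grid = np.arange('2008-04', '2010-04', dtype='datetime64[M]')

# Keep every other month: 12 dates, spaced two months apart.
dates = grid[::2]
data = np.arange(12)

# Each value is paired with its own date, so the gaps are explicit.
pairs = list(zip(dates.astype(str), data))
```

The point is only that the dates travel with the data, so nothing forces 
them to be contiguous.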



> I have data that has been recorded from a data logger. Due to memory
> constraints, the logger has been set to only save the observations
> on a 5-minute basis (1 data point every 5 minutes).
> How do I create an hourly data set / timeseries from such data?

Import your data at a minute frequency, fill it (with fill_missing_dates), 
then convert it to an hourly frequency. You'll have to decide how to 
aggregate your 5-min data: sum per hour? Average per hour?
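
To make the sum-versus-average choice concrete, here is a rough sketch with 
plain numpy.ma rather than the timeseries conversion itself. The readings 
are invented, and the reshape assumes exactly 12 readings per hour.

```python
import numpy as np
import numpy.ma as ma

# Three hours of made-up 5-minute readings: 12 readings per hour.
readings = ma.array(np.arange(36.0))
readings[5] = ma.masked              # pretend one reading was lost

# One row per hour, one column per 5-minute slot.
per_hour = readings.reshape(3, 12)

hourly_sum = per_hour.sum(axis=1)    # totals, e.g. for rainfall
hourly_mean = per_hour.mean(axis=1)  # averages, e.g. for temperature
```

Masked readings are simply left out of both reductions, so a lost value 
lowers the hourly sum but does not bias the hourly mean toward zero.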


> To give two more examples:
> * Another data set I have has only one data point every 6 hours (NCAR
> reanalysis data).
> How do I convert such data into a normal frequency such as daily or
> monthly? 

Same thing: import it with an hour frequency, fill it and convert it to the 
new frequency.
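
As an illustration of what "fill, then convert" amounts to, here is a small 
sketch with plain numpy.ma. It only mimics the idea and does not call 
fill_missing_dates; the observation values and array names are invented.

```python
import numpy as np
import numpy.ma as ma

# Made-up 6-hourly observations over two days, indexed by hour offset.
obs_hours = np.arange(0, 48, 6)            # 0, 6, 12, ..., 42
obs_values = np.array([1., 2., 3., 4., 5., 6., 7., 8.])

# "Fill": an hourly grid where every hour starts out masked (missing).
hourly = ma.masked_all(48, dtype=float)
hourly[obs_hours] = obs_values             # unmask only the observed hours

# "Convert": group into days of 24 hours and average what is present.
daily_mean = hourly.reshape(2, 24).mean(axis=1)
```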

> * A restaurant records statistics about its guests. The place is 
> closed every Monday. So there will not be any attendance numbers for
> Monday. If I use the daily frequency the timeseries will mess up.
> They'd not count the Monday as "empty."

Mmh, could you use a daily frequency and mask every Monday?
>>> series[series.weekday == 0] = ma.masked
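
Here is roughly what that looks like with plain numpy.ma and the standard 
datetime module, in case you want to try the masking logic without a 
TimeSeries. The guest counts and the start date are invented.

```python
import datetime
import numpy as np
import numpy.ma as ma

# Two weeks of made-up daily guest counts, starting on a Monday.
start = datetime.date(2008, 3, 31)
days = [start + datetime.timedelta(days=i) for i in range(14)]
guests = ma.array(np.arange(14, dtype=float) + 20)

# date.weekday() returns 0 for Monday, the same convention as above.
is_monday = np.array([d.weekday() == 0 for d in days])
guests[is_monday] = ma.masked

mean_open_days = guests.mean()             # masked Mondays are ignored
```

The Mondays stay in the series as placeholders, but they no longer count 
toward statistics such as the mean.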

> Basically, I am looking for a way to create my time series object with data
> that is incomplete on purpose, irregular or of a non-conventional
> frequency. Something like the business day frequency with the difference
> that the gaps are different.

Once again, because TimeSeries objects are MaskedArrays with an extra array 
of dates attached, you should be able to achieve what you want by masking 
the data you want to exclude.


