[SciPy-dev] time series implementation approach
Pierre GM
pgmdevlist at gmail.com
Wed Dec 13 06:06:17 EST 2006
David,
Thanks a lot for the sources.
I guess a lot of people on this list have their own implementation of
time series...
Storing a date alongside the data does indeed make it possible to deal with
gapped series. However, things become quite messy when you have to combine
several series: how do you deal with the gaps? Your approach (from the glance
I got) assumes either a step or a linear interpolation, which should work
nicely in your case, but may well not be applicable to other cases.
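To make the two gap-filling choices concrete, here is a minimal sketch using plain NumPy. It is not taken from David's sources: the day numbers, the values, and the target axis are made up for illustration.

```python
import numpy as np

# Hypothetical gapped series: day numbers and the values observed on those days.
# Days 3 and 4 are missing.
days = np.array([0, 1, 2, 5, 6])
values = np.array([10.0, 11.0, 12.0, 18.0, 19.0])

# Common axis on which several series would be combined.
target = np.arange(0, 7)

# Linear interpolation: draw a straight line across the gap.
linear = np.interp(target, days, values)

# Step interpolation: carry the last observed value forward.
# searchsorted(..., side="right") - 1 gives the index of the latest
# observation at or before each target day.
idx = np.searchsorted(days, target, side="right") - 1
step = values[idx]
```

Both results look like complete series afterwards, so nothing downstream can tell which points were observed and which were invented. That is the part that may not carry over to other kinds of data.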
In fact, you just gave me an idea. In masked arrays, the mask can be nomask
(i.e., no masked data, surprise), which greatly simplifies most operations.
Here, we could have a frequency of nofreq, which would indicate that the time
step is not constant: one simple option is then to cast the series to the
smallest reasonable time step, and set the missing values to "masked". (By
reasonable, I mean average: if most of your data is roughly monthly except for
a couple of daily points, stick to monthly. Unless you really want to go
daily. Dammit, we're going to have to leave that possibility open.)
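A rough sketch of that casting step, using numpy.ma. The helper name to_regular is hypothetical, and it assumes the dates are already integer step numbers (day or month indices), sorted, unique, and aligned with a grid of step 1. Choosing the "reasonable" step is left out entirely.

```python
import numpy as np
import numpy.ma as ma

def to_regular(steps, values):
    """Place irregular observations on a regular unit-step grid.

    Hypothetical helper: `steps` are sorted, unique integer time indices.
    Grid points with no observation come back masked.
    """
    steps = np.asarray(steps)
    grid = np.arange(steps[0], steps[-1] + 1)
    data = np.zeros(grid.shape, dtype=float)
    mask = np.ones(grid.shape, dtype=bool)   # start with everything masked
    pos = steps - steps[0]                   # position of each observation
    data[pos] = values
    mask[pos] = False                        # unmask the observed points
    return grid, ma.array(data, mask=mask)

grid_a, a = to_regular([0, 1, 2, 5, 6], [10.0, 11.0, 12.0, 18.0, 19.0])
grid_b, b = to_regular([0, 2, 3, 4, 6], [1.0, 2.0, 3.0, 4.0, 5.0])

# Combining two series on the same grid: the result is masked wherever
# either input is masked, so no value is invented for the gaps.
total = a + b
```

Here total is only defined at steps 0, 2 and 6, the three steps where both series have data. Everything else stays masked rather than being interpolated.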
So yeah, we may eventually have to consider varying frequencies. Which would
mean that the shifting array approach will have to be reexamined. Maybe not:
if freq is nofreq, then it'll panic, for sure. But if freq is set, then it
remains quite a viable option.
OK, now I'll stop thinking aloud. David, could you tell us more about what you
need? What kind of data do you have to work with? Are missing dates
something you have to deal with on a very regular basis?