[SciPy-User] Comparing variable time-shifted two measurements

Thu Nov 5 23:48:21 EST 2009

2009/11/5 Gökhan Sever <gokhansever at gmail.com>:
> Hello,
>
> I have two aircraft based aerosol measurements. The first one is dccnConSTP
> (blue), and the latter is CPCConc (red) as shown in this screen capture.
> (http://img513.imageshack.us/img513/7498/ccncpclag.png). My goal is to
> compare these two measurements. They are expected to be positively
> correlated throughout the flight. However, the instrument that gives
> CPCConc was experiencing a sampling issue, so its measurements are
> shifted in time by a varying amount with respect to the first instrument.
> (In the first box the shift is about 20 seconds, and in the second about
> 24 seconds, before the dccnConSTP measurement shows up.) In other words,
> at different altitude levels the two measurements are offset in time by
> different amounts, as seen in their shapes. So my goal turns to addressing
> this variable shift before I start doing the comparisons.
>
> Is there a known automated approach to correct this varying-lag issue?
> If so, how?

There are several tools you can use, depending on exactly what the problem is.

If the problem is that there's a constant lag for each data set but
you don't know what it is, then you can use the cross-correlation to
fit for the lag: if you take the cross-correlation of two vectors, the
highest peak in the resulting vector sits at the lag where the two
vectors are most similar. Cross-correlations can be calculated rapidly
using FFTs.
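
A minimal sketch of the constant-lag case, assuming both series are
evenly sampled at the same rate. The function name estimate_lag and the
synthetic data are illustrative only and are not taken from the original
measurements. It uses scipy.signal.correlate with method="fft" and reads
the lag off the position of the highest peak.

```python
import numpy as np
from scipy import signal

def estimate_lag(x, y, dt=1.0):
    """Estimate how far y is delayed relative to x (positive: y lags x).

    Hypothetical helper: assumes x and y are evenly sampled with spacing dt.
    """
    # Remove the means so the peak reflects shape, not a common offset.
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    # Full cross-correlation, computed with FFTs.
    corr = signal.correlate(y, x, mode="full", method="fft")
    # Lag (in samples) belonging to each element of corr.
    lags = np.arange(-(len(x) - 1), len(y))
    return lags[np.argmax(corr)] * dt

# Synthetic check: y is x delayed by 20 samples, so the
# estimate should come out at a lag of 20 samples.
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
y = np.roll(x, 20)
print(estimate_lag(x, y))
```

With real, smoothly varying data the peak is usually broader than in
this white-noise example, so it is worth plotting the correlation
vector rather than trusting argmax blindly.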

If the lag isn't constant over a data set, you can try using
correlations to find the lag at several points in the data set and
interpolate to get the lag as a function of time (but be careful -
depending on what caused the lag, a steadily-drifting model isn't
necessarily appropriate; maybe you'll have periods of constant offset
separated by jumps).
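
One way the windowed approach could look, as a sketch only: split both
series into segments, find the cross-correlation peak in each, and keep
the segment centers so the lags can be interpolated afterwards. The name
windowed_lags and its parameters (window, step, max_lag, all in samples)
are assumptions made for this example, and even sampling is assumed.

```python
import numpy as np
from scipy import signal

def windowed_lags(x, y, window, step, max_lag):
    """Estimate the lag of y relative to x in successive windows.

    Hypothetical helper. Returns (centers, lags), both in samples.
    Requires max_lag < window.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    all_lags = np.arange(-(window - 1), window)
    keep = np.abs(all_lags) <= max_lag      # ignore implausibly large lags
    kept_lags = all_lags[keep]
    centers, lags = [], []
    for start in range(0, len(x) - window + 1, step):
        xs = x[start:start + window]
        ys = y[start:start + window]
        xs = xs - xs.mean()
        ys = ys - ys.mean()
        corr = signal.correlate(ys, xs, mode="full", method="fft")
        centers.append(start + window // 2)
        lags.append(kept_lags[np.argmax(corr[keep])])
    return np.array(centers), np.array(lags)

# Possible follow-up, valid only if a smoothly drifting lag is plausible:
# lag_of_n = np.interp(np.arange(len(x)), centers, lags)
```

If the per-window lags show plateaus separated by jumps rather than a
smooth drift, a piecewise-constant lag is probably a better model than
the linear interpolation in the last comment.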

If you know the lag, but it isn't constant and you're not sure how to
resample your data set to remove the lag, look at scipy.ndimage. Its
interpolation routines (map_coordinates, for instance) resample an
array at arbitrary fractional positions, which should be what you want.
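
As a sketch of the resampling step, scipy.ndimage.map_coordinates is one
ndimage routine that fits here; whether it is the best choice for a
given data set is a judgment call. The helper name remove_lag is made up
for this example, the lag is assumed to be given in samples (a scalar or
one value per sample), and linear interpolation (order=1) is an
arbitrary choice.

```python
import numpy as np
from scipy import ndimage

def remove_lag(y, lag_samples):
    """Resample y so that a known delay (in samples) is removed.

    Hypothetical helper: output[n] is y evaluated at n + lag_samples[n].
    lag_samples may be a scalar or an array with the same length as y.
    """
    y = np.asarray(y, dtype=float)
    lag = np.broadcast_to(np.asarray(lag_samples, dtype=float), y.shape)
    # Fractional positions in y to read from.
    coords = np.arange(len(y)) + lag
    # order=1 is linear interpolation; mode="nearest" repeats the edge
    # values where the shifted position falls outside the data.
    return ndimage.map_coordinates(y, [coords], order=1, mode="nearest")
```

Samples near the ends, where the shifted position falls outside the
record, are filled with the edge value and should be discarded before
any comparison.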

If your data sets are unevenly sampled, so that you can't use simple
correlations, I'm not sure quite what to suggest, except perhaps
interpolating them to evenly-spaced samples and then running the
correlation. For this try scipy.interpolate.
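
A possible sketch of that interpolation step, using
scipy.interpolate.interp1d with linear interpolation. The function name
to_common_grid and the choice of grid spacing dt are assumptions for
illustration; the grid is restricted to the time range covered by both
instruments so that nothing is extrapolated.

```python
import numpy as np
from scipy import interpolate

def to_common_grid(t1, v1, t2, v2, dt):
    """Interpolate two unevenly sampled series onto one even time grid.

    Hypothetical helper. Returns (grid, v1_on_grid, v2_on_grid).
    """
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    # Use only the interval covered by both series.
    start = max(t1.min(), t2.min())
    stop = min(t1.max(), t2.max())
    n = int(np.floor((stop - start) / dt)) + 1
    # Guard against floating-point overshoot past the last common time.
    grid = np.minimum(start + dt * np.arange(n), stop)
    f1 = interpolate.interp1d(t1, v1, kind="linear")
    f2 = interpolate.interp1d(t2, v2, kind="linear")
    return grid, f1(grid), f2(grid)
```

The grid spacing should not be much finer than the typical sampling
interval of the data, since interpolation cannot add information and
linear interpolation smooths the series slightly.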

If you do end up fitting for the lag, keep in mind that you'll have
adjusted the lags to make the time series as similar as possible, so
that there's a risk of overestimating their similarities. But the only
way around that problem is to know the lags from some independent
source.
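
The size of this effect can be illustrated with synthetic data. In the
sketch below, two series that are independent by construction are
compared at zero lag and at the best of 61 trial lags; the helper
corr_at_lag, the series length, and the lag range are all made up for
the illustration.

```python
import numpy as np

def corr_at_lag(x, y, lag):
    """Pearson correlation between x[n] and y[n + lag] over the overlap.

    Hypothetical helper: assumes x and y have the same length.
    """
    if lag >= 0:
        a, b = x[:len(x) - lag], y[lag:]
    else:
        a, b = x[-lag:], y[:len(y) + lag]
    return np.corrcoef(a, b)[0, 1]

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = rng.standard_normal(200)    # independent of x by construction

r_zero = corr_at_lag(x, y, 0)
r_best = max(corr_at_lag(x, y, k) for k in range(-30, 31))
print(f"zero lag: {r_zero:+.3f}, best of 61 lags: {r_best:+.3f}")
```

Because the best lag is picked to maximize the correlation, r_best can
never be smaller than r_zero, even though the two series have nothing
to do with each other. Any similarity measured after fitting the lag
carries the same upward bias.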

Anne

> Thank you.
>
> --
> Gökhan