Programmatically finding "significant" data points
robert
no-spam at no-spam-no-spam.invalid
Sun Nov 19 06:27:22 EST 2006
erikcw wrote:
> Hi all,
>
> I have a collection of ordered numerical data in a list. The numbers
> when plotted on a line chart make a low-high-low-high-high-low (random)
> pattern. I need an algorithm to extract the "significant" high and low
> points from this data.
>
> Here is some sample data:
> data = [0.10, 0.50, 0.60, 0.40, 0.39, 0.50, 1.00, 0.80, 0.60, 1.20,
> 1.10, 1.30, 1.40, 1.50, 1.05, 1.20, 0.90, 0.70, 0.80, 0.40, 0.45, 0.35,
> 0.10]
>
> In this data, some of the significant points include:
> data[0]
> data[2]
> data[4]
> data[6]
> data[8]
> data[9]
> data[13]
> data[14]
> ....
>
> How do I sort through this data and pull out these points of
> significance?
It's obviously a kind of time series, and you are searching for the points where data[t] equals moving_max(data, t, window) or moving_min(data, t, window): an extremum within a certain (time) window. And obviously your time window is as small as 2 or 3 or so.
Unfortunately, a moving_max function is not yet in numpy and is probably not achievable with the other existing array functions, so you will have to write (slow) looping code.
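A minimal sketch of such looping code, in plain Python, might look like the following. The name window_extrema and its half_width parameter are made up for illustration (they are not numpy functions), and the window is assumed to be centered on t and clipped at the ends of the list. A point counts as a high if it equals the maximum of its window and as a low if it equals the minimum; flat windows are skipped.

```python
def window_extrema(data, half_width=1):
    """Return (maxima, minima): indices that are extreme within their window.

    half_width is a hypothetical parameter: the window around index t
    covers data[t - half_width : t + half_width + 1], clipped at the ends.
    """
    maxima, minima = [], []
    n = len(data)
    for t in range(n):
        lo = max(0, t - half_width)
        hi = min(n, t + half_width + 1)
        window = data[lo:hi]
        w_max, w_min = max(window), min(window)
        if w_max == w_min:
            continue  # flat window: neither a high nor a low
        if data[t] == w_max:
            maxima.append(t)
        elif data[t] == w_min:
            minima.append(t)
    return maxima, minima


data = [0.10, 0.50, 0.60, 0.40, 0.39, 0.50, 1.00, 0.80, 0.60, 1.20,
        1.10, 1.30, 1.40, 1.50, 1.05, 1.20, 0.90, 0.70, 0.80, 0.40,
        0.45, 0.35, 0.10]

maxima, minima = window_extrema(data, half_width=1)
print(maxima)  # [2, 6, 9, 13, 15, 18, 20]
print(minima)  # [0, 4, 8, 10, 14, 17, 19, 22]
```

With half_width=1 this also reports data[10], which is not in your list, so your idea of "significant" probably needs a wider window or an extra threshold on the size of the swing. A larger half_width gives fewer, coarser extrema.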
Robert