Programmatically finding "significant" data points

Tue Nov 14 09:21:37 EST 2006

erikcw wrote:

> Hi all,
> 
> I have a collection of ordered numerical data in a list.  The numbers
> when plotted on a line chart make a low-high-low-high-high-low (random)
> pattern.  I need an algorithm to extract the "significant" high and low
> points from this data.
> 
> Here is some sample data:
> data = [0.10, 0.50, 0.60, 0.40, 0.39, 0.50, 1.00, 0.80, 0.60, 1.20,
> 1.10, 1.30, 1.40, 1.50, 1.05, 1.20, 0.90, 0.70, 0.80, 0.40, 0.45, 0.35,
> 0.10]
> 
> In this data, some of the significant points include:
> data[0]
> data[2]
> data[4]
> data[6]
> data[8]
> data[9]
> data[13]
> data[14]
> ....
> 
> How do I sort through this data and pull out these points of
> significance?

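I think you are looking for local "extrema", that is, points that are
lower (minima) or higher (maxima) than both of their neighbours. The
generator below slides a three-item window over the data, and the loop
classifies the middle item of each window: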

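def w3(items):
    # Yield every run of three consecutive items as a tuple.
    items = iter(items)
    # Prime the window with a placeholder and the first two items;
    # the placeholder is shifted out before the first yield.
    view = None, items.next(), items.next()
    for item in items:
        view = view[1:] + (item,)
        yield view

# b is the middle item of each window; its index in data is i+1.
# The first and last items never appear as b, so they are not classified.
for i, (a, b, c) in enumerate(w3(data)):
    if a > b < c:
        print i+1, "min", b
    elif a < b > c:
        print i+1, "max", b
    else:
        print i+1, "---", b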
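
For readers on Python 3, where the print statement and items.next() no
longer work, the same idea can be sketched as follows. This is only an
illustration under a few assumptions: the function name local_extrema is
made up here, the input is assumed to be a list (so it can be sliced),
and the results are collected in a list instead of being printed as they
are found. Like the code above, it reports every local extremum and does
not try to decide which ones are "significant".

```python
def local_extrema(values):
    """Return (index, kind, value) for each interior local min or max.

    Hypothetical helper name; assumes `values` is a list-like sequence.
    """
    found = []
    # Zip three staggered slices to get (previous, current, next) triples.
    # start=1 makes i the index of the middle item in `values`.
    triples = zip(values, values[1:], values[2:])
    for i, (a, b, c) in enumerate(triples, start=1):
        if a > b < c:
            found.append((i, "min", b))
        elif a < b > c:
            found.append((i, "max", b))
    return found


# Sample data from the original question.
data = [0.10, 0.50, 0.60, 0.40, 0.39, 0.50, 1.00, 0.80, 0.60, 1.20,
        1.10, 1.30, 1.40, 1.50, 1.05, 1.20, 0.90, 0.70, 0.80, 0.40,
        0.45, 0.35, 0.10]

for index, kind, value in local_extrema(data):
    print(index, kind, value)
```

As with the original loop, the first and last items are never classified,
and a flat run of equal neighbours is not reported as an extremum because
the comparisons are strict.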

Peter