How to program efficient pattern searches in a list of float numbers?

Bengt Richter bokr at oz.net
Mon Sep 19 20:22:09 EDT 2005


On Mon, 19 Sep 2005 22:58:40 GMT, bokr at oz.net (Bengt Richter) wrote:

>On 19 Sep 2005 00:02:34 -0700, "malv" <malvert at telenet.be> wrote:
>
>>Simple case:
>>In this list, how to find all occurences of intervals of n adjacent
>>indexes having at least one list-member with a value between given
>>limits.
>>Visualizing the list as a two-dimensional curve, this is like
>>horizontally dragging a given rectangle over the curve and finding the
>>x coordinates where the curve passes through the rectangle.(Define such
>>a x-index coordinate as the left corner of the rectangle.)
>>
>>More complicated case:
>>Given a pair of rectangles spaced relatively to each other in a fixed
>>manner. Drag this rectangle pair horizontally over the above
>>two-dimensional curve and list the indexes of the occurences where the
>>curve passes simultaneously through both rectangles.
>>(Define such a x-index coordinate as the leftmost corner of the
>>rectangle pair).
>>
>>These problems can be solved by programming a naive search advancing
>>index by index. It seems obvious that due to the localized properties
>>searched for, much more efficient searches should be possible.
>>After having found the occurence-indexes for one particular rectangle
>>set, how to find the pattern occurences after changing one or more
>>rectangle parameters?
>>
>
>for the first problem, maybe (untested beyond the below):
>
> >>> from itertools import groupby
> >>> data = [5+i%5 for i in xrange(40)]
> >>> data
> [5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6,
>  7, 8, 9, 5, 6, 7, 8, 9]
> >>> [g.next()[0] for k,g in groupby(enumerate(data), lambda t: 6<t[1]<9) if k]
> [2, 7, 12, 17, 22, 27, 32, 37]
>
>By the time you figure a more efficient way, this can have solved it for quite a long sequence.
>But since an efficient solution seems a lot like homework, I'll leave it to you ;-)
>Or is this a bioinfo problem in disguise?
>
Wrong problem, sorry. I misread too quickly. I'll try to misread more slowly next time ;-/

Regards,
Bengt Richter



More information about the Python-list mailing list