Identifying the start of good data in a list

tdmj at hotmail.com tdmj at hotmail.com
Tue Aug 26 20:04:19 EDT 2008


On Aug 26, 5:49 pm, tkp... at hotmail.com wrote:
> I have a list that starts with zeros, has sporadic data, and then has
> good data. I define the point at which the data turns good to be the
> first index with a non-zero entry that is followed by at least 4
> consecutive non-zero data items (i.e. a week's worth of non-zero
> data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
> 9], I would define the point at which data turns good to be 4 (1
> followed by 2, 3, 4, 5).
>
> I have a simple algorithm to identify this changepoint, but it looks
> crude: is there a cleaner, more elegant way to do this?
>
>     flag = True
>     i=-1
>     j=0
>     while flag and i < len(retHist)-1:
>         i += 1
>         if retHist[i] == 0:
>             j = 0
>         else:
>             j += 1
>             if j == 5:
>                 flag = False
>
>     del retHist[:i-4]
>
> Thanks in advance for your help
>
> Thomas Philips

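With regular expressions (this assumes every item is a single-digit,
non-negative integer, so that positions in the joined string line up
with indices in the list):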

import re

hist = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
hist_str = ''.join(str(i) for i in hist)
# No space inside the braces: re treats '{5, }' as literal text, not a quantifier.
match = re.search(r'[1-9]{5,}', hist_str)
hist = hist[-5:] if match is None else hist[match.start():]

Or slightly more concise:

import re

hist = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
match = re.search(r'[1-9]{5,}', ''.join(str(i) for i in hist))
hist = hist[-5:] if match is None else hist[match.start():]
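
If the data can hold values such as 10, -3 or 0.5, a plain counting loop
avoids the string conversion altogether. The following is only a sketch
of that idea, not a drop-in replacement: the function name
first_good_index and its run_length parameter are made up for
illustration, and the fallback of keeping the last five items simply
mirrors the snippets above.

```python
def first_good_index(data, run_length=5):
    """Return the index where the first run of `run_length` consecutive
    non-zero items starts, or None if there is no such run.

    Hypothetical helper, named here for illustration only.
    """
    run = 0  # length of the current streak of non-zero items
    for i, value in enumerate(data):
        run = run + 1 if value != 0 else 0
        if run == run_length:
            # The streak ends at i, so it started run_length - 1 items earlier.
            return i - run_length + 1
    return None


hist = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
start = first_good_index(hist)
# Assumed fallback, same as above: keep the last five items if no run is found.
hist = hist[-5:] if start is None else hist[start:]
```

Because it compares values rather than characters, this should also cope
with multi-digit, negative, or floating-point entries.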

Tommy McDaniel


