Identifying the start of good data in a list

tdmj at hotmail.com
Thu Aug 28 18:56:11 EDT 2008


On Aug 27, 11:50 am, Steven D'Aprano <st... at REMOVE-THIS-cybersource.com.au> wrote:
> On Tue, 26 Aug 2008 17:04:19 -0700, tdmj wrote:
> > On Aug 26, 5:49 pm, tkp... at hotmail.com wrote:
> >> I have a list that starts with zeros, has sporadic data, and then has
> >> good data. I define the point at which the data turns good to be the
> >> first index with a non-zero entry that is followed by at least 4
> >> consecutive non-zero data items (i.e. a week's worth of non-zero data).
> >> For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], I
> >> would define the point at which data turns good to be 4 (1 followed by
> >> 2, 3, 4, 5).
>
> ...
>
> > With regular expressions:
>
> Good grief. If you're suggesting that as a serious proposal, and not just
> to prove it can be done, that's surely an example of "when all you have
> is a hammer, everything looks like a nail" thinking.
>
> In this particular case, your regex "solution" gives the wrong result,
> indicating that you didn't test your code before posting. Hint:
>
> re.search(r'[1-9]{5, }', "123456")
>
> returns None.
>
> The obvious fix for that specific bug is to use r'[1-9]{5,5}', but even
> that will fail. Hint: what happens if an item has more than one digit?
>
> Before posting another regex solution, make sure it does the right thing
> with this:
>
> [0, 0, 101, 0, 1002, 203, 3050, 4105, 5110, 623, 777]
>
> --
> Steven

Hey, it's clearer than a lot of the other proposals here. Too bad it
doesn't work. This is why you don't post after 8 p.m. after being at
work all day. I was getting what I now recognize as incorrect answers,
but in the midst of a brainfart I somehow took them to be right. It can
be made to work by removing the space in "{5, }", inserting some kind of
marker between the numbers, and using a regular expression that
recognizes nonzero numbers between the markers, but I think I've already
said too much in this thread.
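
For the record, here is a rough sketch of what that repair might look
like. It is not the code from my earlier post. The function name
first_good_index, the choice of "," as the marker, and the run
parameter are all made up for illustration, and it assumes every item
in the list is an integer.

```python
import re

def first_good_index(data, run=5):
    """Return the index where `run` consecutive nonzero integers begin, or None.

    Hypothetical helper: assumes every item in `data` is an int.
    """
    # Terminate every item with a "," marker so multi-digit values stay distinct.
    text = "".join(str(x) + "," for x in data)
    # One nonzero token: optional sign, a leading 1-9, any further digits, the marker.
    # The quantifier "{%d}" has no space inside the braces, unlike the broken "{5, }".
    # The lookbehind keeps a match from starting in the middle of a number.
    pattern = r"(?:^|(?<=,))(?:-?[1-9][0-9]*,){%d}" % run
    match = re.search(pattern, text)
    if match is None:
        return None
    # Each marker before the match start closes exactly one earlier item.
    return text.count(",", 0, match.start())

print(first_good_index([0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))
print(first_good_index([0, 0, 101, 0, 1002, 203, 3050, 4105, 5110, 623, 777]))
```

Both calls should print 4: the first list is the original poster's
example, the second is Steven's multi-digit test case. Counting the
markers before the match is what turns a character offset back into a
list index.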

Tommy McDaniel


