grouping subsequences with BIO tags
Bengt Richter
bokr at oz.net
Fri Apr 22 22:10:57 EDT 2005
On Fri, 22 Apr 2005 16:01:42 -0700, Michael Spencer <mahs at telcopartners.com> wrote:
>Steven Bethard wrote:
>> Bengt Richter wrote:
>>
>>> On Thu, 21 Apr 2005 15:37:03 -0600, Steven Bethard
>>> <steven.bethard at gmail.com> wrote:
>>>
>>>> I have a list of strings that looks something like:
>>>> ['O', 'B_X', 'B_Y', 'I_Y', 'O', 'B_X', 'I_X', 'B_X']
>
>[snip]
>>>
>>> With error checks on predecessor relationship,
>>> I think I'd do the whole thing in a generator,
>
>I'm curious why you (Bengt or Steve) think the generator is an advantage here.
>As Steve stated, the data already exists in lists of strings.
I hadn't seen your post[1], which I think is a nice, crisp and clever solution ;-)
I just wrote what I thought was a straightforward solution, anticipating that
the input list might be some huge bioinfo thing, and you might want to iterate
through the sublists one at a time and not want to build the whole list of
lists as represented by your stack.
[1] I don't know why stuff arrives almost instantly sometimes, and sometimes quite
delayed and out of order, but it is a bit embarrassing to post a me-too without
relevant comment, or being able to decide whether to play open source leapfrog.
In this case, I don't see a lily pad on the other side of your code, other than
the memory aspect ;-)
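
To make the memory point concrete, here is a minimal sketch of the
one-sublist-at-a-time idea. It is not the get_runsBR code from earlier in the
thread (that code is not reproduced here). The name iter_runs is made up for
illustration, and the grouping rule is an assumption: 'O' and B_ tags each
start a new run, and an I_ tag extends the current run only if the types match.

```python
def iter_runs(tags):
    """Yield one run (a list of tags) at a time; never builds the list of lists.

    Assumed rule (not confirmed by the thread as quoted here): 'O' and B_*
    tags start a new run, I_* tags continue a run of the same type.
    """
    run = []
    for tag in tags:
        if tag.startswith('I_'):
            # Predecessor check: an I_ tag needs a B_/I_ tag of the same type.
            if not run or run[-1] == 'O' or run[-1][2:] != tag[2:]:
                raise ValueError('%r does not follow a matching B_ or I_ tag' % tag)
            run.append(tag)
        else:
            # Hand over the finished run before starting the next one.
            if run:
                yield run
            run = [tag]
    if run:
        yield run


L = ['O', 'B_X', 'B_Y', 'I_Y', 'O', 'B_X', 'I_X', 'B_X']
for run in iter_runs(L):
    print(run)
```

Only the current run is held at any moment, so the input could just as well be
a file or another iterator; list(iter_runs(L)) gets you the list of lists when
you do want it all at once.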
>
>The direct list-building solution I posted is simpler, and quite a bit faster.
>
>L = ['O', 'B_X', 'B_Y', 'I_Y', 'O', 'B_X', 'I_X', 'B_X']
>
>def timethem(lst, funcs = (get_runsSB, get_runsMS, get_runsBR)):
>     for func in funcs:
>         print shell.timefunc(func, lst)
>
> >>> timethem(L)
> get_runsSB(...) 7877 iterations, 63.48usec per call
> get_runsMS(...) 31081 iterations, 16.09usec per call
> get_runsBR(...) 16114 iterations, 31.03usec per call
>
>
>Michael
>
Regards,
Bengt Richter