grouping subsequences with BIO tags
Michael Spencer
mahs at telcopartners.com
Fri Apr 22 19:01:42 EDT 2005
Steven Bethard wrote:
> Bengt Richter wrote:
>
>> On Thu, 21 Apr 2005 15:37:03 -0600, Steven Bethard
>> <steven.bethard at gmail.com> wrote:
>>
>>> I have a list of strings that looks something like:
>>> ['O', 'B_X', 'B_Y', 'I_Y', 'O', 'B_X', 'I_X', 'B_X']
[snip]
>>
>> With error checks on predecessor relationship,
>> I think I'd do the whole thing in a generator,
I'm curious why you (Bengt or Steve) think the generator is an advantage here.
As Steve stated, the data already exists in lists of strings.
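The direct list-building solution I posted is simpler, and in the timings below it runs about twice as fast as Bengt's generator and roughly four times as fast as Steve's version.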
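For anyone without the earlier post to hand, here is a minimal sketch of what a direct list-building approach might look like. It is not the get_runsMS timed in this message: the name group_bio_runs is hypothetical, and it assumes that an I_ tag continues a run only when that run was opened by the matching B_ tag, with every other tag (including a stray I_) starting a new run. It does none of the predecessor error checking Bengt mentioned.

```python
def group_bio_runs(tags):
    """Group a flat list of BIO tags into runs (hypothetical helper)."""
    runs = []
    for tag in tags:
        # Assumption: 'I_X' extends the current run only if that run
        # was opened by the matching 'B_X' tag.
        if tag.startswith('I_') and runs and runs[-1][0] == 'B_' + tag[2:]:
            runs[-1].append(tag)
        else:
            # 'O', any 'B_' tag, or a stray 'I_' tag starts a new run.
            runs.append([tag])
    return runs


tags = ['O', 'B_X', 'B_Y', 'I_Y', 'O', 'B_X', 'I_X', 'B_X']
runs = group_bio_runs(tags)
# runs == [['O'], ['B_X'], ['B_Y', 'I_Y'], ['O'], ['B_X', 'I_X'], ['B_X']]
```

Because the input is already a list, the result is built in a single pass with plain appends and no generator machinery.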
L = ['O', 'B_X', 'B_Y', 'I_Y', 'O', 'B_X', 'I_X', 'B_X']

# SB, MS, BR: the versions posted by Steven Bethard, me, and Bengt Richter
def timethem(lst, funcs=(get_runsSB, get_runsMS, get_runsBR)):
    for func in funcs:
        print shell.timefunc(func, lst)

>>> timethem(L)
get_runsSB(...) 7877 iterations, 63.48usec per call
get_runsMS(...) 31081 iterations, 16.09usec per call
get_runsBR(...) 16114 iterations, 31.03usec per call
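
shell.timefunc above is not part of the standard library, so here is a rough sketch of a similar per-call measurement using the standard timeit module. The helper name time_per_call and the fixed iteration count are assumptions; the built-in list() merely stands in for one of the get_runs* functions, and the numbers it gives will not match the figures above.

```python
import timeit


def time_per_call(func, arg, number=10000):
    """Return the average time of func(arg) in microseconds (hypothetical helper)."""
    total_seconds = timeit.timeit(lambda: func(arg), number=number)
    return total_seconds / number * 1e6


L = ['O', 'B_X', 'B_Y', 'I_Y', 'O', 'B_X', 'I_X', 'B_X']

# list() is only a placeholder for get_runsSB, get_runsMS or get_runsBR.
usec = time_per_call(list, L)
```

The lambda adds a small constant overhead to every call, so this is better suited to comparing functions against each other than to reading off absolute times.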
Michael