grouping subsequences with BIO tags
Steven Bethard
steven.bethard at gmail.com
Fri Apr 22 19:16:47 EDT 2005
Michael Spencer wrote:
> Steven Bethard wrote:
>
>> Bengt Richter wrote:
>>
>>> On Thu, 21 Apr 2005 15:37:03 -0600, Steven Bethard
>>> <steven.bethard at gmail.com> wrote:
>>>
>>>> I have a list of strings that looks something like:
>>>> ['O', 'B_X', 'B_Y', 'I_Y', 'O', 'B_X', 'I_X', 'B_X']
>
> [snip]
>
>>> With error checks on predecessor relationship,
>>> I think I'd do the whole thing in a generator,
>
> I'm curious why you (Bengt or Steve) think the generator is an advantage
> here. As Steve stated, the data already exists in lists of strings.
>
> The direct list-building solution I posted is simpler, and quite a bit
> faster.
Aren't they basically the same solution, with your stack.append
replaced by a yield (and with a little additional error checking)? As
far as I'm concerned, either solution is great, and both are code that
I couldn't have written myself. ;)
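To make the comparison concrete, here is a minimal sketch of the two shapes side by side. This is not the code that either of you posted; the function names (group_list, group_gen) and the expected grouping are my own assumptions, based only on the example list quoted above. The only structural difference is chunks.append(...) versus yield.

```python
def group_list(tags):
    """List-building version: collect every chunk, then return them all."""
    chunks = []
    current = []
    for tag in tags:
        if tag.startswith('B_'):
            # A B_ tag always starts a new chunk, closing any open one.
            if current:
                chunks.append(current)
            current = [tag]
        elif tag.startswith('I_'):
            # An I_ tag must continue a chunk of the same type.
            if not current or current[0][2:] != tag[2:]:
                raise ValueError('unexpected tag: %r' % tag)
            current.append(tag)
        else:
            # Anything else (here, 'O') closes the open chunk.
            if current:
                chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def group_gen(tags):
    """Generator version: identical logic, but each chunk is yielded."""
    current = []
    for tag in tags:
        if tag.startswith('B_'):
            if current:
                yield current
            current = [tag]
        elif tag.startswith('I_'):
            if not current or current[0][2:] != tag[2:]:
                raise ValueError('unexpected tag: %r' % tag)
            current.append(tag)
        else:
            if current:
                yield current
            current = []
    if current:
        yield current


tags = ['O', 'B_X', 'B_Y', 'I_Y', 'O', 'B_X', 'I_X', 'B_X']
assert group_list(tags) == list(group_gen(tags))
```

Calling list() on the generator gives back exactly what the list-building version returns, so the choice between them is mostly about whether the caller wants all chunks at once.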
In case you're still interested: in the real problem, the data doesn't
exist as a list of strings. It exists as a list of objects, and a Python
wrapper around a C API retrieves the string for each one. I don't know
exactly what happens in the wrapping, but it's possible that I can
conserve some memory by using the generator function, since the tag
strings would not all have to exist at once. I'd have to profile it to
know for sure, though.
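Roughly what I have in mind is sketched below. Everything named here is hypothetical: FakeToken and its get_tag method just stand in for the wrapped objects, whose real interface isn't shown in this thread, and iter_chunks is a name I made up. The point is only that the tag string is fetched for one object at a time, so no full list of strings is ever built. Whether that saves anything in practice depends on what the wrapper does, which I haven't measured.

```python
class FakeToken:
    """Hypothetical stand-in for an object wrapped around the C API."""

    def __init__(self, tag):
        self._tag = tag

    def get_tag(self):
        # Hypothetical accessor; the real wrapper call is not shown here.
        return self._tag


def iter_chunks(items, get_tag):
    """Yield lists of items grouped by BIO tag, fetching tags lazily."""
    current = []
    kind = None  # type suffix of the open chunk, e.g. 'X' for 'B_X'
    for item in items:
        tag = get_tag(item)  # one string retrieved per step
        if tag.startswith('B_'):
            if current:
                yield current
            current, kind = [item], tag[2:]
        elif tag.startswith('I_') and current and kind == tag[2:]:
            current.append(item)
        elif tag == 'O':
            if current:
                yield current
            current, kind = [], None
        else:
            raise ValueError('unexpected tag: %r' % tag)
    if current:
        yield current


tokens = [FakeToken(t) for t in
          ['O', 'B_X', 'B_Y', 'I_Y', 'O', 'B_X', 'I_X', 'B_X']]
for chunk in iter_chunks(tokens, FakeToken.get_tag):
    print([tok.get_tag() for tok in chunk])
```

The chunks hold the original objects rather than their strings, so the caller can still get at whatever else the wrapper exposes.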
STeVe