[Python-ideas] proposed sequence method: index_subseq()

Tal Einat taleinat at gmail.com
Wed Aug 28 10:21:36 CEST 2013


On Tue, Aug 27, 2013 at 7:19 PM, Jess Austin <jess.austin at gmail.com> wrote:
> On Tue, Aug 27, 2013 at 10:27 AM, Tal Einat <taleinat at gmail.com> wrote:
>>
>> There are many algorithms for sub-sequence search, most of which can
>> be significantly more efficient than the one you used under common
>> circumstances. For example, see the Knuth-Morris-Pratt algorithm [1],
>> and an example Python implementation on the ActiveState cookbook [2].
>
>
> Good point; O(n+m) is better than O(n*m). Minor observation: KMP would
> disallow the possibility I raised of subseq being just an iterator, rather
> than a sequence. I think that's OK, since my use cases haven't had iterators
> here. It actually seems more likely that the "containing" object will be an
> iterator, which the recipe you linked would allow. Hmmm....

You meant that the searched sequence would be an iterator, not the
sub-sequence, right? This should be possible with KMP or similar
algorithms (e.g. as described under the "Variants" section in the KMP
Wikipedia article).

>> A good implementation that would work relatively well in most common
>> cases could be useful, much as Python's sorting implementation is. I
>> think, however, that it would be better to start this as a 3rd party
>> module rather than push for immediate inclusion in the stdlib.
>
> No push here! I'm just asking questions. Thanks for your insights.

My pleasure. I'd be happy to help with the development of this, if
you're going to work on it.

- Tal


More information about the Python-ideas mailing list