Detecting repeated subsequences of identical items

Oscar Benjamin oscar.j.benjamin at gmail.com
Thu Apr 21 04:53:04 EDT 2016


On 21 April 2016 at 04:07, Steven D'Aprano <steve at pearwood.info> wrote:
> I want to group repeated items in a sequence. For example, I can group
> repeated sequences of a single item at a time using groupby:
>
>
> from itertools import groupby
> for key, group in groupby("AAAABBCDDEEEFFFF"):
>     group = list(group)
>     print(key, "count =", len(group))
>
>
> outputs:
>
> A count = 4
> B count = 2
> C count = 1
> D count = 2
> E count = 3
> F count = 4
>
>
> Now I want to group subsequences. For example, I have:
>
> "ABCABCABCDEABCDEFABCABCABCB"
>
> and I want to group it into repeating subsequences. I can see two ways to
> group it:
>
> ABC ABC ABCDE ABCDE F ABC ABC ABC B

There are some algorithms (helpfully shown in Python) here:

https://en.wikipedia.org/wiki/Cycle_detection

Note that those are for a sequence made as x[n+1] = f(x[n]) for some
function f. In your case that's just the function that gets the next
frame up/down in the call stack.

--
Oscar

--
Oscar



More information about the Python-list mailing list