Detecting repeated subsequences of identical items

Michael Selik michael.selik at gmail.com
Thu Apr 21 03:05:29 EDT 2016


On Thu, Apr 21, 2016 at 2:55 AM Vlastimil Brom <vlastimil.brom at gmail.com>
wrote:

> 2016-04-21 5:07 GMT+02:00 Steven D'Aprano <steve at pearwood.info>:
> > I want to group subsequences.
> > "ABCABCABCDEABCDEFABCABCABCB"
> > ABC ABC ABCDE ABCDE F ABC ABC ABC B
> > or:
> > ABC ABC ABC D E A B C D E F ABC ABC ABC B
>
> if I am not missing something, the latter form of grouping might be
> achieved with the following regex: [snip]
> The former one seems to be more tricky...
>

Right. If the problem is constrained to say that repeated subsequences can
have no nested repeated subsequences, it's much easier to solve.

If you had "ABCABCABCABC" should that result in
ABC ABC ABC ABC, with 4 repetitions
or ABCABC ABCABC with 2 repetitions?
In this example, one might say the higher count is obviously better, but I
think it depends on the context. Maybe the user is looking for the biggest
patterns rather than the biggest counts.



More information about the Python-list mailing list