Detecting repeated subsequences of identical items

Nobody nobody at nowhere.invalid
Thu Apr 21 08:02:20 EDT 2016


On Thu, 21 Apr 2016 18:05:40 +1000, Steven D'Aprano wrote:

> The specific problem I am trying to solve is that I have a sequence of 
> strings (in this case, error messages from a Python traceback) and I'm 
> looking for repeated groups that may indicate mutually recursive calls. E.g. 
> suppose I have a function f which calls g, and g calls h, and h calls f 
> again, and there's an exception, you will see a traceback in part:

This is a specific case of finding cycles in a directed graph. But
treating it as such probably isn't useful here, because you're interested
in a specific traversal of that graph rather than the graph itself.

One way to approach it is:

sofar = []
for line in traceback:
    if line in sofar:
        j = sofar.index(line)
        if sofar[:j] == sofar[j:j*2]:
            # found repeat
    sofar = [line] + sofar

Note that sofar needs to be in reverse order, because list doesn't have
.rindex() or .rfind().

Detecting nested cycles is somewhat harder because given e.g.

	ababxabababababxababab

you'd want the five repeats of ab in the middle to be treated as two
repeats of ab followed by three repeats of ab, but there's no way to
spot that until later.




More information about the Python-list mailing list