Generating all ordered substrings of a string

Bruno Desthuilliers bdesth.quelquechose at free.quelquepart.fr
Tue Jul 11 20:43:18 EDT 2006


girish at it.usyd.edu.au a écrit :
> Hi,
>  I want to generate all non-empty substrings of a string of length >=2.
> Also,
> each substring is to be paired with 'string - substring' part and vice
> versa.
>  Thus, ['abc'] gives me [['a', 'bc'], ['bc', 'a'], ['ab', 'c'], ['c',
> 'ab'], ['b', 'ac'], ['ac', 'b']] etc.
>  Similarly, 'abcd' should give me [['a', 'bcd'], ['bcd', 'a'], ['abc',
> 'd'], ['d', 'abc'], ['b', 'acd'], ['acd', 'b'],['c', 'abd'], ['abd', 'c'],
> ['ab', 'cd'], ['cd', 'ab'], ['bc', 'ad'], ['ad', 'bc'], ['ac',
> 'bd'],['bd','ac']]
> 
>  I've tried the following but i cant prevent duplicates and i'm missing
> some substrings:
> 
> 
>>>>colocn = 'abcd'
>>>>k = 4
>>>>for i in range(k-1):
> 
>             for j in range(1,k):
>                 rule1 = [colocn[i:i+j],colocn[:i]+colocn[i+j:]]
>                 rule2 = [colocn[:i]+colocn[i+j:],colocn[i:i+j]]
>                 rules.append(rule1)
>                 rules.append(rule2)
> 
>>>>rules
> 
> [['a', 'bcd'], ['bcd', 'a'], ['ab', 'cd'], ['cd', 'ab'], ['abc', 'd'],
> ['d', 'abc'], ['b', 'acd'], ['acd', 'b'], ['bc', 'ad'], ['ad', 'bc'],
> ['bcd', 'a'], ['a', 'bcd'], ['c', 'abd'], ['abd', 'c'], ['cd', 'ab'],
> ['ab', 'cd'], ['cd', 'ab'], ['ab', 'cd']]
> 
> 
> Any ideas??

Algorithmic problem.

First, you need to get all permutations. This is a known algorithm, that 
you'll find examples of on the net. Then for each permutation, list 
possibles 'pair-splits'.

Here's a (probably sub-optimal, but it's getting late here) possible 
implementation in functional style:

def rotate(lst):
     yield lst
     max = len(lst)
     for i in range(1, max):
         yield lst[i:] + lst[:-(max-i)]

def permute(lst):
     if len(lst) > 2:
         for rotated in rotate(lst):
             head, tail = rotated[0], rotated[1:]
             for permuted in permute(tail):
                 yield [head] + permuted
     elif len(lst) == 2:
         yield lst
         yield lst[::-1]
     else:
         yield lst

def splits(lst):
     for i in range(1, len(lst)):
         yield lst[0:i], lst[i:]

def allsplits(lst):
     for permuted in permute(lst):
         for pair in splits(permuted):
             yield pair

def listsubstrings(thestr):
     format = lambda pair: (''.join(pair[0]), ''.join(pair[1]))
     return [format(list(pair)) for pair in allsplits(list(thestr))]


print listsubstrings("abcd")









































More information about the Python-list mailing list