Generating all ordered substrings of a string

girish at it.usyd.edu.au girish at it.usyd.edu.au
Tue Jul 11 21:44:23 EDT 2006


Quoting Bruno Desthuilliers <bdesth.quelquechose at free.quelquepart.fr>:

> girish at it.usyd.edu.au a écrit :
> > Hi,
> >  I want to generate all non-empty substrings of a string of length
> >=2.
> > Also,
> > each substring is to be paired with 'string - substring' part and vice
> > versa.
> >  Thus, ['abc'] gives me [['a', 'bc'], ['bc', 'a'], ['ab', 'c'], ['c',
> > 'ab'], ['b', 'ac'], ['ac', 'b']] etc.
> >  Similarly, 'abcd' should give me [['a', 'bcd'], ['bcd', 'a'], ['abc',
> > 'd'], ['d', 'abc'], ['b', 'acd'], ['acd', 'b'],['c', 'abd'], ['abd',
> 'c'],
> > ['ab', 'cd'], ['cd', 'ab'], ['bc', 'ad'], ['ad', 'bc'], ['ac',
> > 'bd'],['bd','ac']]
> >
> >  I've tried the following but i cant prevent duplicates and i'm
> missing
> > some substrings:
> >
> >
> >>>>colocn = 'abcd'
> >>>>k = 4
> >>>>for i in range(k-1):
> >
> >             for j in range(1,k):
> >                 rule1 = [colocn[i:i+j],colocn[:i]+colocn[i+j:]]
> >                 rule2 = [colocn[:i]+colocn[i+j:],colocn[i:i+j]]
> >                 rules.append(rule1)
> >                 rules.append(rule2)
> >
> >>>>rules
> >
> > [['a', 'bcd'], ['bcd', 'a'], ['ab', 'cd'], ['cd', 'ab'], ['abc', 'd'],
> > ['d', 'abc'], ['b', 'acd'], ['acd', 'b'], ['bc', 'ad'], ['ad', 'bc'],
> > ['bcd', 'a'], ['a', 'bcd'], ['c', 'abd'], ['abd', 'c'], ['cd', 'ab'],
> > ['ab', 'cd'], ['cd', 'ab'], ['ab', 'cd']]
> >
> >
> > Any ideas??
>
> Algorithmic problem.
>
> First, you need to get all permutations. This is a known algorithm, that
> you'll find examples of on the net. Then for each permutation, list
> possibles 'pair-splits'.
>
> Here's a (probably sub-optimal, but it's getting late here) possible
> implementation in functional style:
>
> def rotate(lst):
>      yield lst
>      max = len(lst)
>      for i in range(1, max):
>          yield lst[i:] + lst[:-(max-i)]
>
> def permute(lst):
>      if len(lst) > 2:
>          for rotated in rotate(lst):
>              head, tail = rotated[0], rotated[1:]
>              for permuted in permute(tail):
>                  yield [head] + permuted
>      elif len(lst) == 2:
>          yield lst
>          yield lst[::-1]
>      else:
>          yield lst
>
> def splits(lst):
>      for i in range(1, len(lst)):
>          yield lst[0:i], lst[i:]
>
> def allsplits(lst):
>      for permuted in permute(lst):
>          for pair in splits(permuted):
>              yield pair
>
> def listsubstrings(thestr):
>      format = lambda pair: (''.join(pair[0]), ''.join(pair[1]))
>      return [format(list(pair)) for pair in allsplits(list(thestr))]
>
>
> print listsubstrings("abcd")
Thanks Bruno. I wanted to avoid permutations as it would take more time,
but i guess will have to deal with them now :((
Also this one gives each rule twice...i've to search for some double
counting...

>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>




----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.



More information about the Python-list mailing list