String Splitter Brain Teaser

Jp Calderone exarkun at divmod.com
Sun Mar 27 18:09:53 EST 2005


On Sun, 27 Mar 2005 14:39:06 -0800, James Stroud <jstroud at mbi.ucla.edu> wrote:
>Hello,
> 
> I have strings represented as a combination of an alphabet (AGCT) and an 
> operator "/", that signifies degeneracy. I want to split these strings into 
> lists of lists, where the degeneracies are members of the same list and 
> non-degenerates are members of single item lists. An example will clarify 
> this:
> 
> "ATT/GATA/G"
> 
> gets split to
> 
> [['A'], ['T'], ['T', 'G'], ['A'], ['T'], ['A', 'G']]
> 
> I have written a very ugly function to do this (listed below for the curious), 
> but intuitively I think this should only take a couple of lines for one 
> skilled in regex and/or listcomp. Any takers?

    >>> import re
    >>> s = 'ATT/GATA/G'
    >>> re.findall('(./.|.)', s)
    ['A', 'T', 'T/G', 'A', 'T', 'A/G']
    >>> 

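  If it is really important to have ['A'] instead of 'A', etc., it should be simple enough to loop over the result and transform each item according to its length: a string of length 3 such as 'T/G' becomes the two-item list ['T', 'G'], and a string of length 1 such as 'A' becomes the single-item list ['A'].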
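
  A minimal sketch of that post-processing step, assuming each degeneracy joins exactly two letters as in the example above. The function name split_degenerate is only an illustrative choice. Splitting every match on '/' covers both cases at once, since a single character contains no '/' and so comes back as a one-item list:

```python
import re

def split_degenerate(s):
    # Each match is either a degenerate pair such as 'T/G' or a single character.
    tokens = re.findall(r'./.|.', s)
    # 'T/G'.split('/') gives ['T', 'G']; 'A'.split('/') gives ['A'].
    return [token.split('/') for token in tokens]

print(split_degenerate('ATT/GATA/G'))
# [['A'], ['T'], ['T', 'G'], ['A'], ['T'], ['A', 'G']]
```

  If a position can carry more than two alternatives (for instance 'A/G/T', which the original question does not show), the pattern r'.(?:/.)*' could probably be substituted to capture the whole chain before splitting.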

  Jp


