[Baypiggies] string to list question

Glen Jarvis glen at glenjarvis.com
Thu Aug 5 18:01:38 CEST 2010


Vikram,

    I recognize this domain in many of the questions that have been asked.
There are several times where I've thought, "That *so* isn't the ideal
'Computer Science' way to do something." But I also recognize that,
especially in the Biological world, we have no control over how we receive
the data, and thus we still have to solve problems like those reviewed.

   So, I normally don't challenge the base assumption in the question
because I know from experience that we don't always get ideal inputs to
work with. HOWEVER, I do want to challenge this one because I know there's a
standard way that this is represented in the Biological community without
using three characters for a single base. I recognize your original question
of z = 'AT/CG' to mean, in biological terms, that:

"Zee equals the string of three nucleotide bases. The first base is Adenine.
The second base is either Thymine or Cytosine. The third base is Guanine."

There's a *much* better (and commonly accepted) way to represent this.

The way this is traditionally represented is with the extended
genetic alphabet (
http://www.hrbc-genomics.net/training/bcd/Curric/PrwAli/node7.html). In this
case, the middle base would be represented by the letter Y, as that means
either Thymine or Cytosine.

I feel it's much better to represent this as:

z = 'AYG'

Then, the string will work without any extra manipulation: each character
is exactly one base position. I would always work with the alphabet and not
put the three-character string back in, as this alphabet is defined and
accepted in the community. However, if one wanted to, they could still later
represent this in a 'lookup dictionary' such as the following, if the output
ever needed to be in the format in question.

lookup = {'R': 'G/A',
          'Y': 'T/C',
          'M': 'A/C',....}
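
As a rough sketch of how that could look in practice, here is one way to go
back and forth between the two forms. The helper names (to_slash_list,
from_slash, reverse_lookup) are made up for illustration, and the table only
holds the three codes shown above, so it is deliberately incomplete.

```python
# Only the three codes from the lookup above; the rest of the extended
# alphabet is left out on purpose.
lookup = {'R': 'G/A',
          'Y': 'T/C',
          'M': 'A/C'}

# Reverse table keyed by the unordered pair of bases, so that 'T/C' and
# 'C/T' both map back to 'Y'.
reverse_lookup = {frozenset(v.split('/')): k for k, v in lookup.items()}


def to_slash_list(seq):
    """Expand each ambiguity code; plain bases pass through unchanged."""
    return [lookup.get(base, base) for base in seq]


def from_slash(s):
    """Collapse 'X/Y' pairs in a string like 'AT/CG' into single codes."""
    out = []
    i = 0
    while i < len(s):
        # A slash right after the current character marks an either/or pair.
        if i + 2 < len(s) and s[i + 1] == '/':
            pair = frozenset((s[i], s[i + 2]))
            out.append(reverse_lookup[pair])  # KeyError if the pair is unknown
            i += 3
        else:
            out.append(s[i])
            i += 1
    return ''.join(out)


print(to_slash_list('AYG'))   # ['A', 'T/C', 'G']
print(from_slash('AT/CG'))    # AYG
```

The second function assumes every slash separates exactly two single bases,
which matches the example in the question but may not hold for all inputs.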

Cheers,


Glen


On Wed, Aug 4, 2010 at 9:37 PM, Vikram K <kpguy1975 at gmail.com> wrote:

> Suppose i have this string:
> z = 'AT/CG'
>
> How do i get this list:
>
> zlist = ['A','T/C','G']



-- 
Whatever you can do or imagine, begin it;
boldness has beauty, magic, and power in it.

-- Goethe