[Tutor] GC content: Help with if/else statements:

Mats Wichmann mats at wichmann.us
Sun Oct 13 14:22:57 EDT 2019


On 10/13/19 9:52 AM, Mihir Kharate wrote:
> @Joel Goldstick, I tried what you suggested. It's still returning the first
> if statement. If I input a sequence which has a different letter than
> A/T/G/C (for example, if I input ATFGC as my input when prompted), it does
> not return the error message: "Invalid base-pair in your Sequence".
> It seems like the second if statement (or else statement, if I change it
> that way) does not run at all.

You have some logic and/or indentation problems.

>>> ######################################################
>>> input_sequence = input("Input your sequence here: ")
>>> input_sequence = input_sequence.upper()
>>> print("\nYour sequence is: " + input_sequence)
>>> print("\n")
>>> a=t=g=c=0
>>> def gc_content():
>>>      global a,g,t,c

Is there a reason to make these global?  Do you need to keep the counts 
across repeated calls to this function?  That seems unlikely.
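
To illustrate the concern, here is a minimal sketch of what module-level 
counters do when the function is called more than once.  The names 
gc_with_globals and gc_with_locals are made up for this illustration and 
the sequence is passed in as an argument, which the original script does 
not do:

```python
a = t = g = c = 0  # module-level counters, as in the original script


def gc_with_globals(sequence):
    """Counts live at module level, so they carry over between calls."""
    global a, t, g, c
    for character in sequence:
        if character == 'G':
            g += 1
        elif character == 'C':
            c += 1
        elif character == 'T':
            t += 1
        elif character == 'A':
            a += 1
    return (g + c) / (a + g + t + c) * 100


def gc_with_locals(sequence):
    """Counts start from zero on every call."""
    a = t = g = c = 0
    for character in sequence:
        if character == 'G':
            g += 1
        elif character == 'C':
            c += 1
        elif character == 'T':
            t += 1
        elif character == 'A':
            a += 1
    return (g + c) / (a + g + t + c) * 100


first = gc_with_globals("GGCC")   # 100.0
second = gc_with_globals("AATT")  # 50.0: still includes the G/C counts from the first call
print(first, second, gc_with_locals("AATT"))
```

With local counters the second sequence comes out as 0.0, which is what 
you would expect for a sequence with no G or C in it.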

>>>      for character in input_sequence:
>>>          if character == 'G':
>>>              g+=1
>>>          if character == 'C':
>>>              c+=1
>>>          if character == 'T':
>>>              t+=1
>>>          if character == 'A':
>>>              a+=1
>>>          gc_cont = (g+c)/(a+g+t+c)*100

This means you do a pointless, and indeed destructive, computation on 
every pass through the loop, even if the character is invalid (you'll get 
a divide-by-zero unless some valid character has already been seen).  You 
probably don't want to do this computation inside the loop at all, but 
rather after the whole sequence has been processed.
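
A small sketch of the divide-by-zero case, assuming a hypothetical helper 
gc_inside_loop that takes the sequence as an argument and keeps the 
computation inside the loop the way the original does:

```python
def gc_inside_loop(sequence):
    """Recomputes the percentage on every character, valid or not."""
    a = t = g = c = 0
    gc_cont = None
    for character in sequence:
        if character == 'G':
            g += 1
        if character == 'C':
            c += 1
        if character == 'T':
            t += 1
        if character == 'A':
            a += 1
        # Runs even when no counter was incremented for this character.
        gc_cont = (g + c) / (a + g + t + c) * 100
    return gc_cont


try:
    gc_inside_loop("FATGC")  # 'F' comes first, so all four counts are still 0
except ZeroDivisionError as err:
    print("computation failed:", err)
```

If the invalid character comes later in the sequence nothing fails, but 
the bad character is silently ignored, which is arguably worse.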

The tests here should properly be an if/elif/else chain, which will also 
let you bail out easily in the case of an invalid character.


>>>      if character == 'A' or 'T' or 'G' or 'C':

And this test, which is outside the loop as shown by the indentation, 
checks only the value of the last character.
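
On top of that, the condition as written is always true.  A short sketch 
of why, using an invalid character (the variable names always_true and 
membership are just for this illustration):

```python
character = 'F'

# Python parses this as (character == 'A') or 'T' or 'G' or 'C'.
# The comparison is False, so the expression evaluates to 'T',
# and a non-empty string counts as true.
always_true = bool(character == 'A' or 'T' or 'G' or 'C')

# A membership test compares the character against each allowed value.
membership = character in ('A', 'T', 'G', 'C')

print(always_true)  # True
print(membership)   # False
```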

>>
>> The if statement isn't doing what you think it is.
>> Try:
>>      if character in ('A', 'T', 'G', 'C'):
>>           ....
>>
>>
>>>          print ("The GC content is: " + str(gc_cont) + " %")

The formatting on this could be more elegant.  Here's a quick refactor 
(more could be done), which also adds a sanity check at the beginning 
and returns the result from the function (I see Alan has commented on 
the same thing while I was typing):

def gc_content():
     """ calculate and return GC content of a sequence

     Returns None if invalid
     """

     global a,g,t,c

     if not input_sequence:
         return None

     for character in input_sequence:
         if character == 'G':
             g+=1
         elif character == 'C':
             c+=1
         elif character == 'T':
             t+=1
         elif character == 'A':
             a+=1
         else:
             print(f"Invalid base-pair in your Sequence: {character}")
             return None

     gc_cont = (g+c)/(a+g+t+c)*100
     print (f"The GC content is: {gc_cont}%")
     return gc_cont
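
As one possible direction for the "more could be done" part, here is a 
sketch that drops the globals entirely by passing the sequence in as an 
argument.  It swaps the counting loop for str.count and a set difference, 
and leaves the printing to the caller; the name gc_percent is made up for 
this example and none of this is from the original script:

```python
def gc_percent(sequence):
    """Return the GC percentage of sequence, or None if empty or invalid."""
    sequence = sequence.upper()
    if not sequence:
        return None

    # Any character outside A/T/G/C makes the whole sequence invalid.
    invalid = set(sequence) - set("ATGC")
    if invalid:
        return None

    gc_count = sequence.count("G") + sequence.count("C")
    return gc_count / len(sequence) * 100


result = gc_percent("ATGC")
if result is None:
    print("Invalid base-pair in your Sequence")
else:
    print(f"The GC content is: {result:.1f}%")
```

Because the function returns None for bad input instead of printing, the 
caller decides how to report the problem.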







