[Tutor] Text Processing Query

taserian taserian at gmail.com
Thu Mar 14 12:28:47 CET 2013


Since the identifier and the item that you want to keep are on different
lines, you'll need to set a "flag".

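with open(filename) as file:
    scanfile = file.readlines()
    flag = 0  # becomes 1 once a FAB FRAGMENT line has been seen
    for line in scanfile:
        if line[0:6] == 'COMPND' and 'FAB FRAGMENT' in line:
            flag = 1
        # comparison needs '==', a single '=' here is a syntax error
        elif line[0:6] == 'COMPND' and 'CHAIN' in line and flag == 1:
            print(line)
            flag = 0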


Notice that the flag is set to 1 only on "FAB FRAGMENT", and it's reset to
0 after the next "CHAIN" line that follows the "FAB FRAGMENT" line.
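
Since you mentioned wanting a list with D, F, E, G rather than printed
lines, here is a minimal sketch of the same flag idea that collects the IDs.
The function name fab_chain_ids and the sample list are made up for
illustration, and it assumes every CHAIN line looks like
"COMPND  12 CHAIN: D, F;" as in your excerpt.

```python
def fab_chain_ids(lines):
    """Collect chain IDs from the CHAIN line that follows each FAB FRAGMENT line."""
    chains = []
    in_fab = False  # True between a FAB FRAGMENT line and its CHAIN line
    for line in lines:
        if not line.startswith('COMPND'):
            continue
        if 'FAB FRAGMENT' in line:
            in_fab = True
        elif in_fab and 'CHAIN:' in line:
            # "COMPND  12 CHAIN: D, F;" -> "D, F" -> ['D', 'F']
            ids = line.split('CHAIN:', 1)[1].strip().rstrip(';')
            chains.extend(c.strip() for c in ids.split(','))
            in_fab = False
    return chains


# Sample lines copied from the excerpt in the original question.
sample = [
    "COMPND   2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;",
    "COMPND   3 CHAIN: A, B;",
    "COMPND  10 MOL_ID: 2;",
    "COMPND  11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;",
    "COMPND  12 CHAIN: D, F;",
    "COMPND  13 ENGINEERED: YES;",
    "COMPND  14 MOL_ID: 3;",
    "COMPND  15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;",
    "COMPND  16 CHAIN: E, G;",
]

print(fab_chain_ids(sample))
```

To run it on a real file, pass the open file object (or the result of
readlines()) instead of the sample list.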


AR


On Thu, Mar 14, 2013 at 6:56 AM, Spyros Charonis <s.charonis at gmail.com> wrote:

> Hello Pythoners,
>
> I am trying to extract certain fields from a file whose text looks
> like this:
>
> COMPND   2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;
>
> COMPND   3 CHAIN: A, B;
>
> COMPND  10 MOL_ID: 2;
>
> COMPND  11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;
>
> COMPND  12 CHAIN: D, F;
>
> COMPND  13 ENGINEERED: YES;
>
> COMPND  14 MOL_ID: 3;
>
> COMPND  15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;
>
> COMPND  16 CHAIN: E, G;
>
> I would like the chain IDs, but only those following the text heading
> "ANTIBODY FAB FRAGMENT", i.e. I need to create a list with D, F, E, G that
> excludes A, B, which have a non-antibody text heading. I am using the
> following code:
>
> with open(filename) as file:
>
>     scanfile=file.readlines()
>
>     for line in scanfile:
>
>         if line[0:6]=='COMPND' and 'FAB FRAGMENT' in line: continue
>
>         elif line[0:6]=='COMPND' and 'CHAIN' in line:
>
>             print line
>
> But this yields:
>
> COMPND   3 CHAIN: A, B;
>
> COMPND  12 CHAIN: D, F;
>
> COMPND  16 CHAIN: E, G;
>
> I would like to ignore the first line since A,B correspond to non-antibody
> text headings, and instead want to extract only D,F & E,G whose text
> headings are specified as antibody fragments.
>
> Many thanks,
> Spyros