how to join and arrange two files together

Chris Rebert clp2 at rebertia.com
Thu Jul 23 04:16:04 EDT 2009


On Thu, Jul 23, 2009 at 12:22 AM, <amrita at iisermohali.ac.in> wrote:
>
> Hi,
>
> I have two large files:
>
> FileA
> 15 ALA H = 8.05 N = 119.31 CA = 52.18 HA = 4.52 C =
> 21 ALA H = 7.66 N = 123.58 CA = 54.33 HA = C = 179.35
> 23 ALA H = 8.78 N =  CA =  HA = C = 179.93.................
>
> and
>
> FileB
> 21 ALA  helix (helix_alpha, helix2)
> 23 ALA  helix (helix_alpha, helix3)
> 38 ALA  helix (helix_alpha, helix3)...........
>
> Now I want to make another file in which the two files are joined so that
> only matching entries appear. Here, 21 ALA and 23 ALA are in both files,
> so the output will be something like:
>
> 21 ALA H = 7.66 N = 123.58 CA = 54.33 HA = C = 179.35| 21 ALA  helix
> (helix_alpha, helix2)
> 23 ALA H = 8.78 N =  CA =  HA = C = 179.93|23 ALA  helix (helix_alpha,
> helix3)
>
> Further, I want to split the lines of this joined file into other files
> based on their missing atom values. For example, HA is not defined for
> 21 ALA, so that line goes into a file for entries missing HA; similarly,
> 23 ALA goes into another file based on its missing N, CA and HA values.
>
> I tried to join the two files based on their matching entries with:
> >>> from collections import defaultdict
> >>>
> >>> if __name__ == "__main__":
> ...      a = open("/home/amrita/alachems/chem100.txt")
> ...      c = open("/home/amrita/secstr/secstr100.txt")
> ...
> >>> def source(stream):
> ...     return (line.strip() for line in stream)
> ...
> ...
> >>> def merge(sources):
> ...     for m in merge([source(a),source(c)]):
> ...         print "|".join(c.ljust(10) for c in m)
> ...
>
> but it does not print any output.

You never actually called any of your <expletive deleted> functions: the
only call to merge() sits inside merge()'s own body, so nothing ever runs.

Slightly corrected version:

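from collections import defaultdict  # not used yet

def source(stream):
    # Lazily yield each line with surrounding whitespace stripped.
    return (line.strip() for line in stream)

def merge(sources):
    # For each source, print all of its lines on one row, each padded
    # to at least 10 characters and separated by "|".
    for m in sources:
        print "|".join(c.ljust(10) for c in m)

if __name__ == "__main__":
    a = open("/home/amrita/alachems/chem100.txt")
    c = open("/home/amrita/secstr/secstr100.txt")
    merge([source(a), source(c)])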


It's still not sophisticated enough to give the exact output you're
looking for, but it is a step in the right direction.
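
For the remaining step, here is a rough sketch rather than a finished
solution. It assumes Python 3, that the first two whitespace-separated
fields (residue number and residue name) identify an entry in both files,
and that a missing atom value is simply blank after the "=". The names
read_keyed, join_matching, missing_atoms and ATOM_RE are made up for
illustration:

```python
import re

# Matches "NAME = value" pairs; the value group is empty when it is missing.
ATOM_RE = re.compile(r"([A-Za-z]+)\s*=\s*(-?\d+(?:\.\d+)?)?")


def read_keyed(lines):
    """Map (residue number, residue name) to the stripped line."""
    table = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        table[(fields[0], fields[1])] = line.strip()
    return table


def join_matching(lines_a, lines_b):
    """Yield 'lineA|lineB' for entries present in both inputs, in FileA order."""
    table_b = read_keyed(lines_b)
    for line in lines_a:
        fields = line.split()
        if len(fields) >= 2 and (fields[0], fields[1]) in table_b:
            yield line.strip() + "|" + table_b[(fields[0], fields[1])]


def missing_atoms(line):
    """Return the atom names whose value is blank, in order of appearance."""
    return [name for name, value in ATOM_RE.findall(line) if not value]
```

Both line arguments can be open file objects, so the two files from the
question can be passed in directly. Each joined line can then be written
to whichever output file corresponds to the result of missing_atoms().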

You really should try asking someone from your CS Dept to help you. It
would seriously take a couple hours, at most.

- Chris
-- 
Still brandishing a cluestick in vain...
http://blog.rebertia.com


