clarification

Fri Aug 17 09:09:29 EDT 2007

Laurent Pointal wrote:
> Thomas Jollans wrote:
>> On Friday 17 August 2007, Beema shafreen wrote:
>>> hi everybody,
>>> i have a file with data separated by tab
>>> mydata:
>>> fhl1    fkh2
> <zip>
>>> shows these two are separated by tab represented as columns
>>> i have to check the common data between these two, column1 and column2
>>> my code:
>>> data = []
>>> data1 = []
>>> result = []
>>> fh = open('sheet1','r')
>>> for line in fh.readlines():
>>>         splitted = line.strip().split('\t')
>>>         data.append(splitted[0])
>>>         data1.append(splitted[1])
>>>         for k in data:
>>>                 if k in data1:
>>>                         result.append(k)
>>>                         print result
>>> fh.close()
> 
> Use the set data type for data and data1 (you fill them with an algo like the
> one you wrote - just use add() in place of append()), then use set
> intersection to get the common data.
> 
> See doc for set data type:
> http://docs.python.org/lib/types-set.html
> 
> Would look like (not tested):
> data = set()
> data1 = set()
> fh = open('sheet1','r')
> for line in fh.readlines():
>     splitted = line.strip().split('\t')
>     data.add(splitted[0])
>     data1.add(splitted[1])
> 
> result = data.intersection(data1)

   lefts = set()
   rights = set()
   with open('sheet1', 'r') as fh:
       for line in fh:
           trimmed = line.strip()
           if trimmed: # Skip blanks (file end often looks like that)
               left, right = trimmed.split('\t')
               lefts.add(left)
               rights.add(right)
   result = lefts & rights
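
For anyone who wants to try this without the 'sheet1' file, here is a rough self-contained sketch of the same set-intersection idea. The function name common_values, the choice to skip rows with fewer than two fields, and the sample values other than fhl1 and fkh2 are my own assumptions for illustration, not anything from the original data.

```python
def common_values(lines, sep="\t"):
    """Return the values that appear in both the first and second column."""
    lefts = set()
    rights = set()
    for line in lines:
        trimmed = line.strip()
        if not trimmed:
            continue  # skip blank lines, e.g. at the end of the file
        fields = trimmed.split(sep)
        if len(fields) < 2:
            continue  # assumption: ignore rows that lack a second column
        lefts.add(fields[0])
        rights.add(fields[1])
    # Set intersection: values present in both columns.
    return lefts & rights


# Made-up sample rows; only fhl1 and fkh2 occur in the original post.
sample = ["fhl1\tfkh2\n", "fkh2\txyz9\n", "\n", "abc1\tfhl1\n"]
common = common_values(sample)
print(sorted(common))
```

Any iterable of lines works, so an open file object can be passed in place of the sample list. Taking only the first two fields (rather than unpacking exactly two) means a row with extra columns does not raise an error, which may or may not be what you want.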

-Scott