Memory error due to the huge input file size

tejsupra at gmail.com
Thu Nov 20 14:21:19 EST 2008


On Nov 10, 4:47 pm, tejsu... at gmail.com wrote:
> Hello Everyone,
>
> I need to read a .csv file that is 2.26 GB in size, and I wrote a
> Python script to read it. My computer has 2 GB of RAM. Please see the
> code below:
>
> """
> This program has been developed to retrieve all the promoter sequences
> for the specified
> list of genes in the given cluster
>
> So, this program will act as a substitute to the whole EZRetrieve
> system
>
> Input arguments:
>
> 1) Cluster.txt or DowRatClust161718bwithDummy.txt
> 2) TransProCrossReferenceAndSequences.csv -> This is the file that has
> all the promoter sequences
> 3) -2000
> 4) 500
> """
>
> import time
> import csv
> import sys
> import linecache
> import re
> from sets import Set
> import gc
>
> print time.localtime()
>
> fileInputHandler = open(sys.argv[1],"r")
> line = fileInputHandler.readline()
>
> refSeqIDsinTransPro = []
> promoterSequencesinTransPro = []
> reader2 = csv.reader(open(sys.argv[2],"rb"))
> reader2_list = []
> reader2_list.extend(reader2)
>
> for data2 in reader2_list:
>    refSeqIDsinTransPro.append(data2[3])
> for data2 in reader2_list:
>    promoterSequencesinTransPro.append(data2[4])
>
> while line:
>    l = line.rstrip('\n')
>    for j in range(1,len(refSeqIDsinTransPro)):
>       found = re.search(l,refSeqIDsinTransPro[j])
>       if found:
>          """promoterSequencesinTransPro[j]  """
>          print l
>
>    line = fileInputHandler.readline()
>
> fileInputHandler.close()
>
> The error I got is as follows:
> Traceback (most recent call last):
>   File "RefSeqsToPromoterSequences.py", line 31, in <module>
>     reader2_list.extend(reader2)
> MemoryError
>
> I understand that this is a memory error caused by the line
> reader2_list.extend(reader2), which loads the whole file into a list.
> Is there an alternative way to read the .csv file line by line?
>
> Sincerely,
> Suprabhath

Thanks a lot, James Mills. It worked.
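
For anyone finding this thread in the archive: the reply being thanked is not quoted above, so what follows is only a minimal sketch of one way to read the .csv row by row, not necessarily what was suggested. It iterates over csv.reader directly instead of copying every row into a list, so only one row is held in memory at a time. The function names load_gene_ids and iter_matching_promoters are made up for illustration, the column positions 3 and 4 are taken from the script above, and the code is written in Python 3 syntax rather than the Python 2 of the original. It also swaps the original re.search test for an exact match against a set of IDs, which is an assumption about what the cluster file contains.

```python
import csv


def load_gene_ids(cluster_path):
    """Read one gene ID per line into a set, skipping blank lines.

    The cluster file is assumed to be small enough to keep in memory.
    """
    with open(cluster_path, "r") as handle:
        return {line.strip() for line in handle if line.strip()}


def iter_matching_promoters(csv_path, gene_ids, id_col=3, seq_col=4):
    """Yield (refseq_id, sequence) for each row whose ID is in gene_ids.

    The csv.reader object is consumed lazily, one row per loop iteration,
    so the 2.26 GB file is never loaded as a whole.
    """
    with open(csv_path, "r", newline="") as handle:
        for row in csv.reader(handle):
            # Skip rows too short to hold both columns.
            if len(row) <= max(id_col, seq_col):
                continue
            # Exact match (an assumption); the original used re.search.
            if row[id_col] in gene_ids:
                yield row[id_col], row[seq_col]
```

With the same two command-line arguments as the original script, this could be used as:

    gene_ids = load_gene_ids(sys.argv[1])
    for refseq_id, sequence in iter_matching_promoters(sys.argv[2], gene_ids):
        print(refseq_id)

Looking each row up in a set also means the big file is scanned once, rather than once per gene as in the nested loop above.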




More information about the Python-list mailing list