Out of memory while reading excel file

Mahmood Naderan nt_mahmood at yahoo.com
Thu May 11 03:19:03 EDT 2017


I wrote this:

a = np.zeros((p.max_row, p.max_column), dtype=object)
for y, row in enumerate(p.rows):
      for cell in row:
            print (cell.value)
            a[y] = cell.value 
     print (a[y])


For one of the cells, I see

NM_198576.3
['NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3']

 
These are 50 NM_198576.3 in a[y] and 50 is the number of columns in my excel file (p.max_column)



The excel file looks like

CHR1     11,202,100     NM_198576.3     PASS     3.08932    G|B|C     -    .   .   .



Note that in each row, some cells are '-' or '.' only. I want to read all cells as string. Then I will write the matrix in a file and my main code (java) will process that. I chose openpyxl for reading excel files, because Apache POI (a java package for manipulating excel files) consumes huge memory even for medium files.

So my python script only transforms an xlsx file to a txt file keeping the cell positions and formats.

Any suggestion?

Regards,
Mahmood



More information about the Python-list mailing list