reading file contents to an array (newbie)

Christopher T King squirrel at WPI.EDU
Tue Jul 6 21:02:32 EDT 2004


On Tue, 6 Jul 2004, Darren Dale wrote:

> Could I get some suggestions on how to do this more Pythonically? I have 
> to read pretty large files, so this approach is probably way too slow. 
> Here is my code:
> 
> from numarray import *
> myFile=file('test.dat',mode='rt')
> tempData=myFile.readlines()
> data=[]
> for line in tempData:
>      line=line.replace(' ',',')
>      line=line.replace('\n','')
>      data.append(eval(line))
> data=array(data)

First speedup:

Rather than replacing spaces with commas and evaluating the output, use
str.split to split the line up into pieces:

for line in tempData:
    temp=[]
    for value in line.split():
        temp.append(float(value))
    data.append(temp)
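
To see why the two replace calls are no longer needed, here is a tiny 
sketch of what str.split does on its own. The sample line is made up; 
yours will look different:

```python
# Made-up sample line with uneven spacing and a trailing newline.
line = "1  2 3\n"

# Called with no arguments, split() breaks the string on any run of
# whitespace and ignores leading/trailing whitespace, newline included.
pieces = line.split()
print(pieces)
```

Each piece is still a string, which is why the loop above converts the 
values one at a time.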

Second speedup:

Rewrite what I just wrote above using a list comprehension. This is a bit 
harder to read, but faster, since it avoids the repeated temp.append calls:

for line in tempData:
    data.append([float(value) for value in line.split()])
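
If you'd rather measure than take my word for it, something like the 
following sketch should do. It uses the standard timeit module; the sample 
data and the three function names are made up for illustration, and the 
timings will depend on your machine and your data:

```python
import timeit

# Made-up sample data: 1000 lines of three numbers each.
lines = ["1 2 3\n"] * 1000

def with_eval(lines):
    # The original approach: turn spaces into commas, then eval.
    data = []
    for line in lines:
        line = line.replace(' ', ',')
        line = line.replace('\n', '')
        data.append(eval(line))
    return data

def with_loop(lines):
    # First speedup: split the line and convert each piece.
    data = []
    for line in lines:
        temp = []
        for value in line.split():
            temp.append(int(value))
        data.append(temp)
    return data

def with_comprehension(lines):
    # Second speedup: same thing with a list comprehension per line.
    data = []
    for line in lines:
        data.append([int(value) for value in line.split()])
    return data

for func in (with_eval, with_loop, with_comprehension):
    seconds = timeit.timeit(lambda: func(lines), number=20)
    print(func.__name__, seconds)
```

Note that eval gives you a tuple per line where the other two give lists; 
array() should not care either way.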

Third speedup:

You don't need to read all the data in from the file beforehand with 
readlines(); rather, you can iterate over the file object itself, which 
hands you one line at a time:

myFile=file('test.dat',mode='rt')
data=[]
for line in myFile:
    data.append([float(value) for value in line.split()])
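
For anyone trying this on Python 3, where the file() builtin no longer 
exists, a rough equivalent using open() might look like the sketch below. 
The temporary file and its contents are made up so the example runs on its 
own; in practice you would just open 'test.dat':

```python
import os
import tempfile

# Write a small made-up data file so the sketch is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.dat', delete=False) as tmp:
    tmp.write("1.0 2.0 3.0\n4.0 5.0 6.0\n")
    path = tmp.name

data = []
# Looping over the file object reads one line at a time, so the whole
# file never has to sit in memory as a list of strings.
with open(path) as myFile:
    for line in myFile:
        data.append([float(value) for value in line.split()])

# Clean up the temporary file.
os.remove(path)
print(data)
```

The with statement closes the file when the block ends, which the 
versions above leave to the interpreter.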

Fourth speedup:

Replace the entire for loop with another list comprehension: (This is 
starting to get a bit ridiculous, sorry :))

myFile=file('test.dat',mode='rt')
data=[[float(value) for value in line.split()] for line in myFile]

For greater readability (at the cost of some speed), I might suggest moving
the inner list comprehension into a helper function, so your final output 
looks like this:

from numarray import *

def parseline(line):
    return [float(value) for value in line.split()]

myFile=file('test.dat',mode='rt')
data=array([parseline(line) for line in myFile])
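
One thing to watch for: a blank line in the file (a stray empty line at the 
end, say) comes out of parseline as an empty list, which leaves you with 
rows of unequal length. A possible way around that, as a sketch only: 
io.StringIO stands in for the real file here, and the contents are made up.

```python
import io

def parseline(line):
    return [float(value) for value in line.split()]

# io.StringIO stands in for the real file; the contents are made up
# and include one blank line in the middle.
myFile = io.StringIO("1 2 3\n\n4 5 6\n")

# line.strip() is an empty (false) string for blank lines, so the
# condition at the end of the comprehension skips them.
rows = [parseline(line) for line in myFile if line.strip()]
print(rows)
```

The resulting list of rows can be handed to array() just as before.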

Hope this helps (and my quick intro to list comprehensions was somewhat 
understandable :P).



