reading file contents to an array (newbie)
Christopher T King
squirrel at WPI.EDU
Tue Jul 6 21:02:32 EDT 2004
On Tue, 6 Jul 2004, Darren Dale wrote:
> Could I get some suggestions on how to do this more Pythonically? I have
> to read pretty large files, so this approach is probably way too slow.
> Here is my code:
>
> from numarray import *
> myFile=file('test.dat',mode='rt')
> tempData=myFile.readlines()
> data=[]
> for line in tempData:
>     line=line.replace(' ',',')
>     line=line.replace('\n','')
>     data.append(eval(line))
> data=array(data)
First speedup:
Rather than replacing spaces with commas and evaluating the output, use
str.split to split the line up into pieces:
for line in tempData:
    temp=[]
    for value in line.split():
        temp.append(float(value))
    data.append(temp)
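To see why split() alone is enough, here is a small sketch (assuming Python 3; the sample line is made up for illustration). Called with no arguments, split() breaks the line on any run of whitespace and drops the trailing newline, so neither replace() call is needed:

```python
# A sample line as readlines() might return it, with uneven spacing
# and a trailing newline (contents invented for this example).
line = "1.5  2 3.25\n"

# split() with no arguments splits on any run of whitespace and
# ignores leading/trailing whitespace, including the newline.
pieces = line.split()

# Convert each piece explicitly instead of calling eval() on the line.
values = []
for piece in pieces:
    values.append(float(piece))

print(pieces)
print(values)
```

Converting with float() also means a stray non-numeric token raises a ValueError instead of being executed as code, which is what eval() would try to do.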
Second speedup:
Rewrite what I just wrote above using a list comprehension. This is a bit
harder to read, but more efficient, since it avoids the repeated
temp.append calls:
for line in tempData:
    data.append([float(value) for value in line.split()])
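If you want to check the difference on your own machine, something like the following sketch should do (assuming Python 3; the function names and the sample input are invented for this example). It confirms that both versions build the same list and then times them with the standard timeit module:

```python
import timeit

# Made-up sample input: 1000 lines of three whitespace-separated numbers.
lines = ["1 2 3\n"] * 1000

def parse_with_loops(lines):
    """Explicit nested loops with a temporary list per line."""
    data = []
    for line in lines:
        temp = []
        for value in line.split():
            temp.append(float(value))
        data.append(temp)
    return data

def parse_with_comprehension(lines):
    """Same result, with the inner loop written as a list comprehension."""
    data = []
    for line in lines:
        data.append([float(value) for value in line.split()])
    return data

# Both versions build the same nested list.
assert parse_with_loops(lines) == parse_with_comprehension(lines)

# Time 100 runs of each; the numbers depend on the machine and Python version.
print("loops:        ", timeit.timeit(lambda: parse_with_loops(lines), number=100))
print("comprehension:", timeit.timeit(lambda: parse_with_comprehension(lines), number=100))
```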
Third speedup:
You don't need to read all the data in from the file beforehand with
readlines(); rather, you can iterate over the file object itself, which
yields one line at a time:
myFile=file('test.dat',mode='rt')
data=[]
for line in myFile:
    data.append([float(value) for value in line.split()])
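A self-contained sketch of the same idea (assuming Python 3, where open() takes the place of file()). An io.StringIO object with invented contents stands in for the real file so the example runs without test.dat:

```python
import io

# StringIO stands in for open('test.dat'); the contents are made up.
my_file = io.StringIO("1.0 2.0\n3.0 4.0\n")

data = []
# Iterating over a file-like object yields one line per step, so the
# whole file never has to sit in memory as a list of strings.
for line in my_file:
    data.append([float(value) for value in line.split()])

print(data)
```

With a real file you would replace the StringIO line with something like `with open('test.dat') as my_file:` and indent the loop under it.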
Fourth speedup:
Replace the entire for loop with another list comprehension: (This is
starting to get a bit ridiculous, sorry :))
myFile=file('test.dat',mode='rt')
data=[[float(value) for value in line.split()] for line in myFile]
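One thing to watch for with this form: a blank line splits into an empty list, so it contributes an empty row and leaves you with rows of different lengths. A possible way around that, sketched here under the assumption of Python 3 and with made-up file contents, is to add a condition to the outer comprehension:

```python
import io

# Made-up contents with a blank line in the middle and one at the end.
contents = "1 2 3\n\n4 5 6\n\n"

# Without a filter, each blank line becomes an empty row.
unfiltered = [[float(value) for value in line.split()]
              for line in io.StringIO(contents)]

# The trailing "if line.strip()" skips lines that are only whitespace.
data = [[float(value) for value in line.split()]
        for line in io.StringIO(contents)
        if line.strip()]

print(unfiltered)
print(data)
```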
For greater readability (at the cost of some speed), I might suggest writing
the above using a small helper function, so your final code looks like this:
from numarray import *
def parseline(line):
    return [float(value) for value in line.split()]
myFile=file('test.dat',mode='rt')
data=array([parseline(line) for line in myFile])
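Another benefit of pulling the parsing into its own function is that it can be tried out on single lines and wrapped with extra checks. The sketch below assumes Python 3; parselines is a hypothetical wrapper added for illustration, not part of the code above, and it reports which line failed to parse:

```python
def parseline(line):
    """Turn one whitespace-separated line into a list of floats."""
    return [float(value) for value in line.split()]

def parselines(lines):
    """Hypothetical wrapper: parse every line, naming the line that fails."""
    rows = []
    for number, line in enumerate(lines, 1):
        try:
            rows.append(parseline(line))
        except ValueError as error:
            # Re-raise with the 1-based line number in front of the message.
            raise ValueError("line %d: %s" % (number, error))
    return rows

# Well-formed input parses as before.
good = parselines(["1 2\n", "3 4\n"])
print(good)

# A bad token on the second line is reported with its line number.
try:
    parselines(["1 2\n", "3 oops\n"])
except ValueError as error:
    message = str(error)
    print(message)
```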
Hope this helps (and my quick intro to list comprehensions was somewhat
understandable :P).