best way to read a huge ascii file.

Heli hemla21 at gmail.com
Tue Nov 29 09:17:57 EST 2016


Hi all, 

Let me update my question. I have an ASCII file (~7 GB) with around 100 million lines. I read this file using: 

import os 
import numpy as np 

f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0) 

x=f[:,1] 
y=f[:,2] 
z=f[:,3] 
id=f[:,0] 

I will need the x, y, z and id arrays later for interpolation. The problem is that reading the file takes around 80 minutes, while the interpolation itself only takes 15 minutes.
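
For comparison, here is a minimal sketch of the same read done with pandas.read_csv (assuming pandas is available and the file is whitespace-delimited, which is what loadtxt's default delimiter=None implies); its C parser is generally much faster than np.loadtxt on files this size:

import os 
import pandas as pd 

path = os.path.join(dir, myfile)   # same dir/myfile as in the loadtxt call above 

# delim_whitespace=True matches loadtxt's default of splitting on any whitespace; 
# header=None because the file has no header row. 
df = pd.read_csv(path, delim_whitespace=True, header=None) 
f = df.values   # plain 2-D float array with the same layout as the loadtxt result 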

I tried to measure the memory increment caused by each line of the script using Python's memory_profiler module.

The following line, which reads the entire 7.4 GB file, increments memory usage by 3206.898 MiB (3.36 GB). My first question is: why does it not increment memory usage by the full 7.4 GB?

f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0) 
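
A rough back-of-the-envelope check (assuming loadtxt returns a single float64 array, i.e. 8 bytes per value) suggests the parsed array should indeed be much smaller than the text file:

# ~100 million rows x 4 columns x 8 bytes per float64 value 
rows, cols, itemsize = 100 * 10**6, 4, 8 
print(rows * cols * itemsize / 2.0**20)   # ~3052 MiB, the same ballpark as the measured 3206.898 MiB 

The text file stores every value as ASCII digits plus separators (usually well over 8 characters per number), which would explain why the in-memory array is only about 3.2 GB rather than 7.4 GB.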

The following four lines do not increment memory usage at all. 
x=f[:,1] 
y=f[:,2] 
z=f[:,3] 
id=f[:,0] 
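
Presumably that is because basic slices such as f[:, 1] are views into f's existing buffer rather than copies; a quick check along those lines (using the same f as above):

x = f[:, 1] 
print(x.base is f)              # True: x owns no data of its own 
print(np.shares_memory(f, x))   # True: both refer to the same buffer 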

Finally, I would still appreciate a recommendation for the most efficient way to read and write such files in Python. Are numpy's np.loadtxt and np.savetxt the best choices?
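
One pattern that might help (a sketch, assuming the text only has to be parsed once) is to cache the parsed array as a binary .npy file with np.save and reload it with np.load on later runs, optionally memory-mapped:

import os 
import numpy as np 

txt_path = os.path.join(dir, myfile)   # same dir/myfile as above 
npy_path = txt_path + '.npy' 

if os.path.exists(npy_path): 
    # Binary read: no text parsing; mmap_mode='r' avoids pulling everything into RAM at once. 
    f = np.load(npy_path, mmap_mode='r') 
else: 
    f = np.loadtxt(txt_path)   # the slow text parse, done only once 
    np.save(npy_path, f) 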

Thanks in Advance, 





