fastest way to read a text file into a numpy array

Christian Gollwitzer auriocus at gmx.de
Thu Jun 30 17:02:45 EDT 2016


On 30.06.16 at 17:49, Heli wrote:
> Dear all,
>
> After a few tests, I think I need to correct my question a bit. I will give an example here.
>
> I have file 1 with 250 lines:
> X1,Y1,Z1
> X2,Y2,Z2
> ....
>
> Then I have file 2 with 3M lines:
> X1,Y1,Z1,value11,value12, value13,....
> X2,Y2,Z2,value21,value22, value23,...
> ....
>
> I will need to interpolate values for the coordinates on file 1 from file 2. (using nearest)
> I am using the scipy.griddata for this.
>
> scipy.interpolate.griddata(points, values, xi, method='linear', fill_value=nan, rescale=False)

This constructs a Delaunay triangulation, so it is no wonder it takes 
some time when you run it over 3M data points. You can probably save a 
factor of three, because:

> I need to repeat the griddata above to get interpolation for each of the column of values.

I think this is wrong. According to the docs, it should happily 
interpolate from a 2D array of values. BTW, you stated that you want 
nearest interpolation, but you chose "linear". I don't think it will 
make a big difference in runtime, though (nearest uses a KD-tree, 
linear uses Qhull).
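Untested sketch of the one-call version, assuming both files are plain 
comma-separated text (the file names and the column split are my 
guesses from your example above):

import numpy as np
from scipy.interpolate import griddata

# hypothetical file names: file1 holds the 250 query coordinates,
# file2 holds the 3M rows of X,Y,Z followed by the value columns
xi = np.loadtxt("file1.txt", delimiter=",")      # shape (250, 3)
data = np.loadtxt("file2.txt", delimiter=",")    # shape (3000000, 3 + k)

points = data[:, :3]   # the X,Y,Z coordinates
values = data[:, 3:]   # all value columns at once, shape (3000000, k)

# one call interpolates every column together, instead of one call
# (and one triangulation / tree build) per column
result = griddata(points, values, xi, method="nearest")
print(result.shape)    # (250, k)

That way the spatial data structure is built once and reused for every 
column.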

> I was wondering if there are any ways to improve the time spent in interpolation.

Are you sure you need the full generality of this algorithm? I.e., are 
your values given on a scattered cloud of points in 3D space, or are 
the X,Y,Z in file2 in fact on a rectangular grid? In the former case, 
there is probably nothing you can really do. In the latter, there is a 
more efficient approach: look up the nearest index from X,Y,Z by index 
arithmetic, or maybe even reshape the values into a 3D array; see the 
sketch below.
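For the rectangular-grid case, a sketch of that index-arithmetic 
lookup, assuming a regular grid with known origin and spacing (all the 
names here are made up, not from your post):

import numpy as np

def nearest_on_grid(xi, origin, spacing, grid_values):
    # grid_values: the value columns from file2 reshaped to
    # (nx, ny, nz, k); xi: query coordinates, shape (250, 3)
    # round each query coordinate to the nearest grid index
    idx = np.rint((xi - origin) / spacing).astype(int)
    # clip so queries just outside the box land on the boundary
    idx = np.clip(idx, 0, np.array(grid_values.shape[:3]) - 1)
    return grid_values[idx[:, 0], idx[:, 1], idx[:, 2]]

That is a constant-time lookup per query point, with no triangulation 
or tree at all.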

	Christian



