data interpolation

Deborah Swanson python at deborahswanson.net
Wed Nov 9 13:56:19 EST 2016


 On Thursday, November 03, 2016 3:08 AM, Heli wrote:

> I have a question about data interpolation using Python. I have a big 
> ASCII file containing data in the following format and around 200M 
> points.
> 
> id, xcoordinate, ycoordinate, zcoordinate
> 
> then I have a second file containing data in the following format
> (2M values)
> 
> id, xcoordinate, ycoordinate, zcoordinate, value1, value2, value3,...,
> valueN
> 
> I would need to get values for x,y,z coordinates of file 1 from values
> of file2.
> 
> I don't know whether my data in file1 and 2 is from a structured or 
> an unstructured grid source. I was wondering which interpolation module 
> from either scipy or scikit-learn you would recommend.
> 
> I would also appreciate it if you could recommend a sample 
> example/reference.
> 
> Thanks in Advance for your help,

I'm really not familiar with scipy, scikit-learn or grid sources, but it
seems to me that what you want is a single array, with the values filled
in for the coordinates you have values for.

Are you planning to acquire the missing values sometime in the future?
If not, I think you should just use the data in the second file, as is.

If you are planning to get more values associated with coordinates, I'd
want to run through the coordinate sets in the first file, tossing any
duplicates. If there aren't any duplicates in file 1, then at least you
know you've got a clean list and you'll have it in a CSV file, or
something you can read into and out of Python.
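
A minimal sketch of that duplicate check, assuming each row is
"id, x, y, z" in comma-separated form and that two points count as
duplicates only when their coordinates are exactly equal. The function
name and the sample data are made up for illustration. With 200M points
the set of seen coordinates may not fit in memory, so treat this as a
starting point rather than a finished tool.

```python
import csv
import io


def unique_coordinates(rows):
    """Yield each row whose (x, y, z) has not been seen before, in order."""
    seen = set()
    for row in rows:
        # Assumed layout: id, x, y, z. The id is ignored when comparing.
        key = tuple(float(v) for v in row[1:4])
        if key not in seen:
            seen.add(key)
            yield row


# In-memory stand-in for file 1 (hypothetical sample data).
sample = io.StringIO("1,0.0,0.0,0.0\n2,1.0,0.0,0.0\n3,0.0,0.0,0.0\n")
clean = list(unique_coordinates(csv.reader(sample)))
```

Here the row with id 3 repeats the coordinates of id 1, so only the
first two rows are kept.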

Also, if you're planning to get more values later, you want to combine
the two lists you have. I'd do this by running a check of the coordinate
sets in both files, and this will give you a list of the sets in file 2
with matching sets in file 1. Then I'd delete the matching coordinate
sets from file 1, and simply append what's left of file 1 to file 2. You
would then have one list, waiting for more values to be filled in.
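
The combining step could be sketched like this, again assuming rows of
"id, x, y, z" in file 1 and "id, x, y, z, value1, ..." in file 2,
already read into lists, and assuming coordinates match only when they
are exactly equal. The name merge_files and the sample rows are
hypothetical.

```python
def merge_files(file1_rows, file2_rows):
    """Return file 2 rows, then file 1 rows whose coordinates are not in file 2."""

    def coords(row):
        # Assumed layout: columns 1..3 hold x, y, z.
        return tuple(float(v) for v in row[1:4])

    # Coordinate sets that already have values in file 2.
    known = {coords(row) for row in file2_rows}
    # What is left of file 1 after dropping the matching coordinate sets.
    leftover = [row for row in file1_rows if coords(row) not in known]
    return list(file2_rows) + leftover


# Hypothetical sample data.
file1 = [["1", "0.0", "0.0", "0.0"], ["2", "1.0", "0.0", "0.0"]]
file2 = [["7", "0.0", "0.0", "0.0", "3.5", "4.5"]]
merged = merge_files(file1, file2)
```

The rows taken from file 1 have no value columns yet, so they are the
ones still waiting to be filled in.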

I don't know what your target use is, but making a key from the
coordinates (something like 'id/xcoordinate/ycoordinate/zcoordinate'),
and then making a dictionary using these keys to point to their
corresponding values, would certainly be easier to work with. Unless, of
course, your target application is expecting lists to work with.
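
As a rough illustration of the dictionary idea, assuming each row of
file 2 has already been split into strings and that the key is built
from the first four fields exactly as they appear in the file (the
function name and sample rows are invented for the example):

```python
def build_lookup(rows):
    """Map an 'id/x/y/z' key to the list of values for that point."""
    lookup = {}
    for row in rows:
        # Key built from the id and coordinate fields as written in the file.
        key = "/".join(row[:4])
        lookup[key] = [float(v) for v in row[4:]]
    return lookup


# Hypothetical sample data in the file 2 layout.
rows = [
    ["7", "0.0", "0.0", "0.0", "3.5", "4.5"],
    ["8", "1.0", "0.0", "0.0", "1.0", "2.0"],
]
lookup = build_lookup(rows)
values = lookup["7/0.0/0.0/0.0"]
```

Because the key is plain text, "0.0" and "0" would count as different
coordinates; a tuple of floats could be used as the key instead if that
matters.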

Hope this helps,
Deborah



