[SciPy-User] More efficient way of sorting and filtering structured array.
Dharhas Pothina
Dharhas.Pothina at twdb.state.tx.us
Wed Aug 4 12:52:12 EDT 2010
Hi,
I have a structured array, intersection_points, that contains intersection points between two sets of lines (identified by LINENAME and FILENAME). For each unique combination of LINENAME and FILENAME there are several matches in the array, and I need only the closest one (i.e. the row with the smallest pdist). The following snippet works but is extremely slow. Is there a more efficient way to do this?
import numpy as np

# Start with an empty closest_points array
closest_points = np.zeros(0, dtype=pnt_dtype)
# loop through unique linenames of intersection_points
for linename in np.unique(intersection_points['LINENAME']):
    # loop through unique filenames of intersection_points
    for filename in np.unique(intersection_points['FILENAME']):
        # boolean masks over intersection_points matching linename and filename
        idx_line = intersection_points['LINENAME'] == linename
        idx_file = intersection_points['FILENAME'] == filename
        # temporary array of only the points matching linename AND filename
        tmp_points = intersection_points[idx_line & idx_file]
        # skip combinations that have no matches
        if tmp_points.size > 0:
            # sort tmp_points array by pdist
            idx_sort = np.argsort(tmp_points, order='pdist')
            # append the closest tmp_point to the end of closest_points
            closest_points = np.hstack((closest_points, tmp_points[idx_sort][0]))
For reference, pnt_dtype is:
pnt_dtype = np.dtype([('lon', 'f8'), ('lat', 'f8'), ('x', 'f8'), ('y', 'f8'),
                      ('FILENAME', 'S50'),
                      ('FILE_NUM', 'i4'),
                      ('SSID', 'i8'),
                      ('Lidx', 'i4'),
                      ('pdist', 'f8'),
                      ('LINENAME', 'S50'),
                      ('HE_Code', 'i4'),
                      ('PairNum', 'i4'),
                      ('dist', 'S50')
                      ])
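One possible direction, given here only as a sketch: sort the whole array once by (LINENAME, FILENAME, pdist) with np.lexsort and keep the first row of each group, instead of masking the full array for every name pair. The function name closest_per_group and the small three-field test array are hypothetical, made up for illustration. Rows with equal pdist within a group may be resolved differently than by np.argsort(..., order='pdist').

```python
import numpy as np

def closest_per_group(points, keys=('LINENAME', 'FILENAME'), dist='pdist'):
    """Return, for each unique combination of `keys`, the row with the smallest `dist`."""
    if points.size == 0:
        return points.copy()
    # np.lexsort uses the LAST key as the primary sort key, so this orders
    # rows by LINENAME, then FILENAME, then pdist (ascending).
    sort_keys = (points[dist],) + tuple(points[k] for k in reversed(keys))
    s = points[np.lexsort(sort_keys)]
    # A row starts a new group when any key differs from the previous row.
    changed = np.zeros(s.size - 1, dtype=bool)
    for k in keys:
        changed |= s[k][1:] != s[k][:-1]
    first = np.ones(s.size, dtype=bool)
    first[1:] = changed
    # The first row of each group has the smallest pdist in that group.
    return s[first]

# Hypothetical toy data using only the three fields the grouping needs.
toy_dtype = np.dtype([('LINENAME', 'S50'), ('FILENAME', 'S50'), ('pdist', 'f8')])
toy_points = np.array([(b'A', b'f1', 2.0),
                       (b'A', b'f1', 0.5),
                       (b'B', b'f1', 1.0),
                       (b'A', b'f2', 3.0)], dtype=toy_dtype)
toy_closest = closest_per_group(toy_points)
```

With the real data this would be called as closest_points = closest_per_group(intersection_points). The result comes out ordered by LINENAME and then FILENAME, which is the same order the nested loops over np.unique produce.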
thanks,
- dharhas