[SciPy-User] More efficient way of sorting and filtering structured array.

Dharhas Pothina Dharhas.Pothina at twdb.state.tx.us
Wed Aug 4 12:52:12 EDT 2010


Hi,

I have a structured array that contains intersection points between two sets of lines (identified by LINENAME and FILENAME). For each unique combination of LINENAME and FILENAME there are several matches in the array intersection_points, and I need only the closest one (i.e. the row with the smallest value in the pdist field). The following snippet works but is extremely slow, because it rescans the whole array for every LINENAME/FILENAME pair. I was wondering if there is a more efficient way to do this.

import numpy as np

# start with an empty closest_points array of the right dtype
closest_points = np.zeros(0, dtype=pnt_dtype)
# loop through the unique linenames in intersection_points
for linename in np.unique(intersection_points['LINENAME']):
	# loop through the unique filenames in intersection_points
	for filename in np.unique(intersection_points['FILENAME']):
		# boolean masks for rows matching this linename and this filename
		idx_line = intersection_points['LINENAME'] == linename
		idx_file = intersection_points['FILENAME'] == filename
		# keep only the rows that match BOTH linename AND filename
		tmp_points = intersection_points[idx_line & idx_file]
		# skip combinations that have no intersection points
		if tmp_points.size > 0:
			# sort tmp_points by pdist (ascending)
			idx_sort = np.argsort(tmp_points, order='pdist')
			# append the closest point (smallest pdist) to closest_points
			closest_points = np.hstack((closest_points, tmp_points[idx_sort][0]))


For reference, pnt_dtype is:

pnt_dtype = np.dtype([('lon','f8'),('lat','f8'),('x','f8'),('y','f8'),
		('FILENAME','S50'),
		('FILE_NUM','i4'),
		('SSID','i8'), 
		('Lidx', 'i4'),
		('pdist', 'f8'),
		('LINENAME','S50'),
		('HE_Code','i4'),
		('PairNum','i4'),
		('dist', 'S50')
		])
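
One direction that might avoid the nested loops is to sort the whole array once by (LINENAME, FILENAME, pdist) and then keep the first row of each group. The following is only a minimal sketch under that assumption, not a tested solution for the real data; the helper name closest_per_pair is made up for illustration, and it relies only on np.lexsort and boolean indexing.

```python
import numpy as np

def closest_per_pair(points):
    """Hypothetical helper: return one row per (LINENAME, FILENAME) pair,
    namely the row with the smallest pdist."""
    if points.size == 0:
        return points.copy()
    # np.lexsort uses the LAST key as the primary key, so this sorts by
    # LINENAME, then FILENAME, then pdist (all ascending).
    order = np.lexsort((points['pdist'], points['FILENAME'], points['LINENAME']))
    sorted_pts = points[order]
    # A row starts a new group when its LINENAME or FILENAME differs from
    # the row before it; the very first row always starts a group.
    is_first = np.ones(sorted_pts.size, dtype=bool)
    is_first[1:] = (
        (sorted_pts['LINENAME'][1:] != sorted_pts['LINENAME'][:-1])
        | (sorted_pts['FILENAME'][1:] != sorted_pts['FILENAME'][:-1])
    )
    # Within each group pdist is ascending, so the first row is the closest.
    return sorted_pts[is_first]
```

With the arrays above this would be called as closest_points = closest_per_pair(intersection_points). The result comes out ordered by LINENAME and then FILENAME, which is the same order the nested loops produce, and the work is one sort plus a couple of vectorized comparisons instead of a full scan per pair.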



thanks,

- dharhas



