looping and searching in numpy array
Peter Otten
__peter__ at web.de
Thu Mar 10 08:02:25 EST 2016
Heli wrote:
> Dear all,
>
> I need to loop over a numpy array and then do the following search. The
> following is taking almost 60(s) for an array (npArray1 and npArray2 in
> the example below) with around 300K values.
>
>
> for id in np.nditer(npArray1):
>     newId=(np.where(npArray2==id))[0][0]
>
>
> Is there any way I can make the above faster? I need to run the script
> above on much bigger arrays (50M). Please note that my two numpy arrays in
> the lines above, npArray1 and npArray2 are not necessarily the same size,
> but they are both 1d.
You mean you are looking for the index of the first occurrence in npArray2
for every value of npArray1?
I don't know how to do this in numpy (I'm not an expert), but even basic
Python might be acceptable:
    lookup = {}
    for i, v in enumerate(npArray2):
        if v not in lookup:
            lookup[v] = i
    for v in npArray1:
        print(lookup.get(v, "<not found>"))
That way you iterate over npArray2 once (in Python) instead of
2*len(npArray1) times (in C).
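For comparison, here is a rough sketch of how the same first-occurrence lookup might be done without a Python-level loop, using np.unique(..., return_index=True) together with np.searchsorted. This swaps the dict for a sort, so it is a different technique from the one above. The function name first_indices and the missing=-1 sentinel are made up for illustration, and the sketch assumes a non-empty 1d haystack; I have not timed it on arrays of the sizes mentioned.

```python
import numpy as np

def first_indices(haystack, needles, missing=-1):
    """For each value in needles, return the index of its first
    occurrence in haystack, or `missing` if it does not occur.

    Hypothetical helper; assumes haystack is a non-empty 1d array.
    """
    # Sorted unique values, plus the index of each value's first occurrence.
    uniq, first = np.unique(haystack, return_index=True)
    # Insertion position of every needle in the sorted unique values.
    pos = np.searchsorted(uniq, needles)
    # Needles larger than every value get position len(uniq); clip those
    # so the lookup below stays in bounds.
    pos = np.minimum(pos, len(uniq) - 1)
    # A needle is present only if the value at its position matches it.
    found = uniq[pos] == needles
    return np.where(found, first[pos], missing)

if __name__ == "__main__":
    npArray2 = np.array([5, 3, 5, 7, 3])
    npArray1 = np.array([3, 7, 9, 5, 1])
    print(first_indices(npArray2, npArray1))
```

Unlike the original loop, which raises an IndexError when a value is absent, this returns the sentinel for values that do not occur in npArray2.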