looping and searching in numpy array

Peter Otten __peter__ at web.de
Thu Mar 10 08:02:25 EST 2016


Heli wrote:

> Dear all,
> 
> I need to loop over a numpy array and run the search below for each
> element. It currently takes almost 60 s for arrays (npArray1 and npArray2
> in the example below) with around 300K values.
> 
> 
> for id in np.nditer(npArray1):
>     newId = np.where(npArray2 == id)[0][0]
> 
> 
> Is there any way I can make the above faster? I need to run the script
> above on much bigger arrays (50M values). Please note that my two numpy
> arrays in the lines above, npArray1 and npArray2, are not necessarily the
> same size, but they are both 1d.

You mean you are looking for the index of the first occurrence in npArray2 
of every value in npArray1?

I don't know how to do this in numpy (I'm not an expert), but even basic 
Python might be acceptable:

lookup = {}
for i, v in enumerate(npArray2):
    if v not in lookup:
        lookup[v] = i

for v in npArray1:
    print(lookup.get(v, "<not found>"))

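That way you iterate over npArray2 once (in Python) instead of 
2*len(npArray1) times (in C): every npArray2==id comparison scans the whole 
array, and np.where() then scans the resulting boolean array.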
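
The same first-occurrence lookup can probably be done in numpy alone. The 
following is only a sketch, not measured against the real data: the helper 
name first_index and the -1 "not found" marker are made up for illustration. 
It relies on np.unique(..., return_index=True), which returns the sorted 
distinct values together with the index of each value's first occurrence, 
and on np.searchsorted() to locate every value of npArray1 among them.

```python
import numpy as np

def first_index(haystack, needles, missing=-1):
    """Index of the first occurrence in haystack of each value in needles.

    Hypothetical helper; values that do not occur get `missing`.
    Both arguments are assumed to be 1d.
    """
    haystack = np.asarray(haystack)
    needles = np.asarray(needles)
    # Sorted distinct values and the index of each one's first occurrence.
    values, first = np.unique(haystack, return_index=True)
    if values.size == 0:
        return np.full(needles.shape, missing, dtype=np.intp)
    # Binary search for every needle in the sorted distinct values.
    pos = np.searchsorted(values, needles)
    # Needles larger than every value land one past the end; clamp them.
    pos = np.clip(pos, 0, values.size - 1)
    # A needle is present only if the value at its slot matches exactly.
    found = values[pos] == needles
    return np.where(found, first[pos], missing)

npArray2 = np.array([5, 3, 5, 7, 3])
npArray1 = np.array([3, 7, 9, 5])
print(first_index(npArray2, npArray1))
```

This sorts npArray2 once and then does a binary search per value of 
npArray1, so it needs extra memory for the sorted copy and the index array. 
Whether it beats the dict for 50M values is something you would have to 
time.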



