[SciPy-user] Equivalent to 'match' function in R?

Wes McKinney wesmckinn at gmail.com
Thu Jul 24 11:12:12 EDT 2008


I did a 'super naive' version of this:

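import numpy as np

def match(a, b):
    # Map each value in b to its position (for duplicates, the last one wins)
    bmap = {v: i for i, v in enumerate(b)}
    res = np.empty(len(a))  # float array, so missing values can be NaN
    for i, val in enumerate(a):
        res[i] = bmap.get(val, np.nan)
    return res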

It runs pretty slowly on a test case, matching arange(20000) against a
shuffled version of itself:

In [28]: timeit match(a, b)
10 loops, best of 3: 49.9 ms per loop
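
For anyone who wants to reproduce the test case, here is a rough sketch of the setup. The seed, the use of the standard timeit module instead of IPython's timeit, and the repeat counts are assumptions for illustration, and the numbers will differ by machine.

```python
import timeit
import numpy as np

def match(a, b):
    # Same dict-based lookup as in the message above.
    bmap = {v: i for i, v in enumerate(b)}
    res = np.empty(len(a))
    for i, val in enumerate(a):
        res[i] = bmap.get(val, np.nan)
    return res

rng = np.random.default_rng(0)   # the seed is an arbitrary choice
a = np.arange(20000)
b = rng.permutation(a)           # shuffled copy of a

res = match(a, b)
# Every value of a occurs in b, so indexing b with the result must give a back.
assert np.array_equal(b[res.astype(int)], a)

# Best of 3 runs of 10 calls each, similar in spirit to IPython's timeit.
best = min(timeit.repeat(lambda: match(a, b), number=10, repeat=3)) / 10
print("%.1f ms per call" % (best * 1e3))
```

The correctness check only works here because every element of a is present in b, so the result contains no NaN.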

A slightly less naive implementation of the same idea, written entirely in
Cython and working only with ndarrays:

In [30]: timeit cmatch(a, b)
100 loops, best of 3: 10.3 ms per loop

I don't know how this compares with R's performance; I assume it's roughly
comparable. The one thing that is kind of broken is the handling of values
not found in the target array: R translates them to NA, but here the NaNs
get translated to 0 when cast to numpy ints, and you can't index an array
with an array containing NaNs anyhow. Hmm.
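
One possible way around the NaN problem, offered as a sketch and not as what the Cython version does: return integer positions and use -1 as a "not found" sentinel, then mask before indexing. The name match_sorted and the missing parameter are made up for illustration. The lookup uses argsort plus searchsorted instead of a dict, so it assumes the values are sortable.

```python
import numpy as np

def match_sorted(a, b, missing=-1):
    """Position in b of each element of a, or `missing` where absent."""
    a = np.asarray(a)
    b = np.asarray(b)
    if b.size == 0:
        return np.full(a.shape, missing, dtype=np.intp)
    order = np.argsort(b, kind="stable")   # stable: first occurrence sorts first
    pos = np.searchsorted(b[order], a)     # leftmost insertion point in sorted b
    pos = np.clip(pos, 0, len(b) - 1)      # keep candidate indices in range
    idx = order[pos]                       # candidate position in the original b
    found = b[idx] == a                    # False where a's value is not in b
    return np.where(found, idx, missing)

# Data from the R example quoted below.
all_data = np.array([[1, 2, 3, 4, 5], [12, 19, 27, 38, 51]], dtype=float).T
sub_data = np.array([[3, 5, 1], [np.nan, np.nan, np.nan]]).T

idx = match_sorted(sub_data[:, 0], all_data[:, 0])
found = idx >= 0                           # mask out the sentinel before indexing
sub_data[found, 1] = all_data[idx[found], 1]
print(idx)
print(sub_data)
```

Note that -1 is itself a valid NumPy index (the last element), so the mask is not optional: indexing with the raw result would silently pick up the wrong row for unmatched values.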

On Thu, Jul 24, 2008 at 9:49 AM, Arnar Flatberg <arnar.flatberg at gmail.com>
wrote:

>
>
> On Thu, Jul 24, 2008 at 3:00 PM, Wes McKinney <wesmckinn at gmail.com> wrote:
>
>> Hi all,
>>
>> I've been working with users lately who are transitioning from using R to
>> NumPy/Scipy. Some are accustomed to using the 'match' function, for
>> example:
>>
>> > allData <- cbind(c(1,2,3,4,5), c(12, 19, 27, 38, 51))
>> > allData
>>      [,1] [,2]
>> [1,]    1   12
>> [2,]    2   19
>> [3,]    3   27
>> [4,]    4   38
>> [5,]    5   51
>> > subData <- cbind(c(3,5,1), c(NA, NA, NA))
>> > subData
>>      [,1] [,2]
>> [1,]    3   NA
>> [2,]    5   NA
>> [3,]    1   NA
>>
>
> What about using `intersect` combined with `where`?
>
> all_data = np.array([[1,2,3,4,5], [12,19,27,38,51]]).T
> sub_data = np.array([[3,5,1], [np.nan,np.nan,np.nan]]).T
> match_ind = np.where(np.intersect1d(sub_data[:,0], all_data[:,0]))
> sub_data[:,1] = all_data[match_ind,1]
>
> It may not be pretty or the best approach for solving the above examples,
> but it behaves somewhat like R's match.
>
> Arnar
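
A variant of the intersect idea that lines the rows up explicitly, as a sketch only: it assumes NumPy 1.15 or later, where np.intersect1d accepts return_indices=True, and it assumes the keys are unique within each array. The variable names common, sub_ind and all_ind are chosen here for illustration.

```python
import numpy as np

all_data = np.array([[1, 2, 3, 4, 5], [12, 19, 27, 38, 51]], dtype=float).T
sub_data = np.array([[3, 5, 1], [np.nan, np.nan, np.nan]]).T

# common:  sorted values present in both key columns
# sub_ind: where each common value sits in sub_data[:, 0]
# all_ind: where each common value sits in all_data[:, 0]
common, sub_ind, all_ind = np.intersect1d(
    sub_data[:, 0], all_data[:, 0], return_indices=True
)

# Copy the second column across for the matched keys only;
# keys missing from all_data keep their NaN.
sub_data[sub_ind, 1] = all_data[all_ind, 1]
print(sub_data)
```

Unlike R's match, this does not return one position per element of the first argument; it only pairs up the keys that appear on both sides.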