[Numpy-discussion] Problem migrating PDL's index() into NumPy
Miroslav Sedivy
miroslav.sedivy at weather-consult.com
Wed Mar 17 09:36:59 EDT 2010
josef.pktd at gmail.com wrote:
> On Wed, Mar 17, 2010 at 7:12 AM, Miroslav Sedivy wrote:
>> There are two 2D arrays with dimensions: A[10000,1000] and B[10000,100].
>> The first dimension of both arrays corresponds to a list of 10000 objects.
>>
>> The array A contains for each of 10000 objects 1000 integer values
>> between 0 and 99, so that for each of 10000 objects a corresponding
>> value can be found in the array B.
>>
>> I need a new array C[10000,1000] with values from B the following way:
>>
>> for x in range(10000):
>>     for y in range(1000):
>>         C[x,y] = B[x,A[x,y]]
>>
>> In Perl's PDL, this can be done with $C = $B->index($A)
>>
>> If in NumPy I do C = B[A], then I do not get a [10000,1000] 2D array,
>> but rather a [10000,1000,1000] 3D array, in which I can find the correct
>> values on the following positions:
>>
>> for x in range(10000):
>>     for y in range(1000):
>>         C[x,y,y]
>>
>> which may seem nice, but it needs 1000 times more memory and very
>> probably 1000 times more time to calculate... Impossible with such large
>> arrays... :-(
>>
>> Could anyone help me, please?
>
> try
> C = B[:,A]
> or
> C = B[np.arange(1000)[:,None], A]
>
> I think, one of the two (or both) should work (but no time for trying it myself)
> Josef
Thank you, Josef, for responding.
Neither of them works correctly. The first one works only as B.T[:,A] and
gives me the same _3D_ array as B[A].T
The second one tells me: ValueError: shape mismatch: objects cannot be
broadcast to a single shape
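
The shape mismatch is consistent with the row index having the wrong length: np.arange(1000)[:,None] has shape (1000, 1), which cannot broadcast against an A of shape (10000, 1000). A minimal sketch of the same idea with a row index that matches the first dimension, assuming the layout stated in the original question (A is (n, m), B is (n, k)) and using small stand-in sizes and random data purely for illustration:

```python
import numpy as np

# Small stand-ins for 10000, 1000 and 100 (illustrative sizes only).
n, m, k = 6, 4, 3
rng = np.random.default_rng(0)
A = rng.integers(0, k, size=(n, m))   # indices into the second axis of B
B = rng.random((n, k))

# Row index of shape (n, 1) broadcasts against A's shape (n, m),
# so each element picks B[x, A[x, y]].
rows = np.arange(n)[:, None]
C = B[rows, A]

# Reference: the explicit double loop from the question.
C_loop = np.empty((n, m), dtype=B.dtype)
for x in range(n):
    for y in range(m):
        C_loop[x, y] = B[x, A[x, y]]
```

Here C has shape (n, m), not a 3D array, because the two index arrays are broadcast together rather than combined as an outer product.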
Now I am using an iteration over all 10000 elements:
C = np.empty_like(A)
for i in range(10000):
    C[:,i] = B[:,i][A[:,i]]
which works perfectly. It is just a real pain to see such a for-loop in
the NumPy world :-(
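
For comparison, the column-wise loop above can probably be replaced by a single indexing expression. This is only a sketch, assuming the transposed layout that the loop implies (A is (m, n), B is (k, n), with n playing the role of 10000); the sizes and random data are made up for illustration. np.take_along_axis is available in NumPy 1.15 and later, so it postdates this thread.

```python
import numpy as np

# Small stand-ins for 100, 1000 and 10000 (illustrative sizes only).
k, m, n = 3, 4, 6
rng = np.random.default_rng(1)
A = rng.integers(0, k, size=(m, n))   # indices into the first axis of B
B = rng.random((k, n))

# The loop from the message, with C allocated in B's dtype.
C_loop = np.empty((m, n), dtype=B.dtype)
for i in range(n):
    C_loop[:, i] = B[:, i][A[:, i]]

# Advanced indexing: the column index of shape (n,) broadcasts against A's (m, n).
C_fancy = B[A, np.arange(n)]

# Same result via take_along_axis (NumPy >= 1.15).
C_take = np.take_along_axis(B, A, axis=0)
```

One detail worth checking in the loop version: np.empty_like(A) gives C the integer dtype of A, so if B holds floats, allocating C with B's dtype avoids truncating the values.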
Thanks,
Miroslav