[Numpy-discussion] indexing, searchsorting, ...

Neil Martinsen-Burrell nmb at wartburg.edu
Tue Jan 26 11:46:00 EST 2010


On 2010-01-26 10:22 , Jan Strube wrote:
> Dear Josef and Keith,
>
> thank you both for your suggestions. I think intersect would be what I
> want for it makes clean code.
> I have, however, spotted the problem:
> I was mistakenly under the assumption that random_integers returns
> unique entries, which is of course not guaranteed, so that the random
> sample contained duplicate entries.
> That's why the numpy methods returned results inconsistent with python 'in'.
> I'll have to be a bit smarter in the generation of the random sample.
> Good thing I try to do things in two different ways. (Sometimes it is,
> anyway...)

You probably know this, but random.sample in Python's standard library
samples without replacement:

In [1]: import random

In [2]: random.sample([1,2,3],2)
Out[2]: [2, 3]

In [5]: random.sample([1,2,3],3)
Out[5]: [1, 2, 3]

In [7]: random.sample([1,2,3],4)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/Users/nmb/<ipython console> in <module>()

/Library/Frameworks/Python.framework/Versions/6.0.0/lib/python2.6/random.pyc in sample(self, population, k)
     314         n = len(population)
     315         if not 0 <= k <= n:
--> 316             raise ValueError, "sample larger than population"
     317         random = self.random
     318         _int = int

ValueError: sample larger than population
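
To make the contrast with drawing *with* replacement explicit, here is a small sketch written for Python 3 (the session above is Python 2.6). The population size and the variable names are only illustrative and not taken from Jan's code.

```python
import random

population = list(range(10))

# With replacement: 25 draws from only 10 values must repeat something
# (pigeonhole), which is the kind of duplication that caused the mismatch.
with_replacement = [random.choice(population) for _ in range(25)]
print(len(set(with_replacement)), "distinct values in", len(with_replacement), "draws")

# random.sample draws without replacement, so each element of the
# population is picked at most once.
without_replacement = random.sample(population, 10)
print(sorted(without_replacement) == population)

# Asking for more items than the population holds raises ValueError.
error_message = None
try:
    random.sample(population, 11)
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

The exact wording of the error message differs between Python versions, so it is safer to catch ValueError than to match the text.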

In [9]: import numpy as np
In [10]: A = np.arange(1000)**(3/2.)

In [11]: A[random.sample(range(A.shape[0]),25)]
Out[11]:
array([ 12618.24425188,  30538.0882342 ,  18361.74109392,    925.94546276,
          2935.15331797,   4000.37598233,  21826.1206127 ,   2618.9692629 ,
           868.08467329,     52.38320341,  12063.64687812,  29930.60881439,
         12236.06517635,  10221.89370909,   2414.9534157 ,  13039.6113439 ,
         22967.67537214,  15140.04385727,   2639.67251757,  26461.80402013,
          3218.73142713,  15963.71209963,  11755.35677893,  11551.31295568,
         29142.37675619])
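
If you would rather stay inside NumPy, something along these lines should also work. This is only a sketch: it assumes a NumPy recent enough to have the Generator API (np.random.default_rng, 1.17 or later), and the names idx, idx_perm, subset and subset_perm are made up for the example.

```python
import numpy as np

A = np.arange(1000) ** (3 / 2.)

# Draw 25 distinct indices; replace=False means no index is picked twice.
rng = np.random.default_rng()
idx = rng.choice(A.shape[0], size=25, replace=False)
subset = A[idx]

# Alternative that also works on older NumPy: shuffle all indices
# and keep the first 25 of them.
idx_perm = np.random.permutation(A.shape[0])[:25]
subset_perm = A[idx_perm]

print(subset.shape, subset_perm.shape)
```

Either way the indices are unique, so the selected values can be compared against Python's 'in' without surprises from repeated entries.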

-Neil


