[Numpy-discussion] numpy.random.shuffle

Wed Nov 22 13:21:52 EST 2006

Robert Kern wrote:
> Tim Hochberg wrote:
>> Robert Kern wrote:
>>> One possibility is to check if the object is an ndarray (or subclass) and use
>>> .copy() if so; otherwise, use the current implementation and hope that you
>>> didn't pass it a Numeric or numarray array (or some other view-based object).
>> I think I would invert this test and instead check if the object is a 
>> Python list and *not* copy in that case. Otherwise, use copy.copy to 
>> copy the object whatever it is. This looks like it would be more robust 
>> in that it would work in all sensible cases, and just be a tad slower in 
>> some of them.
>>     
>
> I don't want to assume that the only two sequence types are lists and arrays.
> The problem with using copy.copy() on non-arrays is that it, well, makes copies
> of the elements. The objects in the shuffled sequence are not the same objects
> before and after the shuffling. I consider that to be a violation of the spec.
>   
OK, makes sense.
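
To make the identity point concrete, here is a minimal sketch, assuming a
Fisher-Yates style swap loop that copies every element with copy.copy()
before assigning it. The Item class and shuffle_copying_elements() are
hypothetical names used only for illustration; this is not numpy's
implementation.

```python
import copy
import random


class Item:
    """Hypothetical element type; only object identity matters here."""

    def __init__(self, label):
        self.label = label


def shuffle_copying_elements(x, rng):
    # Fisher-Yates shuffle that copies each element before swapping,
    # the way a copy.copy()-based workaround for views would.
    for i in range(len(x) - 1, 0, -1):
        j = rng.randint(0, i)
        x[i], x[j] = copy.copy(x[j]), copy.copy(x[i])


items = [Item(n) for n in range(5)]
original = list(items)  # keep references to the original objects

shuffle_copying_elements(items, random.Random(0))

# The labels are still a permutation of 0..4 ...
same_labels = sorted(obj.label for obj in items) == list(range(5))
# ... but every slot from index 1 upward now holds a copy, so at most
# one of the original objects is still in the sequence.
survivors = [obj for obj in items if any(obj is o for o in original)]
print(same_labels, len(survivors))
```

The values come out shuffled, but the sequence no longer contains the
objects that were put into it, which is the spec violation described above.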
> Views are rare outside of numpy/Numeric/numarray, partially because Guido
> considers them to be evil. I'm beginning to see why.
>   
They are enough rope to shoot yourself in the foot and that's for sure.
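
For anyone who hasn't been bitten yet, a small sketch of how views break
the usual tuple swap on a 2D array, next to the same swap on a list of
lists. The variable names are just for illustration.

```python
import numpy as np

b = np.arange(4).reshape(2, 2)  # [[0, 1], [2, 3]]

# Naive tuple swap: naive[0] and naive[1] are views into the same buffer,
# so the first assignment overwrites the data the second one reads.
naive = b.copy()
naive[0], naive[1] = naive[1], naive[0]

# Copying the right-hand side first gives independent rows to assign from.
fixed = b.copy()
fixed[0], fixed[1] = fixed[1].copy(), fixed[0].copy()

# A list of lists holds references, not views, so the plain swap works.
rows = [[0, 1], [2, 3]]
rows[0], rows[1] = rows[1], rows[0]

print(naive)  # both rows are now [2, 3]
print(fixed)  # rows exchanged
print(rows)   # rows exchanged
```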

>> Another possible refinement / complication would be to special case 1D 
>> arrays so that they run fastish.
>>
>> A third possibility involves rewriting this in this form:
>>
>>     indices = arange(len(x))
>>     _shuffle_core(indices) # This just does what current shuffle now does
>>     x[:] = take(x, indices, 0)
>>     
>
> That's problematic since the elements all turn into numpy scalar objects:
>   
[SNIP]
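
As a rough illustration of the objection (my own sketch, not the example
that was snipped), assuming a plain list of Python ints and a fixed
permutation in place of a random one:

```python
import numpy as np

x = [9, 8, 7, 6]
indices = np.array([2, 0, 3, 1])  # fixed permutation, for repeatability

# take() converts the list to an integer array first, so the values that
# get written back into the list are numpy scalars, not Python ints.
x[:] = np.take(x, indices, 0)

print([int(v) for v in x])
print(type(x[0]) is int, isinstance(x[0], np.integer))
```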

I believe that this one is fixable:

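    import numpy as np

    def shuffle_take(x):
        # Shuffle an index array rather than x itself, so no views get swapped.
        indices = np.arange(len(x))
        np.random.shuffle(indices)
        # dtype=object keeps list elements as the original Python objects
        # instead of converting them to numpy scalars.
        source = np.asarray(x, dtype=object)
        x[:] = source[indices]
        return x

    a = list(range(9, -1, -1))
    shuffle_take(a)
    print(a)
    print(type(a[0]), a[0])
    b = np.arange(10).reshape(5, 2)
    print(b)
    shuffle_take(b)
    print(b)

====>

    [0, 8, 1, 4, 9, 5, 7, 3, 6, 2]
    <class 'int'> 0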
    [[0 1]
     [2 3]
     [4 5]
     [6 7]
     [8 9]]
    [[8 9]
     [6 7]
     [0 1]
     [2 3]
     [4 5]]

I think that in real life you'd probably want to convert to an array 
of type object only if 'x' is not an array to begin with, but I'm keeping it 
simple for right now.
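
A sketch of what that refinement might look like, assuming an isinstance
check is good enough to tell arrays from everything else. The name
shuffle_take_checked is hypothetical, and this is untested against
Numeric or numarray arrays.

```python
import numpy as np


def shuffle_take_checked(x):
    """Shuffle x in place along its first axis using an index permutation."""
    indices = np.arange(len(x))
    np.random.shuffle(indices)
    if isinstance(x, np.ndarray):
        # Fancy indexing returns a new array, so the right-hand side is
        # fully built before anything is written back into x.
        source = x
    else:
        # Object dtype keeps the original Python objects as elements.
        source = np.asarray(x, dtype=object)
    x[:] = source[indices]
    return x


a = list(range(9, -1, -1))
shuffle_take_checked(a)
print(a, type(a[0]))

b = np.arange(10).reshape(5, 2)
shuffle_take_checked(b)
print(b)
```

One caveat with the non-array branch: np.asarray(x, dtype=object) will
turn a list of equal-length lists into a 2D object array, so nested
sequences would still need separate handling.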

-tim