Trying to use sets for random selection, but the pop() method returns items in order

Carl Banks pavlovevidence at gmail.com
Wed Jul 1 18:28:10 EDT 2009


On Jul 1, 2:34 pm, Mario Garcia <Mario... at gmail.com> wrote:
> I'm trying to use sets to do statistics on a data set.
> I want to select a random 70% of the records from a list. I thought
> sets were a good idea, so I tested it this way:
>
> c = set(range(1000))
> for d in range(1000):
>      print c.pop()
>
> I was hoping to see a printout of randomly selected numbers from 1 to
> 1000, but I got an ordered count from 1 to 1000.
> I also tried using a dictionary, with keys from 1 to 10, and also got
> the keys in order.
>
> Im using:
>  Python 2.5.2 |EPD 2.5.2001| (r252:60911, Aug  4 2008, 13:45:20)
>  [GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
>
> Examples in the documentation seem to work, but I can't make it work.
> Can someone give me a hint on what's going on?

The keys in a dict or set are not in random order; (more or less)
they are in order of hash code modulo the size of the hash table,
neglecting the effect of hash collisions.  The hash code of an integer
happens to be the integer itself, so a dict or set storing a sequence
of integers will often end up with its keys in order, although that is
not guaranteed.
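
A quick way to see this for yourself, as a minimal sketch: it assumes
CPython, where small non-negative integers hash to themselves, and it
is written to run on both Python 2 and 3.  The name `numbers` is just
for illustration.

```python
# In CPython, small non-negative ints hash to themselves.
# This is an implementation detail, not a language guarantee.
for i in range(10):
    assert hash(i) == i

# Iteration follows the hash table's slot order, so a set of
# consecutive small ints often comes out sorted.  Don't rely on it.
numbers = set(range(1000))
print(list(numbers)[:10])
```

Whatever order gets printed, it comes from the table layout, not from
any random number generator.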

Point is, it's unsafe to rely on *any* ordering behavior in a dict or
set, even on the order looking relatively random.

Instead, call random.shuffle() on the list, and iterate through that
to get the elements in random order.
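
For the 70% selection in the original question, the shuffle approach
might look like the sketch below.  The helper name `random_subset` and
its `percent` parameter are made up for this example, and it copies
the list first on the assumption that the caller wants the original
left untouched.

```python
import random

def random_subset(records, percent=70):
    """Return a random `percent` of `records`, leaving the input unchanged."""
    shuffled = list(records)       # copy; random.shuffle works in place
    random.shuffle(shuffled)
    count = len(shuffled) * percent // 100   # integer math avoids float rounding
    return shuffled[:count]

records = list(range(1000))
selected = random_subset(records)
print(len(selected))
```

If you only need the subset and not the full shuffled list,
random.sample(records, 700) from the standard library is an
alternative that picks 700 distinct elements directly.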


Carl Banks


