Random Drawing Simulation -- performance issue

Simon Forman rogue_pedro at yahoo.com
Tue Sep 12 18:23:51 EDT 2006


Brendon Towle wrote:
> I need to simulate scenarios like the following: "You have a deck of
> 3 orange cards, 5 yellow cards, and 2 blue cards. You draw a card,
> replace it, and repeat N times."
>
> So, I wrote the following code, which works, but it seems quite slow
> to me. Can anyone point out some obvious thing that I'm doing
> inefficiently? Or, is this more or less as good as it gets?
>
> For reference, my typical numbers look like this:
>
>    2 <= len(population) <= 7
>    4 <= len(mapping) <= 50
>    10 <= count <= 1000000
>
> B.
>
> <code>
> #!/usr/bin/env python
>
> import random
>
> def randomDrawing(count, population):
>      """Simulates drawing <count> items from <population>, with
> replacement.
>      population is a list of lists: [[count1, type1], [count2,
> type2], ...]
>
>      Typical examples:
>      >>>randomDrawing(100, [[3, 'orange'], [5, 'yellow'], [2, 'blue']])
>      [[28, 'orange'], [57, 'yellow'], [15, 'blue']]
>
>      >>>randomDrawing(100000, [[3, 'orange'], [5, 'yellow'], [2,
> 'blue']])
>      [[29923, 'orange'], [50208, 'yellow'], [19869, 'blue']]
>
>      """
>      res = [[0, item[1]] for item in population]
>      mapping = []
>      for i in xrange(len(population)):
>          mapping.extend([i]*population[i][0])
>      for i in xrange(count):
>          index = random.choice(mapping)
>          res[index][0] += 1
>      return res
>
> </code>
>
> --
> Brendon Towle, PhD
> Cognitive Scientist
> +1-412-690-2442x127
> Carnegie Learning, Inc.
> The Cognitive Tutor Company ®
> Helping over 375,000 students in 1000 school districts succeed in math.

I got nearly a 2x speed up with this variant:

def randomDrawing3(count, population):
    res = [[0, item[1]] for item in population]
    mapping = []
    for i in xrange(len(population)):
        mapping.extend([i]*population[i][0])

    n = len(mapping)
    for i in xrange(count):
        index = int(n * random.random())
        res[mapping[index]][0] += 1

    return res


Apparently "int(n * random.random())"  is faster than random.choice()
or random.randint()

sforman at garbage:~ $ python -mtimeit -s'import random; n=10' 'int(n *
random.random())'
100000 loops, best of 3: 3.22 usec per loop

sforman at garbage:~ $ python -mtimeit -s'import random; n=10'
'random.randint(1, n)'
100000 loops, best of 3: 13 usec per loop

sforman at garbage:~ $ python -mtimeit -s'import random; n=range(10)'
'random.choice(n)'
100000 loops, best of 3: 6.07 usec per loop

(Note that the first and last examples produce values 0..9 while the
middle one produces 1..10)


I don't know for sure, but I think the random, uh, spread or whatever
will be the same for random() as for choice().  If it's important, you
should verify that.  ;-)

Peace,
~Simon




More information about the Python-list mailing list