benchmarks and questions for new python programmer

Mon May 17 01:48:47 EDT 2004

stormslayer wrote:
> Benchmarks (for the paramter settings in the file):
>
> C++builder 6.0: 3 seconds
> Perl (ActivePerl 5.6.1.638): 14 seconds
> Python (ActivePython 2.3.2 Build 232): 1 minute 13 seconds
>
> Note that I didn't fiddle overmuch with any of the programs -- each
> one took about 15 minutes to bang out on the keyboard.  I'm hoping
> that there is some obvious mistake in the way I used something in
> python to account for the speed differential.  It seems like a really,
> really nice language, but orders of magnitude slower than C/C++ and a
> 5x slowdown over Perl seems abusive...

It's pretty easy to speed up your code by a factor of more than 10.

First, you need to figure out where the time is going. Profiling is always a
Good Thing, but in this case, it was pretty obvious just by eyeballing the
code that the loop in init_all() was the likely culprit. To confirm this, I
commented out the call to one_choose(), and sure enough, this didn't reduce
the time much at all. So init_all() is where to concentrate our efforts.

Here's your init_all() code:

> def init_all():        # init globals
>     global pop, gold, standard
>     pop = []
>     gold = 0
>     for i in range(N):
>         if random.randint(1,2) == 1:
>             pop.append(1)
>             gold=gold+1
>         else: pop.append(2)
>     if gold>N/2: standard=1
>     else: standard=2        # if tie, silver wins

And here is a new version:

def init_all():        # init globals
    global pop, gold, standard
    pop = [2] * N
    gold = 0
    rand = random.random
    for i in xrange(N):
        if rand() < .5:
            pop[i] = 1
            gold += 1
    if gold>N/2: standard=1
    else: standard=2        # if tie, silver wins

What's different here?

* I used xrange(N) instead of range(N). This helps a little bit but not much
by itself.

* Instead of using append() repeatedly to extend the array, I preallocated
the entire array with the "pop = [2] * N" statement. This avoids repeated
memory allocation and helps a little bit. But still not enough to make a
huge difference.

* Finally, I tried tracing into the code in the debugger, and when it
stepped into the random.randint() funciton, I noticed that this is a fairly
complex little function. All you're using it for is to make a 50-50
decision, so we can bypass a lot of code by using random.random() intead.
This is where the big speedup comes from. Also, I grabbed a reference to
random.random outside the inner loop, which helps a little bit.

On my machine, your original code runs in 139 seconds. With these changes,
it runs in 10.9 seconds, which would be about 5.7 seconds on your faster
machine.

If that's not fast enough, you can get the psyco module from
http://psyco.sourceforge.net/ and add these two lines to the top of the
source file:

import psyco
psyco.full()

With that change, the code runs in 3.6 seconds, which would be about 1.9
seconds on your machine--considerably faster than the C++ Builder time.

Good enough? :-)

-Mike

> Python code:
>
> #!/usr/bin/env python
> import random
>
> pop=[]      # popluation of agents; initially empty list
> N = 10000     # population size
> ep = .05    # mutation rate
> its = 100   # number of times through the main loop
> gold=0      # initial number of gold adopters; (1-gold) = silver
> adopters
> standard=0  # either 1 for gold or 2 for silver
> shifts=0    # number of regime shifts
>
> def init_all():        # init globals
>     global pop, gold, standard
>     pop = []
>     gold = 0
>     for i in range(N):
>         if random.randint(1,2) == 1:
>             pop.append(1)
>             gold=gold+1
>         else: pop.append(2)
>     if gold>N/2: standard=1
>     else: standard=2        # if tie, silver wins
> #    print "Initial Population of Gold Users: ", gold
> # end function init_all
>
> def one_choose():
>     global pop, standard, gold, shifts
>     for i in range(its):
>         temp = random.randint(0,N-1)
>         tempval = pop[temp]
>         old_stand = standard
>         if random.random()<ep:
>             if random.random()<0.5:
>                 pop[temp]=1
>                 if tempval!=pop[temp]:
>                     gold=gold+1
>                     if gold>N/2: standard=1
>             else:
>                 pop[temp]=2
>                 if tempval!=pop[temp]:
>                     gold=gold-1
>                     if gold<N/2: standard=2
>         if standard!=old_stand: shifts=shifts+1 # check for regime
> shift after each agent chooses anew
>         else:
>             if gold>N/2:
>                 if pop[temp]!=1:
>                     pop[temp]=1
>                     gold=gold+1
>                     if gold>N/2: standard=1
>             else:
>                 if pop[temp]!=2:
>                     pop[temp]=2
>                     gold=gold-1
>                     if gold<N/2: standard=2
>         if standard!=old_stand: shifts=shifts+1 # check for regime
> shift after each agent chooses anew
> #    print "Final Population of Gold Users: ", gold
> # end function one_choose
>
> # start main loop
>
>
> for i in range(1000):
>     init_all()
>     one_choose()
> print "Number of regime shifts: ", shifts