[Tutor] Optimizing a simple radioactive decay simulation

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Thu Dec 12 20:28:01 2002


On Thu, 12 Dec 2002, Tim Wilson wrote:

> This worked perfectly, but was quite slow.
>
> Since I'm interested in using simulations to teach science, I wonder if
> anyone has suggestions for optimizing this code.

Hi Tim,


Profiling time!  *grin*

    http://www.python.org/doc/lib/profile.html


The profiler is pretty sophisticated, but for the moment, let's use it's
simple "profile.run()" function directly on the main() function.



Here's what happens on my system:

###
>>> import decay
>>> import profile
>>> profile.run("decay.main()")
How many coins? 50000
Undecayed (heads): 50000
Undecayed (heads): 24948
Undecayed (heads): 12440
Undecayed (heads): 6276
Undecayed (heads): 3209
Undecayed (heads): 1605
Undecayed (heads): 769
Undecayed (heads): 374
Undecayed (heads): 191
Undecayed (heads): 97
Undecayed (heads): 47
Undecayed (heads): 20
Undecayed (heads): 10
Undecayed (heads): 4
Undecayed (heads): 2
Undecayed (heads): 2
Undecayed (heads): 2
Undecayed (heads): 1
Undecayed (heads): 1
Undecayed (heads): 1
Undecayed (heads): 1

21 half-lives required.
         450089 function calls in 13.060 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   13.060   13.060 <string>:1(?)
        1    1.490    1.490    1.820    1.820 decay.py:10(__init__)
       43    0.000    0.000    0.000    0.000 decay.py:16(__len__)
       21    0.380    0.018    0.380    0.018 decay.py:20(removeTails)
       21    1.080    0.051   10.860    0.517 decay.py:24(decay)
    50000    0.330    0.000    0.330    0.000 decay.py:32(__init__)
   100000    2.300    0.000    9.780    0.000 decay.py:36(flip)
        1    0.000    0.000   13.060   13.060 decay.py:40(main)
        1    0.000    0.000   13.060   13.060 profile:0(decay.main())
        0    0.000             0.000          profile:0(profiler)
   100000    1.750    0.000    1.750    0.000 random.py:153(random)
   100000    3.530    0.000    5.280    0.000 random.py:277(randrange)
   100000    2.200    0.000    7.480    0.000 random.py:316(randint)
###


It looks the system spends a significant amount of time doing random
number generation in the flip() method.  That's what we expect: it's doing
a flippin' lot of of flipping.  *grin*

One optimization that specifically applies to Python is to pull specific
functions out of modules so that they appear to be local.  Python can more
quickly access local variables in a function.

Whenever a name is used in a program, Python needs to search in the
following approximate order:

    1.  Local
    2.  Class
    3.  Module

(Nested scopes complicates this somewhat, but not by much.)


At the moment, 'random.randint' is being found at the module level, so
we can speed up random number generation somewhat by pulling the randint
function as a local variable:


######
class Coin:
    """Coin class for use in radioactive decay simulation."""

    def __init__(self, value):
        """Create a coin instance. 1 = heads, 2 = tails"""
        self.value = value


    def flip(self, randint=random.randint):
                                         ## randint is default
                                         ## parameter for speed
                                         ## optimization reasons
        """Flip the coin."""
        self.value = randint(0, 1)
######


[Time out!  The comment of the Coin class is wrong: self.value is between
0 and 1, not 1 and 2, so something has to give: either the code is wrong,
or the comment is wrong.  I'm betting on the comment.]



Let's see if that helps any:

###
...
17 half-lives required.
         449977 function calls in 12.470 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   12.460   12.460 <string>:1(?)
        1    1.340    1.340    1.740    1.740 decay.py:10(__init__)
       35    0.000    0.000    0.000    0.000 decay.py:16(__len__)
       17    0.360    0.021    0.360    0.021 decay.py:20(removeTails)
       17    1.430    0.084   10.360    0.609 decay.py:24(decay)
    50000    0.400    0.000    0.400    0.000 decay.py:32(__init__)
    99976    1.790    0.000    8.930    0.000 decay.py:36(flip)
        1    0.000    0.000   12.460   12.460 decay.py:43(main)
        1    0.010    0.010   12.470   12.470 profile:0(decay.main())
        0    0.000             0.000          profile:0(profiler)
    99976    1.700    0.000    1.700    0.000 random.py:153(random)
    99976    3.020    0.000    4.720    0.000 random.py:277(randrange)
    99976    2.420    0.000    7.140    0.000 random.py:316(randint)
###


It shaved a little bit off the runtime, but not too much.  Still, that's a
start.  We know now that we should concentrate our efforts on minimizing
the cost of flip(): it's called the most often, and takes up the majority
of time.


We should notice, also, that randrange() is showing up in our profiling
report.  What gives?  That's because randint(), underneath the surface, is
using randrange()!  Let's avoid the middle man, and directly use
randrange:

###
## in the Coin class definition:
    def flip(self, randrange=random.randrange):
                                         ## randrange is default
                                         ## parameter for speed
                                         ## optimization reasons
        """Flip the coin."""
        self.value = randrange(2)
###



Does that help?

###
16 half-lives required.
         350051 function calls in 11.010 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   11.010   11.010 <string>:1(?)
        1    1.270    1.270    1.640    1.640 decay.py:10(__init__)
       33    0.000    0.000    0.000    0.000 decay.py:16(__len__)
       16    0.370    0.023    0.370    0.023 decay.py:20(removeTails)
       16    1.430    0.089    8.990    0.562 decay.py:24(decay)
    50000    0.370    0.000    0.370    0.000 decay.py:32(__init__)
    99994    2.590    0.000    7.560    0.000 decay.py:36(flip)
        1    0.010    0.010   11.010   11.010 decay.py:43(main)
        1    0.000    0.000   11.010   11.010 profile:0(decay.main())
        0    0.000             0.000          profile:0(profiler)
    99994    1.840    0.000    1.840    0.000 random.py:153(random)
    99994    3.130    0.000    4.970    0.000 random.py:277(randrange)
###


A little better.  And so on: we'll have to do little tweaks, compare
performance using profile, and tweak again.

I'll stop for now; Let's let someone else suggest a good way of speeding
flip() up.




Oh, by the way, since this experiment uses random numbers, we're not
getting exactly the same results, run after run.  This is not a good
thing: we want to make experiments repeatable, and not merely "almost"
repeatable.


But we can fix this, since we're actually using "pseudorandom" numbers.
Unlike the lottery, we can bias things so that we get the exact same
pseudorandom numbers, run after run, by twiddling random.seed() before
every program run:

###
>>> help(random.seed)

Help on method seed in module random:

seed(self, a=None) method of random.Random instance
    Initialize internal state from hashable object.

    None or no argument seeds from current time.

    If a is not None or an int or long, hash(a) is used instead.

    If a is an int or long, a is used directly.  Distinct values between
    0 and 27814431486575L inclusive are guaranteed to yield distinct
    internal states (this guarantee is specific to the default
    Wichmann-Hill generator).
###


So when we do profiling, let's call:

###
>>> random.seed("profiling")
###

just before every profile run, to make sure we get the same results.



Let's continue attacking this problem till we get tired.  *grin* Good luck
to you!