very simple Genetic Algorithm completed

Paul McGuire ptmcg at austin.rr.com
Fri Feb 1 09:09:49 EST 2008


On Jan 31, 10:43 am, Matthew_WAR... at bnpparibas.com wrote:
> Hi,
>
> I got some help with this from here, and there's been a little bit of
> discussion around GA's recently, so thought I'd post up my likey slow and
> clunky version of a GA that in essence just 'evolves' a solution to 'make a
> sum that evaluates to n using */+-0123456789'  it's a really simple GA that
> would be useful for someone who doesn't quite get GA's to look at.
>
> I think it's simple enough to be fairly self explanatory.
>
> to see it come up with evolved solutions to n=1000
>
> >>>from quickga import *
> >>>evolve()
>

Some improvement tips:

0. Tack this bit onto the end of quickga.py, and you wont have to
write a separate routine to import quickga and invoke evolve():

    if __name__ == "__main__":
        evolve()


1. Remove all calls to floatify, and instead start your program with:
        from __future__ import division
On my system this gained about 16% improvement.


2. Bugfix: In 2 places, change:
        newgene=genes[randint(0,14)-1]
 to
        newgene=genes[randint(0,14-1)]

randint(a,b) returns values from a to b inclusive, and genes is a list
containing 14 elements (so you want subscripts from 0 to 13).  (Using
subscripts from -1 to 13 will bias the selection of genes to use twice
as many of the last item in the list, since both -1 and 13 will give
that value.)

Similarly, change:

    s=rationalise_signs(''.join([ genes[randint(0,len(genes))-1] ...
to:
    s=rationalise_signs(''.join([ genes[randint(0,len(genes)-1)] ...


3. Change mutate to build a list instead of a string.  Then use
''.join() at the end to convert the list into a single string return
value.

def mutate(individual,rate):
    """Does what it says on the tin. Mutates per gene
    if rate is 10000 mutatuion rate is 1 in 10000 on avg
    """
    newindividual=[]
    for gene in individual:
        if randint(0,rate)==1:
            newgene=genes[randint(0,14-1)]
            newindividual.append(newgene)
        else:
            newindividual.append(gene)
    return ''.join(newindividual)


(This was less of a speed improvement than I thought it would be, but
IIRC, the optimization of successive string concatentions is only
available when running Python on Windows. If you are running on Linux,
this should have more benefit.)


4. Include psyco to cut execution time in half.  Put this at the top
of your program, right after "from __future__ ...":

try:
    import psyco
except ImportError:
    print ("Running without psyco optimization")
else:
    psyco.full()

If psyco is available, this will optimize all called functions.


5. On a hunch that many of your individuals are the same string, I
lifted the call to eval out of express() into a separate routine
called evaluate(), and added a memoizing cache to skip the call to
eval if the string had been eval'ed before - if so, the value is
reported from the cache.  This chopped another 40% off the runtime.

evalcache = {}
def evaluate(s):
    if s in evalcache:
        ret = evalcache[s]
    else:
        ret = eval(s)
        evalcache[s] = ret
    return ret

(A note on timing this code: since there are many calls to
randomization methods, consistent timing requires an explicit call to
random.seed() to ensure that a consistent set of random numbers is
used.  Otherwise, the timing gets thrown off by the randomization,
which sends the program down different paths.)


6. This is another change that had little effect, but I'll at least
put it out there (a leftover from an algorithmic optimization course I
took ages ago).  Instead of:

    fitness=abs(fitvalue-expression)

try using:

    fitness=(fitvalue-expression)*(fitvalue-expression)

For some optimization methods (GA is not one of them), the
discontinuity of abs at the origin delays convergence of the
algorithm, whereas computing the square of the difference *is*
continuous at 0 *and* still ensures a positive value.  Just gave it a
try out of academic curiosity, but just a dead end after all.

Cheers,
-- Paul



More information about the Python-list mailing list