Strings and Lists

Peter Hansen peter at engcorp.com
Mon Apr 18 10:17:54 EDT 2005


Tom Longridge wrote:
> My current Python project involves lots of repeating code blocks,
> mainly centred around a binary string of data. It's a genetic
> algorithm in which there are lots of strings (the chromosomes) which
> get mixed, mutated and compared a lot.
> 
> Given Python's great list processing abilities and the relative
> inefficiencies of some string operations, I was considering using a list
> of True and False values rather than a binary string.
> 
> I somehow doubt there would be a clear-cut answer to this, but from
> this description, does anyone have any reason to think that one way
> would be much more efficient than the other? (I realise the best way
> would be to do both and `timeit` to see which is faster, but it's a
> sizeable program and if anybody considers it a no-brainer I'd much
> rather know now!)
> 
> Any advice would be gladly received.

It depends more on the operations you are performing, and
more importantly *which of those you have measured and
found to be slow*, than on anything else.

If, for example, you've got a particular complex set of
slicing, dicing, and mutating operations going on, then
that might argue for one type of data structure.
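
As a rough illustration of measuring those operations, here is a minimal
sketch that times a single-point crossover and a one-bit mutation on both
representations with `timeit`. The chromosome length, the helper names and
the choice of operations are all assumptions made for the example, not
anything taken from Tom's program, and the timings will vary by machine
and Python version.

```python
import random
import timeit

LENGTH = 256  # hypothetical chromosome length, chosen only for illustration


def crossover(a, b, point):
    """Single-point crossover; slicing works the same for str and list."""
    return a[:point] + b[point:]


def mutate_str(chrom, i):
    """Flip bit i of a '0'/'1' string. Strings are immutable, so rebuild."""
    flipped = "1" if chrom[i] == "0" else "0"
    return chrom[:i] + flipped + chrom[i + 1:]


def mutate_list(chrom, i):
    """Flip bit i of a list of bools in place and return the same list."""
    chrom[i] = not chrom[i]
    return chrom


def compare(number=10000, seed=0):
    """Return a dict of timings (in seconds) for each operation and type."""
    rng = random.Random(seed)
    bits = [rng.random() < 0.5 for _ in range(LENGTH)]
    as_list = list(bits)
    as_str = "".join("1" if b else "0" for b in bits)
    mid = LENGTH // 2
    return {
        "str crossover": timeit.timeit(
            lambda: crossover(as_str, as_str, mid), number=number),
        "list crossover": timeit.timeit(
            lambda: crossover(as_list, as_list, mid), number=number),
        "str mutate": timeit.timeit(
            lambda: mutate_str(as_str, mid), number=number),
        "list mutate": timeit.timeit(
            lambda: mutate_list(as_list, mid), number=number),
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name:15s} {seconds:.4f} s")
```

Numbers like these only say something about the operations timed here; they
are no substitute for measuring the real program.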

If, on the other hand, the evaluation of the fitness
function is what is taking most of the time, then you
should focus on what that algorithm does and needs
and, after profiling (to get real data rather than
shots-in-the-dark), you can pick an appropriate data
structure for optimizing those operations.
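
To make the profiling step concrete, here is a small sketch using the
standard `cProfile` and `pstats` modules. The toy `evolve` loop and the
bit-counting `fitness` function are hypothetical stand-ins for the real
program; only the `profile_report` helper shows the technique, and it too
is just one way of doing it.

```python
import cProfile
import io
import pstats
import random


def fitness(chrom):
    """Hypothetical fitness function: the number of True bits."""
    return sum(chrom)


def evolve(pop_size=50, length=64, generations=20, seed=0):
    """Toy GA loop over lists of bools, standing in for the real program."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half as parents.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            point = rng.randrange(1, length)
            child = a[:point] + b[point:]   # single-point crossover
            i = rng.randrange(length)
            child[i] = not child[i]         # one-bit mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


def profile_report(func, *args, **kwargs):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)  # top 10 entries
    return result, stream.getvalue()


if __name__ == "__main__":
    best, report = profile_report(evolve)
    print("best fitness:", fitness(best))
    print(report)
```

The report shows where the cumulative time goes, for instance whether the
fitness evaluation or the slicing and mutation dominates, which is the
information needed before choosing a data structure.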

Strings seem to be what people pick, by default, without
much thought, but I doubt they're the right thing
for the majority of GA work...

-Peter


