Need a specific sort of string modification. Can someone help?

Roy Smith roy at panix.com
Sat Jan 5 10:38:54 EST 2013


In article <mailman.121.1357398573.2939.python-list at python.org>,
 Chris Angelico <rosuav at gmail.com> wrote:

> it may or may not run faster than the explicit state machine,

Hmmm, hard to say.  Both are O(n), but yours makes several more copies 
of the data than mine (the string addition, the replace(), the split(), 
the string slicing).  We both make copies as we put fragments into a 
list, and when we do the join().
  
On the other hand, yours does all that inside the library, which is 
generally faster than random python code.

Let's see:

My way:
real    0m2.097s
user    0m2.078s
sys     0m0.016s

Your way:
real    0m0.649s
user    0m0.622s
sys     0m0.025s

You got me by a factor of 3 or 4.  Not bad.  And not terribly 
surprising.  Once you've got things to the same O() group, pushing as 
much data manipulation as you can down into the library is usually a 
pretty effective optimization technique.

> but IMHO it's a lot clearer to read a split() than something
> that changes state when a particular character is found.

Maybe.  But being familiar with state machines is still a handy skill.  
DNA sequence analysis has lots of problems like "find a start codon 
which is within about 50 bases of a binding site, and then copy 
everything up until you find a stop codon".  Things like that often map 
well to state machines.  Especially if you're trying to do it in 
parallel in all three reading frames.



More information about the Python-list mailing list