best way to replace first word in string?

bonono at gmail.com
Sun Oct 23 01:05:38 EDT 2005


Interesting. It seems that "if ' ' in source:" is highly optimized code,
as it is even faster than "if source.find(' ') != -1:", even though I
assume they end up in the same C loops?
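
One possible explanation, stated as an assumption rather than something measured in this thread: "' ' in source" is a single containment operation, while "source.find(' ') != -1" also pays for an attribute lookup, a method call, and a comparison, whatever search loop runs underneath. A minimal Python 3 sketch using the standard timeit module to compare the two (the names source and time_expr are hypothetical, chosen only for this illustration):

```python
import timeit

# Hypothetical test string: one space near the end of about 1000 characters.
source = "abcdefghij" * 100 + " tail"

def time_expr(expr, number=100_000):
    """Return the best of 5 timings (seconds) for evaluating expr."""
    return min(
        timeit.repeat(expr, globals={"source": source}, repeat=5, number=number)
    )

t_in = time_expr("' ' in source")
t_find = time_expr("source.find(' ') != -1")

# Both expressions answer the same question, so they must agree.
assert (" " in source) == (source.find(" ") != -1)

print(f"in:   {t_in:.4f}s")
print(f"find: {t_find:.4f}s")
```

The absolute numbers depend on the interpreter version and the machine, so treat any gap as indicative only.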

Ron Adam wrote:
> Guess again...  Are the results below what you were expecting?
>
> Notice that the join version adds a trailing space if the source string
> is a single word.  I allowed for that by adding one in the same case in
> the index version.
>
> The big win I was talking about is when there are no spaces in the string.
> The index version can then just return the replacement.
>
> Each figure is one method's time as a percentage of the other's.  Smaller is better.
>
> Type 1 = no spaces
> Type 2 = space at 10% of length
> Type 3 = space at 90% of length
>
> Type: Length
>
> Type 1: 10        split/join: 317.38%  index: 31.51%
> Type 2: 10        split/join: 212.02%  index: 47.17%
> Type 3: 10        split/join: 186.33%  index: 53.67%
> Type 1: 100       split/join: 581.75%  index: 17.19%
> Type 2: 100       split/join: 306.25%  index: 32.65%
> Type 3: 100       split/join: 238.81%  index: 41.87%
> Type 1: 1000      split/join: 1909.40%  index: 5.24%
> Type 2: 1000      split/join: 892.02%  index: 11.21%
> Type 3: 1000      split/join: 515.44%  index: 19.40%
> Type 1: 10000     split/join: 3390.22%  index: 2.95%
> Type 2: 10000     split/join: 2263.21%  index: 4.42%
> Type 3: 10000     split/join: 650.30%  index: 15.38%
> Type 1: 100000    split/join: 3342.08%  index: 2.99%
> Type 2: 100000    split/join: 1175.51%  index: 8.51%
> Type 3: 100000    split/join: 677.77%  index: 14.75%
> Type 1: 1000000   split/join: 3159.27%  index: 3.17%
> Type 2: 1000000   split/join: 867.39%  index: 11.53%
> Type 3: 1000000   split/join: 679.47%  index: 14.72%
>
>
>
>
> import time
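> def test(func, source):
>      """Time repeated calls of func(source, "replace"); return (result, seconds)."""
>      t = time.clock()
>      n = 6000000/len(source)   # fewer iterations for longer strings
>      s = ''
>      for i in xrange(n):
>          s = func(source, "replace")
>      tt = time.clock()-t
>      return s, tt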
>
> def replace_word1(source, newword):
>      """Replace the first word of source with newword."""
>      return newword + " " + " ".join(source.split(None, 1)[1:])
>
> def replace_word2(source, newword):
>      """Replace the first word of source with newword."""
>      if ' ' in source:
>          return newword + source[source.index(' '):]
>      return newword + ' '   # space needed to match join results
>
>
> def makestrings(n):
>      s1 = 'abcdefghij' * (n//10)
>      i, j = n//10, n-n//10
>      s2 = s1[:i] + ' ' + s1[i:] + 'd.'    # space near front
>      s3 = s1[:j] + ' ' + s1[j:] + 'd.'    # space near end
>      return [s1,s2,s3]
>
> for n in [10,100,1000,10000,100000,1000000]:
>      for sn,s in enumerate(makestrings(n)):
>          r1, t1 = test(replace_word1, s)
>          r2, t2 = test(replace_word2, s)
>          assert r1 == r2
>          print "Type %i: %-8i  split/join: %.2f%%  index: %.2f%%" \
>                 % (sn+1, n, t1/t2*100.0, t2/t1*100.0)
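
For anyone rerunning this on a current interpreter: the quoted script is Python 2 (time.clock, xrange, the print statement). Below is a rough Python 3 sketch of the same two functions with a smaller timing loop built on time.perf_counter. The bench helper and its repeats parameter are hypothetical stand-ins, not part of the quoted code, and the timings should not be expected to reproduce the table above.

```python
import time

def replace_word1(source, newword):
    """Replace the first word of source using split/join."""
    return newword + " " + " ".join(source.split(None, 1)[1:])

def replace_word2(source, newword):
    """Replace the first word of source using index."""
    if " " in source:
        return newword + source[source.index(" "):]
    return newword + " "  # trailing space to match the split/join result

def bench(func, source, repeats=1000):
    """Time `repeats` calls of func; return (last result, elapsed seconds)."""
    start = time.perf_counter()
    result = ""
    for _ in range(repeats):
        result = func(source, "replace")
    return result, time.perf_counter() - start

def makestrings(n):
    """Build three strings: no space, space near the front, space near the end."""
    s1 = "abcdefghij" * (n // 10)
    i, j = n // 10, n - n // 10
    return [s1, s1[:i] + " " + s1[i:] + "d.", s1[:j] + " " + s1[j:] + "d."]

for n in (10, 1000, 100000):
    for kind, s in enumerate(makestrings(n), start=1):
        r1, t1 = bench(replace_word1, s)
        r2, t2 = bench(replace_word2, s)
        assert r1 == r2  # both methods must produce the same string
        print(f"Type {kind}: {n:<8d} split/join: {t1 / t2 * 100:.2f}%  "
              f"index: {t2 / t1 * 100:.2f}%")
```

As in the original, the two printed columns are reciprocals of each other: in the first row of the quoted table, 317.38% times 31.51% comes to roughly 100%.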



