Replace accented chars with unaccented ones

Josiah Carlson jcarlson at nospam.uci.edu
Tue Mar 16 17:15:52 EST 2004


>             r += xlate[ord(i)]
>             r += i

Perhaps I'm going to have to create a signature and drop information 
about this in every post to c.l.py, but repeated string additions are 
slow as hell for any reasonably long string.  It is much faster to 
append the characters to a list and ''.join() them.

 >>> import time
 >>> def test_s(l):
...     t = time.time()
...     for i in xrange(100):
...             a = ''
...             for j in xrange(l):
...                     a += '0'
...     return time.time()-t
...
 >>> def test_l(l):
...     t = time.time()
...     for i in xrange(100):
...             a = ''.join(['0' for j in xrange(l)])
...     return time.time()-t
...
 >>> i = 128
 >>> while i < 4097:
...     print test_s(i), test_l(i)
...     i *= 2
...
0.0150001049042 0.0309998989105
0.0469999313354 0.047000169754
0.140999794006 0.109000205994
0.343999862671 0.203000068665
0.905999898911 0.40700006485
2.56200003624 0.828000068665

At 256 characters long, it looks about even.  Anything longer and 
''.join(lst) is significantly faster.
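For anyone who wants to rerun the comparison, here is a rough sketch of the same test using the timeit module, written in Python 3 syntax (range and print() instead of xrange and the print statement). The names concat_str, concat_join and compare are made up for this example. Newer CPython releases can sometimes extend a string in place on +=, so the gap you measure may well be smaller than in the numbers above.

```python
import timeit


def concat_str(n):
    """Build an n-character string by repeated += (the slow pattern)."""
    a = ''
    for _ in range(n):
        a += '0'
    return a


def concat_join(n):
    """Build the same string by collecting characters and joining once."""
    return ''.join(['0' for _ in range(n)])


def compare(n, repeat=100):
    """Return (seconds for +=, seconds for join), each over `repeat` runs."""
    t_str = timeit.timeit(lambda: concat_str(n), number=repeat)
    t_join = timeit.timeit(lambda: concat_join(n), number=repeat)
    return t_str, t_join


if __name__ == '__main__':
    n = 128
    while n < 4097:
        # One row per length: n, time for +=, time for join
        print(n, *compare(n))
        n *= 2
```

Timings depend on the interpreter version and the machine, so treat any output as indicative only.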

When we build the list one element at a time, as below, the overhead of 
creating all those one-element lists is significant, but it is still 
faster than repeated string addition when l is greater than roughly 2048:
a = []
for i in xrange(l):
    a += ['0']
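The loop above stops before the final join, so here is a minimal sketch of the whole pattern in Python 3 syntax. The function name build_incremental is hypothetical, and using append() instead of += ['0'] is my substitution: it avoids creating a throwaway one-element list on each pass, but I have not timed it against the numbers above.

```python
def build_incremental(n):
    """Collect n characters in a list, then join them once at the end."""
    chars = []
    for _ in range(n):
        # append() adds the item directly; the post's form was chars += ['0']
        chars.append('0')
    return ''.join(chars)


if __name__ == '__main__':
    print(len(build_incremental(2048)))
```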


  - Josiah


