RE Module Performance

wxjmfauth at gmail.com wxjmfauth at gmail.com
Tue Jul 30 15:09:11 EDT 2013


Mutable, immutable, copying + xxx, buffering, O(n) ....
Yes, but conceptually the re-encoding happens sometime, somewhere.
The internal "ucs-2" will never automagically be transformed
into "ucs-4" (e.g.).

>>> import timeit
>>> timeit.timeit("'a'*10000 +'€'")
7.087220684719967
>>> timeit.timeit("'a'*10000 +'z'")
1.5685214234430873
>>> timeit.timeit("z = 'a'*10000; z = z +'€'")
7.169538866162213
>>> timeit.timeit("z = 'a'*10000; z = z +'z'")
1.5815893830557286
>>> timeit.timeit("z = 'a'*10000; z += 'z'")
1.606955741596181
>>> timeit.timeit("z = 'a'*10000; z += '€'")
7.160483334521416
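
A rough sketch of what sits behind timings like these, assuming CPython 3.3
or later (the flexible string representation of PEP 393). The helper
bytes_per_char is a made-up name, not a library function; it estimates the
per-character storage width from sys.getsizeof, and the exact object sizes
may differ between builds.

```python
import sys

def bytes_per_char(s: str) -> int:
    """Estimate how many bytes CPython uses per character of s.

    Doubling the string keeps its widest character the same, so the fixed
    object header cancels out and the size difference is len(s) * width.
    """
    return (sys.getsizeof(s + s) - sys.getsizeof(s)) // len(s)

ascii_only = 'a' * 10000
print(bytes_per_char(ascii_only))                 # all ASCII
print(bytes_per_char(ascii_only + 'z'))           # still all ASCII
print(bytes_per_char(ascii_only + '€'))           # one BMP char beyond Latin-1
print(bytes_per_char(ascii_only + '\U00010100'))  # one non-BMP char
```

On such a build the four values should be 1, 1, 2 and 4: a single wider
character makes the whole result string be stored at the wider width, so
the 10000 'a' are widened while they are copied into it.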


And do not forget: in a pure UTF coding scheme, your
char, or any char, will *never* be larger than 4 bytes.

>>> import sys
>>> sys.getsizeof('a')
26
>>> sys.getsizeof('\U000101000')
48
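
A small check of the 4-byte bound, as a sketch only. The helper
encoded_sizes is a hypothetical name; it uses the "-le" codec variants so
that no byte order mark is counted. Note also that a \U escape takes exactly
eight hex digits, so a nine-digit literal is read as one code point followed
by the digit '0'.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes one character takes in each UTF form."""
    return {enc: len(ch.encode(enc))
            for enc in ('utf-8', 'utf-16-le', 'utf-32-le')}

for ch in ('a', 'é', '€', '\U00010100'):
    sizes = encoded_sizes(ch)
    # No UTF form needs more than 4 bytes for a single code point.
    assert max(sizes.values()) <= 4
    print('U+%04X' % ord(ch), sizes)

# Eight hex digits per \U escape: this literal is two characters long.
print(len('\U000101000'))
```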


jmf


