String performance regression from python 3.2 to 3.3

Neil Hodgson nhodgson at iinet.net.au
Sat Mar 16 18:00:32 EDT 2013


Steven D'Aprano:

> So while you might save memory by using "UTF-24" instead of UTF-32, it
> would probably be slower because you would have to grab three bytes at a
> time instead of four, and the hardware probably does not directly support
> that.

    Low-level string manipulation often works on blocks larger than 
an individual character for speed: generally 32 or 64 bits at a time 
using the CPU, or 128 or 256 bits using the vector unit. There may then 
be entry and exit code to handle initial alignment to a block boundary 
and a tail that is smaller than the block size.

    For an example of this kind of thing, see find_max_char in 
python\Objects\stringlib\find_max_char.h, which can examine a char* 32 
or 64 bits at a time.
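
    As a rough illustration of the block-at-a-time idea (a sketch in 
Python, not the code in find_max_char.h), the function below tests 8 
bytes per step with a single mask and then handles the short tail one 
byte at a time. The name has_non_ascii and the 64-bit block size are 
assumptions made for the example. Python bytes objects need no 
alignment handling, so the entry code a C version might have is absent.

```python
import struct

# High bit of each of the 8 bytes in a 64-bit word.
ASCII_MASK_64 = 0x8080808080808080


def has_non_ascii(data: bytes) -> bool:
    """Return True if any byte in data has its high bit set.

    Hypothetical helper for illustration only.
    """
    block = 8
    full = len(data) - (len(data) % block)

    # Main loop: one mask test covers 8 bytes.
    for offset in range(0, full, block):
        (word,) = struct.unpack_from("<Q", data, offset)
        if word & ASCII_MASK_64:
            return True

    # Tail: fewer than 8 bytes remain, so check them individually.
    for byte in data[full:]:
        if byte & 0x80:
            return True
    return False
```

    A C version would do the same with aligned word loads, which is 
where the speed comes from; in Python the sketch only shows the 
structure of main loop plus tail.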

    24-bit is likely to be a win in many circumstances due to decreased 
memory traffic. A 12-bit implementation may also be worthwhile, as the 
low 0x1000 characters of Unicode contain Latin (with extensions), 
Greek, Cyrillic, Arabic, Hebrew, and most Indic scripts.
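
    To make the memory argument concrete, here is a minimal sketch of a 
fixed-width 3-bytes-per-character layout. "UTF-24" is not a real 
encoding, and pack24 and char_at are hypothetical names invented for 
this example. Every Unicode code point fits in 21 bits, so 3 bytes are 
enough, and indexing stays a simple multiply.

```python
def pack24(text: str) -> bytes:
    """Store each code point in exactly 3 little-endian bytes.

    Hypothetical layout for illustration; not a standard encoding.
    """
    out = bytearray()
    for ch in text:
        out += ord(ch).to_bytes(3, "little")
    return bytes(out)


def char_at(packed: bytes, index: int) -> str:
    """Fetch the character at index by reading its 3 bytes."""
    start = index * 3
    return chr(int.from_bytes(packed[start:start + 3], "little"))


sample = "a\u03a9\U0001F600"  # Latin, Greek, and a non-BMP character
packed = pack24(sample)
utf32 = sample.encode("utf-32-le")
# packed uses 3 bytes per character, utf32 uses 4.
```

    The packed form is three quarters the size of UTF-32 for the same 
text. Whether that translates into a speed-up depends on how the bulk 
operations are written, which this sketch does not measure.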

    Neil


