[Python-Dev] Optimize Unicode strings in Python 3.3

Serhiy Storchaka storchaka at gmail.com
Fri May 4 11:00:52 CEST 2012


04.05.12 02:45, Victor Stinner wrote:
>   * Two steps: compute the length and maximum character of the output
> string, allocate the output string and then write characters. str%args
> was using it.
>   * Optimistic approach. Start with an ASCII buffer, enlarge and widen
> (to UCS2 and then UCS4) the buffer when new characters are written.
> Approach used by the UTF-8 decoder and by str%args since today.

Actually, the UTF-8 decoder today uses the two-step approach. Only after 
encountering an error does it switch to the optimistic approach.
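
Roughly, the two-step approach looks like this (a simplified sketch, not
the actual CPython code): one pass finds the maximum character, then
PyUnicode_New() allocates a string of exactly the right kind and size,
and a second pass writes the characters.

/* Sketch of the two-step approach (not the actual decoder code). */
#include <Python.h>

static PyObject *
build_from_ucs4(const Py_UCS4 *buf, Py_ssize_t len)
{
    Py_UCS4 maxchar = 0;
    Py_ssize_t i;
    PyObject *res;
    int kind;
    void *data;

    /* Step 1: scan the input for the widest character. */
    for (i = 0; i < len; i++) {
        if (buf[i] > maxchar)
            maxchar = buf[i];
    }

    /* Step 2: allocate a string of exactly the right kind and size,
       then write the characters; no resizing or widening is needed. */
    res = PyUnicode_New(len, maxchar);
    if (res == NULL)
        return NULL;
    kind = PyUnicode_KIND(res);
    data = PyUnicode_DATA(res);
    for (i = 0; i < len; i++)
        PyUnicode_WRITE(kind, data, i, buf[i]);
    return res;
}

The output never has to be resized or widened, at the cost of reading the
input twice.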

> The optimistic approach uses realloc() to resize the string. It is
> faster than the PyAccu approach (at least for short ASCII strings),
> maybe because it avoids creating temporary short strings.
> realloc() looks to be efficient on Linux and Windows (at least Seven).

IMHO, realloc() has no relationship to this. The cause is the cost of 
managing the list and of creating the temporary strings.
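
For comparison, the realloc()-based writer is essentially just this (a
simplified sketch with made-up names): everything is appended into one
buffer that is grown geometrically in place, so no temporary string
objects are created and nothing has to be joined at the end.

/* Sketch (made-up names) of a growable output buffer: append into one
   realloc()'ed block instead of accumulating a list of temporary
   strings and joining them afterwards. */
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *buf;
    size_t len;      /* bytes used */
    size_t alloc;    /* bytes allocated */
} writer_t;

static int
writer_append(writer_t *w, const char *s, size_t n)
{
    if (w->len + n > w->alloc) {
        size_t newalloc = (w->alloc ? w->alloc * 2 : 64);
        while (newalloc < w->len + n)
            newalloc *= 2;
        /* realloc() can often grow the block in place. */
        char *newbuf = realloc(w->buf, newalloc);
        if (newbuf == NULL)
            return -1;
        w->buf = newbuf;
        w->alloc = newalloc;
    }
    memcpy(w->buf + w->len, s, n);
    w->len += n;
    return 0;
}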

> Various notes:
>   * PyUnicode_READ() is slower than reading a Py_UNICODE array.

And PyUnicode_WRITE() is slower than writing a Py_UNICODE/PyUCS* array.
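
For example (a sketch, not code from CPython), compare a generic loop
that goes through PyUnicode_READ() with one specialized on the string's
kind: the former re-dispatches on the kind for every character, the
latter is a plain array read.

#include <Python.h>

static Py_ssize_t
count_char_generic(PyObject *str, Py_UCS4 ch)
{
    int kind = PyUnicode_KIND(str);
    const void *data = PyUnicode_DATA(str);
    Py_ssize_t len = PyUnicode_GET_LENGTH(str);
    Py_ssize_t i, count = 0;
    for (i = 0; i < len; i++) {
        /* PyUnicode_READ() re-checks 'kind' for every character. */
        if (PyUnicode_READ(kind, data, i) == ch)
            count++;
    }
    return count;
}

static Py_ssize_t
count_char_ucs1(PyObject *str, Py_UCS4 ch)
{
    /* Specialized for PyUnicode_1BYTE_KIND: a plain Py_UCS1 array. */
    const Py_UCS1 *p = PyUnicode_1BYTE_DATA(str);
    Py_ssize_t len = PyUnicode_GET_LENGTH(str);
    Py_ssize_t i, count = 0;
    for (i = 0; i < len; i++) {
        if (p[i] == ch)
            count++;
    }
    return count;
}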

>   * Some decoders unroll the main loop to process 4 or 8 bytes (32 or
> 64 bits CPU) at each step.

Note that this depends not only on the CPU but also on the OS data model 
(LP64 vs. LLP64): on 64-bit Windows a long is still only 32 bits.
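
A sketch of such an unrolled check, using size_t so that the chunk width
follows the platform (8 bytes on 64-bit systems, 4 bytes on 32-bit ones)
instead of hard-coding it:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 0x80808080... pattern, truncated to the platform word size. */
#define HIGH_BITS ((size_t)UINT64_C(0x8080808080808080))

static int
is_ascii(const unsigned char *p, size_t n)
{
    size_t word;
    /* Check full words; memcpy avoids unaligned-access problems. */
    while (n >= sizeof(size_t)) {
        memcpy(&word, p, sizeof(size_t));
        if (word & HIGH_BITS)
            return 0;
        p += sizeof(size_t);
        n -= sizeof(size_t);
    }
    /* Check the remaining tail bytes one by one. */
    while (n--) {
        if (*p++ & 0x80)
            return 0;
    }
    return 1;
}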

> I am interested if you know other tricks to optimize Unicode strings
> in Python, or if you are interested to work on this topic.

The optimized ASCII decoder (issue 14419) not only reads 4 or 8 bytes at 
a time, it also writes them all at once. This is a very specific optimization.

A more general principle is to replace separate scanning and translating 
passes with one-pass optimistic reading and writing. This improves the 
efficiency of the memory cache.
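
A sketch of that idea for an ASCII fast path (made-up names, not the
actual patch from issue 14419): each word is loaded once, checked for
non-ASCII bytes, and immediately stored to the UCS1 output, so the data
is touched only while it is still in cache.

#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HIGH_BITS ((size_t)UINT64_C(0x8080808080808080))

/* Copies ASCII bytes from 'src' to 'dst' (ASCII maps to UCS1 as-is);
   returns the number of bytes copied, which is less than 'n' if a
   non-ASCII byte was reached. */
static size_t
copy_ascii_run(const unsigned char *src, unsigned char *dst, size_t n)
{
    size_t done = 0, word;
    while (n - done >= sizeof(size_t)) {
        memcpy(&word, src + done, sizeof(size_t));
        if (word & HIGH_BITS)
            break;                 /* a non-ASCII byte in this word */
        /* Write the whole word back out in the same pass. */
        memcpy(dst + done, &word, sizeof(size_t));
        done += sizeof(size_t);
    }
    while (done < n && src[done] < 0x80) {
        dst[done] = src[done];
        done++;
    }
    return done;
}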

I'm going to try this in the UTF-8 decoder; it should bring the speed of 
decoding ASCII-only strings up to that of the optimized ASCII decoder.


