[Python-Dev] PEP 393: Flexible String Representation

Sat Jan 29 08:47:38 CET 2011

"Martin v. Löwis", 24.01.2011 21:17:
> I have been thinking about Unicode representation for some time now.
> This was triggered, on the one hand, by discussions with Glyph Lefkowitz
> (who complained that his server app consumes too much memory), and Carl
> Friedrich Bolz (who profiled Python applications to determine that
> Unicode strings are among the top consumers of memory in Python).
> On the other hand, this was triggered by the discussion on supporting
> surrogates in the library better.
>
> I'd like to propose PEP 393, which takes a different approach,
> addressing both problems simultaneously: by getting a flexible
> representation (one that can be either 1, 2, or 4 bytes), we can
> support the full range of Unicode on all systems, but still use
> only one byte per character for strings that are pure ASCII (which
> will be the majority of strings for the majority of users).
>
> You'll find the PEP at
>
> http://www.python.org/dev/peps/pep-0393/
>
> [...]
> Stable ABI
> ----------
>
> None of the functions in this PEP become part of the stable ABI.

I think that's only part of the truth. This PEP could still affect the
stable ABI: the build-time size of Py_UNICODE may no longer matter for
extensions that work on Unicode buffers, as long as they only use the
'str' pointer and not 'wstr'.
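
To illustrate why the build-time size would stop mattering, here is a
minimal sketch in Python. It is not code from the PEP: the helper names
element_width() and pack() are hypothetical, and the 0xFF / 0xFFFF
cutoffs are assumptions for illustration. It only shows the general
idea that the width of a buffer element is picked per string at run
time instead of being fixed when the interpreter is compiled.

```python
def element_width(text):
    """Return the assumed bytes per character (1, 2, or 4) for one string.

    The cutoffs below are illustrative assumptions, not taken from the PEP.
    """
    highest = max(map(ord, text), default=0)
    if highest <= 0xFF:
        return 1
    if highest <= 0xFFFF:
        return 2
    return 4


def pack(text):
    """Return (width, buffer), storing every character in `width` bytes.

    Little-endian byte order is an arbitrary choice for this sketch.
    """
    width = element_width(text)
    data = b"".join(ord(ch).to_bytes(width, "little") for ch in text)
    return width, data
```

An extension that reads such a buffer only needs the width stored with
the string, so with this sketch pack("hi") yields a 1-byte-per-character
buffer and a string containing U+20AC yields a 2-byte one, whatever size
Py_UNICODE had at build time.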

Stefan