[Python-Dev] PEP 393: Flexible String Representation

Stefan Behnel stefan_ml at behnel.de
Sat Jan 29 08:47:38 CET 2011


"Martin v. Löwis", 24.01.2011 21:17:
> I have been thinking about Unicode representation for some time now.
> This was triggered, on the one hand, by discussions with Glyph Lefkowitz
> (who complained that his server app consumes too much memory), and Carl
> Friedrich Bolz (who profiled Python applications to determine that
> Unicode strings are among the top consumers of memory in Python).
> On the other hand, this was triggered by the discussion on supporting
> surrogates in the library better.
>
> I'd like to propose PEP 393, which takes a different approach,
> addressing both problems simultaneously: by getting a flexible
> representation (one that can be either 1, 2, or 4 bytes), we can
> support the full range of Unicode on all systems, but still use
> only one byte per character for strings that are pure ASCII (which
> will be the majority of strings for the majority of users).
>
> You'll find the PEP at
>
> http://www.python.org/dev/peps/pep-0393/
>
> [...]
> Stable ABI
> ----------
>
> None of the functions in this PEP become part of the stable ABI.

I think that's only part of the truth. This PEP can potentially affect the
stable ABI, in the sense that the build-time size of Py_UNICODE may no
longer matter in the future for extensions that work on Unicode buffers,
as long as they only use the 'str' pointer and not 'wstr'.
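
To make the "1, 2, or 4 bytes" idea from the quoted text concrete, here is a
minimal sketch in Python of how a per-string character width could be chosen
from the highest code point in the string. The function name min_char_width
is hypothetical and not part of the PEP or the C API. The thresholds 0xFF and
0xFFFF are my assumption of where the boundaries fall; the quoted text itself
only promises one byte per character for pure ASCII strings.

```python
def min_char_width(text):
    """Return the smallest width in bytes (1, 2, or 4) that can hold
    every code point of `text`.

    Hypothetical helper for illustration only; the thresholds are an
    assumption about how the flexible representation picks its width.
    """
    # The highest code point decides the width for the whole string.
    highest = max(map(ord, text), default=0)
    if highest <= 0xFF:
        return 1  # fits in one byte per character
    if highest <= 0xFFFF:
        return 2  # fits in two bytes per character (BMP only)
    return 4      # needs four bytes per character (beyond the BMP)


for sample in ("abc", "\u20ac", "\U0001F600"):
    print(repr(sample), min_char_width(sample))
```

The point for extensions is that this width is a property of each string
object, not of the interpreter build, which is why the build-time size of
Py_UNICODE would stop mattering for code that only reads the 'str' buffer.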

Stefan


