[Python-Dev] PEP 393: Flexible String Representation

Wed Jan 26 13:49:44 CET 2011

On 26 January 2011 12:30, Nick Coghlan <ncoghlan at gmail.com> wrote:
> The PEP actually does define that already:
>
> PyUnicode_AsUTF8 populates the utf8 field of the existing string,
> while PyUnicode_AsUTF8String creates a *new* string with that field
> populated.
>
> PyUnicode_AsUnicode will populate the wstr field (but doing so
> generally shouldn't be necessary).

AIUI, another point is that the PEP deprecates the use of the calls
that populate the utf8 and wstr fields, in favour of the calls that
expect the caller to manage the extra memory (PyUnicode_AsUTF8String
rather than PyUnicode_AsUTF8, ??? rather than PyUnicode_AsUnicode). So
in the long term, the extra fields should never be populated -
although this could take some time as extensions have to be recoded.
Ultimately, the extra fields and older APIs could even be removed.
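
To make the distinction concrete, here is a rough sketch of the two UTF-8 calls. It assumes the API as it exists in CPython 3.3 and later, which may differ from the draft under discussion, and it goes through ctypes.pythonapi only so that it runs without building an extension; real extension code would call these from C. The helper names encode_new_object and encode_cached are made up for illustration. I have left out the wstr side, since I don't know what the replacement for PyUnicode_AsUnicode is.

```python
import ctypes
import sys

# Assumption: CPython 3.3 or later, where both functions are exported by
# the interpreter and reachable through ctypes.pythonapi.
_as_utf8_string = ctypes.pythonapi.PyUnicode_AsUTF8String
_as_utf8_string.argtypes = [ctypes.py_object]
_as_utf8_string.restype = ctypes.py_object  # a new bytes object owned by the caller

_as_utf8 = ctypes.pythonapi.PyUnicode_AsUTF8
_as_utf8.argtypes = [ctypes.py_object]
_as_utf8.restype = ctypes.c_char_p  # pointer into a buffer kept by the str itself


def encode_new_object(text):
    """Hypothetical helper: UTF-8 bytes in a separate object.

    The caller holds the only reference, so the memory goes away as soon
    as the caller drops the result. Nothing is stored on the string.
    """
    return _as_utf8_string(text)


def encode_cached(text):
    """Hypothetical helper: UTF-8 bytes read from the string's own cache.

    The C function stores the encoded form on the string object, where it
    lives as long as the string does. ctypes copies the buffer into a
    bytes object and stops at the first NUL byte.
    """
    return _as_utf8(text)


if __name__ == "__main__":
    sample = "na\u00efve caf\u00e9"  # non-ASCII, so UTF-8 needs its own buffer

    before = sys.getsizeof(sample)
    separate = encode_new_object(sample)
    middle = sys.getsizeof(sample)
    cached = encode_cached(sample)
    after = sys.getsizeof(sample)

    print("same bytes either way:", separate == cached)
    print("size before / after new-object call / after cached call:",
          before, middle, after)
```

Both calls produce the same bytes; the difference is who owns the memory. The size figures are only indicative and depend on the interpreter version, but any growth after the second call is the per-string space cost in question.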

So any space cost (which I concede could be non-trivial in some cases)
is expected to be short-term.

Paul.