Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

Terry Reedy tjreedy at udel.edu
Fri Mar 29 14:06:40 EDT 2013


On 3/28/2013 10:37 PM, Steven D'Aprano wrote:

> Under what circumstances will a string be created from a wchar_t string?
> How, and why, would such a string be created? Why would Python still
> support strings containing surrogates when it now has a nice, shiny,
> surrogate-free flexible representation?

I believe it is because surrogates are legal code points, and users may
put them in strings even though Python itself does not (except through
the surrogateescape error handler).
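
A minimal sketch of both cases, assuming CPython 3.3 or later. The variable names are only illustrative. It shows a user placing a lone surrogate in a str, and Python producing one itself via the surrogateescape error handler:

```python
# A lone surrogate is a legal code point, so user code may put one in a str.
lone = "\ud800"

# With the flexible representation, a non-BMP character is a single
# code point rather than a surrogate pair.
astral = "\U0001F600"

# The one place Python itself creates surrogates: the surrogateescape
# error handler maps an undecodable byte 0xNN to the lone surrogate U+DCNN.
decoded = b"abc\xff".decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes.
roundtrip = decoded.encode("utf-8", errors="surrogateescape")

# A strict UTF-8 encode refuses a lone surrogate.
try:
    lone.encode("utf-8")
except UnicodeEncodeError:
    strict_encode_failed = True
else:
    strict_encode_failed = False
```

Here decoded ends with the code point U+DCFF, and the strict encode of the lone surrogate raises UnicodeEncodeError, so such strings can exist in memory but need an explicit error handler before they can be written out as UTF-8.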

I believe some of the internal complexity comes from supporting the old
C API, so as not to immediately invalidate existing extensions.

-- 
Terry Jan Reedy



