[Python-Dev] _PyUnicode_CheckConsistency() too strict?

Phil Thompson phil at riverbankcomputing.com
Mon Feb 3 19:10:27 CET 2014


On 03-02-2014 5:52 pm, Guido van Rossum wrote:
> Can we provide a convenience API (or even a few lines of code one
> could copy+paste) that determines if a particular 8-bit string
> should have max-char equal to 127 or 255? I can easily imagine a
> number of use cases where this would come in handy (e.g. a list of
> strings produced by translation, or strings returned in Latin-1 by
> some other non-Python C-level API) -- and let's not get into a debate
> about whether UTF-8 wouldn't be better, I can also easily imagine
> legacy APIs where that isn't (yet) an option.
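
As an illustration of the kind of copy+paste snippet asked for above, a
minimal sketch in plain C might look like the following. The name
latin1_max_char is hypothetical and is not part of the CPython API; the
sketch only assumes the input is a buffer of 8-bit (Latin-1) characters.

```c
#include <stddef.h>

/* Hypothetical helper, not a CPython API: scan an 8-bit (Latin-1) buffer
   and report the tightest max-char bound for it. Returns 127 when every
   byte is ASCII and 255 as soon as any byte above 127 is found. */
static unsigned int latin1_max_char(const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (data[i] > 127) {
            return 255;  /* at least one non-ASCII Latin-1 character */
        }
    }
    return 127;          /* pure ASCII, including the empty buffer */
}
```

The result could presumably be passed as the maxchar argument of
PyUnicode_New() before copying the bytes in, so that the object's ASCII
flag matches its contents.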

For my particular use case PyUnicode_FromKindAndData() (once I'd 
interpreted the docs correctly) should have made such code unnecessary. 
However I've just discovered that it doesn't support surrogates in UCS2 
so I'm going to have to roll my own anyway.
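
For illustration only, a roll-your-own conversion could be sketched
roughly as below. This assumes the problem is that surrogate pairs in a
16-bit buffer are not combined into single code points; the helper name
utf16_to_ucs4 and its signature are hypothetical, and passing lone
surrogates through unchanged is just one possible policy.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a CPython API: decode 16-bit code units into
   32-bit code points, combining each valid high/low surrogate pair into
   one code point. Lone surrogates are copied through unchanged.
   `out` must have room for at least `len` entries. Returns the number of
   code points written and stores the largest one in *max_char. */
static size_t utf16_to_ucs4(const uint16_t *in, size_t len,
                            uint32_t *out, uint32_t *max_char)
{
    size_t n = 0;
    uint32_t max = 0;

    for (size_t i = 0; i < len; i++) {
        uint32_t ch = in[i];

        /* High surrogate followed by a low surrogate: combine the pair. */
        if (ch >= 0xD800 && ch <= 0xDBFF && i + 1 < len &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            ch = 0x10000 + ((ch - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++;  /* skip the low surrogate just consumed */
        }

        if (ch > max) {
            max = ch;
        }
        out[n++] = ch;
    }

    *max_char = max;
    return n;
}
```

The decoded buffer could then presumably be handed to
PyUnicode_FromKindAndData() with PyUnicode_4BYTE_KIND, with the computed
maximum available for choosing a narrower kind when no pairs were found.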

Phil

