Variables versus name bindings [Re: A certain part of an if() structure never gets executed.]
Dave Angel
davea at davea.name
Tue Jun 18 00:12:34 EDT 2013
On 06/17/2013 10:42 PM, Steven D'Aprano wrote:
> On Mon, 17 Jun 2013 21:06:57 -0400, Dave Angel wrote:
>
>> On 06/17/2013 08:41 PM, Steven D'Aprano wrote:
>>>
>>> <SNIP>
>>>
>>> In Python 3.2 and older, the data will be either UTF-4 or UTF-8,
>>> selected when the Python compiler itself is compiled.
>>
>> I think that was a typo. Do you perhaps mean UCS-2 or UCS-4?
>
> Yes, that would be better.
>
> UCS-2 is identical to UTF-16, except it doesn't support non-BMP
> characters and therefore doesn't have surrogate pairs.
>
> UCS-4 is functionally equivalent to UTF-16,
Perhaps you mean UTF-32?
> as far as I can tell. (I'm
> not really sure what the difference is.)
>
Now you've got me curious, by bringing up surrogate pairs. Do you know
whether a narrow build (say 3.2) really works as UTF-16, so that when you
encode a surrogate pair (4 bytes) to UTF-8, it encodes a single Unicode
character into a single UTF-8 sequence (probably 4 bytes long)?
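
For reference, here is a rough sketch of what the encodings look like for
one non-BMP character. It assumes Python 3.3 or later, where a str holds
code points directly, so it shows only what the codecs produce and does not
by itself show how a narrow 3.2 build behaves. The sample character
(U+1F600) and the variable names are just picked for illustration.

```python
# One character outside the Basic Multilingual Plane (illustrative choice).
ch = "\U0001F600"

utf16 = ch.encode("utf-16-be")   # two 16-bit code units (a surrogate pair)
utf32 = ch.encode("utf-32-be")   # one 32-bit code unit
utf8 = ch.encode("utf-8")        # one multi-byte UTF-8 sequence

# Split the code point into its UTF-16 surrogate pair by hand.
offset = ord(ch) - 0x10000
high = 0xD800 + (offset >> 10)    # high (lead) surrogate
low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate

print(len(utf16), len(utf32), len(utf8))
print(hex(high), hex(low))
print(utf16.hex(), utf8.hex())
```

The hand-computed surrogates should match the bytes the utf-16-be codec
emits, and the UTF-8 form is a single sequence for the one code point
rather than two separately encoded surrogates.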
--
DaveA