Variables versus name bindings [Re: A certain part of an if() structure never gets executed.]

Dave Angel davea at davea.name
Tue Jun 18 00:12:34 EDT 2013


On 06/17/2013 10:42 PM, Steven D'Aprano wrote:
> On Mon, 17 Jun 2013 21:06:57 -0400, Dave Angel wrote:
>
>> On 06/17/2013 08:41 PM, Steven D'Aprano wrote:
>>>
>>>      <SNIP>
>>>
>>> In Python 3.2 and older, the data will be either UTF-4 or UTF-8,
>>> selected when the Python compiler itself is compiled.
>>
>> I think that was a typo.  Do you perhaps mean UCS-2 or UCS-4?
>
> Yes, that would be better.
>
> UCS-2 is identical to UTF-16, except it doesn't support non-BMP
> characters and therefore doesn't have surrogate pairs.
>
> UCS-4 is functionally equivalent to UTF-16,

Perhaps you mean UTF-32?

> as far as I can tell. (I'm
> not really sure what the difference is.)
>

Now you've got me curious by bringing up surrogate pairs.  Do you know
whether a narrow build (say 3.2) really works as UTF-16, so that when you
encode a surrogate pair (4 bytes) to UTF-8, it encodes a single Unicode
character into a single UTF-8 sequence (probably 4 bytes long)?
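
To make the question concrete, here is a small sketch of the byte counts
involved.  It assumes Python 3.3 or later, where the narrow/wide build
distinction no longer exists, so it only shows what the codecs produce
and says nothing about how a 3.2 narrow build stores the string
internally.  The character U+1F600 is just an arbitrary non-BMP example
I picked.

```python
# Arbitrary example character outside the BMP (an assumption for illustration).
ch = "\U0001F600"

# Encode without a BOM so the byte counts are easy to read.
utf16 = ch.encode("utf-16-be")
utf32 = ch.encode("utf-32-be")
utf8 = ch.encode("utf-8")

# Split the UTF-16 bytes into 16-bit code units to expose the surrogate pair.
units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]

print("UTF-16 code units:", [hex(u) for u in units])
print("UTF-32 length in bytes:", len(utf32))
print("UTF-8 bytes:", utf8, "length:", len(utf8))
```

On such a Python the one character becomes two 16-bit code units in
UTF-16 (a high and a low surrogate), one 32-bit unit in UTF-32, and a
single four-byte sequence in UTF-8.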



-- 
DaveA


