String performance regression from python 3.2 to 3.3

Mark Lawrence breamoreboy at yahoo.co.uk
Sat Mar 16 00:56:41 EDT 2013


On 16/03/2013 04:35, rusi wrote:
> On Mar 16, 9:09 am, Chris Angelico <ros... at gmail.com> wrote:
>> On Sat, Mar 16, 2013 at 2:56 PM, Mark Lawrence <breamore... at yahoo.co.uk> wrote:
>>> On 16/03/2013 02:44, Thomas 'PointedEars' Lahn wrote:
>>
>>>> Chris Angelico wrote:
>>
>>> Thomas and Chris, would the two of you be kind enough to explain to morons
>>> such as myself how all the ECMAScript stuff relates to Python's Unicode as
>>> implemented via PEP 393? You've lost me, easily done I know.
>>
>> Sure. Here's the brief version: It's all about how a string is exposed
>> to a script.
>>
>> * Python 3.2 Narrow gives you UTF-16. Non-BMP characters count twice.
>> * Python 3.2 Wide gives you UTF-32. Each character counts once.
>> * Python 3.3 gives you UTF-32, but will store it as compactly as possible.
>
> Framing issue here (made famous by
> en.wikipedia.org/wiki/George_Lakoff)
>
> When one uses words like 'compact', 'flexible', etc., it loads the dice in
> favour of 3.3's choices and ignores that 3.3 trades time for space.
>

As stated in PEP 393, so what's all the fuss about?
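
For anyone who wants to check the three bullets quoted above from the
interpreter, here is a rough sketch. The helper names count_units and
bytes_per_char are made up for illustration, and the size estimate relies on
sys.getsizeof, which is a CPython implementation detail, so treat the numbers
as indicative only.

```python
import sys

def count_units(s):
    """Return (code points, UTF-16 code units) for the string s."""
    code_points = len(s)  # what len() reports on 3.3+ and on 3.2 wide builds
    # Each UTF-16 code unit is 2 bytes, so this is what a 3.2 narrow build counts.
    utf16_units = len(s.encode("utf-16-le")) // 2
    return code_points, utf16_units

def bytes_per_char(ch, n=1000):
    """Estimate per-character storage for a string made of the character ch.

    Subtracting the sizes of two lengths of the same string cancels the
    fixed object header. CPython-specific assumption: sys.getsizeof reflects
    the internal string buffer.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

for label, ch in [("ASCII", "a"), ("BMP", "\u20ac"), ("non-BMP", "\U0001F600")]:
    print(label, count_units(ch), bytes_per_char(ch))
```

On CPython 3.3 or later I would expect only the non-BMP line to show two
different counts, and the last column to grow from 1 to 2 to 4 bytes, which
is the "as compactly as possible" part.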

-- 
Cheers.

Mark Lawrence



