Performance of int/long in Python 3

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Apr 3 13:43:51 EDT 2013


On Wed, 03 Apr 2013 10:38:20 -0600, Ian Kelly wrote:

> On Wed, Apr 3, 2013 at 9:02 AM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> On Wed, 03 Apr 2013 09:43:06 -0400, Roy Smith wrote:
>>
>> [...]
>>>> n = max(map(ord, s))
>>>> 4 if n > 0xffff else 2 if n > 0xff else 1
>>>
>>> This has to inspect the entire string, no?
>>
>> Correct. A more efficient implementation would be:
>>
>> def char_size(s):
>>     for n in map(ord, s):
>>         if n > 0xFFFF: return 4
>>         if n > 0xFF: return 2
>>     return 1
> 
> That's an incorrect implementation, as it would return 2 at the first
> non-Latin-1 BMP character, even if there were SMP characters later in
> the string.  It's only safe to short-circuit return 4, not 2 or 1.


Doh!

I mean, well done sir, you have successfully passed my little test!
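
For the record, a version that takes Ian's point on board might look
something like the sketch below. It only short-circuits on 4, and otherwise
remembers whether a 2-byte character has been seen while it keeps scanning.
This is just an illustration of the idea discussed in this thread, not a
claim about how CPython itself chooses a string's internal width.

```python
def char_size(s):
    """Return the widest per-character size needed for s: 1, 2 or 4."""
    size = 1
    for n in map(ord, s):
        if n > 0xFFFF:
            # Nothing wider is possible, so stopping early is safe here.
            return 4
        if n > 0xFF:
            # Remember the BMP character, but keep looking for an SMP one.
            size = 2
    return size
```

In the worst case (no SMP characters) this still inspects the entire
string, as Roy observed about the max() version.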



-- 
Steven


