Performance of int/long in Python 3

Mark Lawrence breamoreboy at yahoo.co.uk
Wed Apr 3 18:39:19 EDT 2013


On 03/04/2013 22:55, Chris Angelico wrote:
> On Thu, Apr 4, 2013 at 4:43 AM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> On Wed, 03 Apr 2013 10:38:20 -0600, Ian Kelly wrote:
>>
>>> On Wed, Apr 3, 2013 at 9:02 AM, Steven D'Aprano
>>> <steve+comp.lang.python at pearwood.info> wrote:
>>>> On Wed, 03 Apr 2013 09:43:06 -0400, Roy Smith wrote:
>>>>
>>>> [...]
>>>>>> n = max(map(ord, s))
>>>>>> 4 if n > 0xffff else 2 if n > 0xff else 1
>>>>>
>>>>> This has to inspect the entire string, no?
>>>>
>>>> Correct. A more efficient implementation would be:
>>>>
>>>> def char_size(s):
>>>>      for n in map(ord, s):
>>>>          if n > 0xFFFF: return 4
>>>>          if n > 0xFF: return 2
>>>>      return 1
>>>
>>> That's an incorrect implementation, as it would return 2 at the first
>>> non-Latin-1 BMP character, even if there were SMP characters later in
>>> the string.  It's only safe to short-circuit return 4, not 2 or 1.
>>
>>
>> Doh!
>>
>> I mean, well done sir, you have successfully passed my little test!
>
> Try this:
>
> def str_width(s):
>    width=1
>    for ch in map(ord,s):
>      if ch > 0xFFFF: return 4
>      if ch > 0xFF: width=2
>    return width
>
> ChrisA
>

Given the quality of some code posted here recently this patch can't be 
accepted until there are some unit tests :)
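
A minimal sketch of what such tests could look like, assuming the second
comparison in the quoted function is meant to test `ch`. The test class
and method names are illustrative and not from the thread; the cases
cover the boundaries at 0xFF and 0xFFFF, plus the ordering case Ian
pointed out (a BMP character followed later by an SMP character).

```python
import unittest


def str_width(s):
    """Return the bytes per character (1, 2 or 4) needed to store s."""
    width = 1
    for ch in map(ord, s):
        if ch > 0xFFFF:
            # Nothing can be wider than 4, so it is safe to stop here.
            return 4
        if ch > 0xFF:
            # Remember the wider size but keep scanning for SMP characters.
            width = 2
    return width


class StrWidthTests(unittest.TestCase):
    def test_empty_and_latin1(self):
        self.assertEqual(str_width(""), 1)
        self.assertEqual(str_width("abc"), 1)
        self.assertEqual(str_width("\xff"), 1)

    def test_bmp_boundaries(self):
        self.assertEqual(str_width("\u0100"), 2)
        self.assertEqual(str_width("\uffff"), 2)

    def test_smp(self):
        self.assertEqual(str_width("\U00010000"), 4)

    def test_bmp_followed_by_smp(self):
        # The case the short-circuiting version got wrong.
        self.assertEqual(str_width("\u0100\U0001F600"), 4)

    def test_bmp_between_latin1(self):
        self.assertEqual(str_width("a\u20acb"), 2)
```

Saved to a file, the tests can be run with `python -m unittest <file>`.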

-- 
If you're using GoogleCrap™ please read this 
http://wiki.python.org/moin/GoogleGroupsPython.

Mark Lawrence



