Performance of int/long in Python 3
Chris Angelico
rosuav at gmail.com
Wed Apr 3 17:55:43 EDT 2013
On Thu, Apr 4, 2013 at 4:43 AM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> On Wed, 03 Apr 2013 10:38:20 -0600, Ian Kelly wrote:
>
>> On Wed, Apr 3, 2013 at 9:02 AM, Steven D'Aprano
>> <steve+comp.lang.python at pearwood.info> wrote:
>>> On Wed, 03 Apr 2013 09:43:06 -0400, Roy Smith wrote:
>>>
>>> [...]
>>>>> n = max(map(ord, s))
>>>>> 4 if n > 0xffff else 2 if n > 0xff else 1
>>>>
>>>> This has to inspect the entire string, no?
>>>
>>> Correct. A more efficient implementation would be:
>>>
>>> def char_size(s):
>>>     for n in map(ord, s):
>>>         if n > 0xFFFF: return 4
>>>         if n > 0xFF: return 2
>>>     return 1
>>
>> That's an incorrect implementation, as it would return 2 at the first
>> non-Latin-1 BMP character, even if there were SMP characters later in
>> the string. It's only safe to short-circuit return 4, not 2 or 1.
>
>
> Doh!
>
> I mean, well done sir, you have successfully passed my little test!
Try this:
def str_width(s):
    width=1
    for ch in map(ord,s):
        if ch > 0xFFFF: return 4  # nothing is wider, safe to stop
        if ch > 0xFF: width=2     # keep scanning, an SMP char may follow
    return width
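
For a quick sanity check, here is a self-contained sketch that runs the same logic over a few strings. The sample strings are made up purely for illustration; the fourth one is the case Ian pointed out, a BMP character followed by an SMP character.

```python
def str_width(s):
    """Return the bytes per character needed to store s: 1, 2 or 4."""
    width = 1
    for ch in map(ord, s):
        if ch > 0xFFFF:
            return 4      # nothing can be wider, so short-circuit
        if ch > 0xFF:
            width = 2     # remember it, but keep scanning
    return width

# Hypothetical sample strings, chosen only to exercise each branch.
samples = [
    "abc",                  # pure ASCII
    "caf\u00e9",            # Latin-1 only (U+00E9)
    "\u0394x",              # one BMP character above U+00FF
    "\u0394\U0001F600",     # BMP character first, SMP character later
]

for s in samples:
    print(ascii(s), str_width(s))
```

An early "return 2" would get the fourth string wrong, which is why only the 4 case short-circuits. Note that an empty string comes out as 1 here, which seems like the sensible default.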
ChrisA