Performance of int/long in Python 3
Ian Kelly
ian.g.kelly at gmail.com
Wed Apr 3 12:38:20 EDT 2013
On Wed, Apr 3, 2013 at 9:02 AM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> On Wed, 03 Apr 2013 09:43:06 -0400, Roy Smith wrote:
>
> [...]
>>> n = max(map(ord, s))
>>> 4 if n > 0xffff else 2 if n > 0xff else 1
>>
>> This has to inspect the entire string, no?
>
> Correct. A more efficient implementation would be:
>
> def char_size(s):
>     for n in map(ord, s):
>         if n > 0xFFFF: return 4
>         if n > 0xFF: return 2
>     return 1
That implementation is incorrect: it would return 2 at the first
non-Latin-1 BMP character, even if there were supplementary-plane (non-BMP)
characters later in the string. It's only safe to short-circuit on return 4,
not on 2 or 1.
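
A version that keeps the early exit for 4 but otherwise only remembers the
widest size seen so far might look like the sketch below. It reuses the
char_size name from the quoted code; the local variable "size" is my own, and
this is an illustration of the idea, not how CPython itself computes the width.

```python
def char_size(s):
    """Return the per-character width (1, 2, or 4 bytes) needed for s."""
    size = 1
    for n in map(ord, s):
        if n > 0xFFFF:
            # Nothing wider than 4 exists, so stopping here is safe.
            return 4
        if n > 0xFF:
            # Remember that 2 is needed, but keep scanning: a later
            # character may still be outside the BMP and require 4.
            size = 2
    return size
```

This still has to walk the whole string whenever it contains no non-BMP
characters, so the short-circuit only helps for strings that need 4.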