Performance of int/long in Python 3

Chris Angelico rosuav at gmail.com
Wed Apr 3 10:17:28 EDT 2013


On Thu, Apr 4, 2013 at 12:43 AM, Roy Smith <roy at panix.com> wrote:
> This has to inspect the entire string, no?  I posted (essentially) this
> a few days ago:
>
>     if all(ord(c) <= 0xffff for c in s):
>         return "it's all bmp"
>     else:
>         return "it's got astral crap in it"
>
> I'm reasonably sure all() is smart enough to stop at the first False
> value.

It does short-circuit, but it still has to scan the body of the
string. That's not too bad if it's all astral (it stops at the first
character), but if it's all BMP, it has to scan the whole string. In
the max() case, it has to scan the whole string anyway, as there's no
other way to determine the maximum. I'm thinking here of this function:

http://pike.lysator.liu.se/generated/manual/modref/ex/7.2_3A_3A/String/width.html

It's implemented as a simple lookup into the header. (Pike strings,
like PEP 393 strings, are stored in the most compact way possible - 1,
2, or 4 bytes per character - with a conceptually similar header
structure.) Is this something that would be worth having available?
Should I post an issue about it?
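
To make the comparison concrete, here's a rough sketch of what the
scanning versions look like in pure Python. The names is_all_bmp and
string_width are made up for illustration, not an existing API; the
8/16/32 return values follow what I understand Pike's String.width()
to report. Both functions walk the string, which is exactly the cost a
header lookup would avoid.

```python
def is_all_bmp(s):
    """True if every character fits in 16 bits.

    all() short-circuits, so this stops at the first astral character,
    but an all-BMP string is still scanned end to end.
    """
    return all(ord(c) <= 0xFFFF for c in s)


def string_width(s):
    """Hypothetical helper: bits per character needed to store s.

    Returns 8, 16, or 32. Computed by scanning the whole string with
    max(); a header-based version would be a constant-time lookup.
    """
    # max() raises ValueError on an empty sequence, so guard for "".
    highest = max(map(ord, s)) if s else 0
    if highest <= 0xFF:
        return 8
    if highest <= 0xFFFF:
        return 16
    return 32
```

For example, string_width("abc") gives 8 and
string_width("\U00010000") gives 32, each after a full pass over the
string.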

ChrisA

More for my own reference than anyone else's, the source of Pike's String.width():
http://pike-git.lysator.liu.se/gitweb.cgi?p=pike.git;a=blob;f=src/builtin.cmod;hb=HEAD#l1077


