Performance of int/long in Python 3

Roy Smith roy at panix.com
Wed Apr 3 20:49:15 EDT 2013


In article <515c448c$0$29966$c3e8da3$5496439d at news.astraweb.com>,
 Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:

> On Wed, 03 Apr 2013 09:43:06 -0400, Roy Smith wrote:
> 
> [...]
> >> n = max(map(ord, s))
> >> 4 if n > 0xffff else 2 if n > 0xff else 1
> > 
> > This has to inspect the entire string, no?
> 
> Correct. A more efficient implementation would be:
> 
> def char_size(s):
>     for n in map(ord, s):
>         if n > 0xFFFF: return 4
>         if n > 0xFF: return 2
>     return 1
> 
> 
> 
> > I posted (essentially) this a few days ago:
> > 
> >     if all(ord(c) <= 0xffff for c in s):
> >         return "it's all bmp"
> >     else:
> >         return "it's got astral crap in it"
> 
> 
> It's not "astral crap". People use it, and they'll use it more in the 
> future. Just because you don't, doesn't give you leave to make 
> disparaging remarks about it.
> 
> Honestly, it's really painful to see how history repeats itself:
> 
> "Bah humbug, why do we need to support the SMP astral crap? The Unicode 
> BMP is more than enough for everybody."

Come on, guys.  It was a joke.  I'm the guy who was complaining that my 
database doesn't support non-BMP, remember?
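
For anyone following the code rather than the banter: the quoted early-exit char_size returns 2 as soon as it sees a character above 0xFF, so an astral character later in the string would be missed. Below is a minimal sketch of how the two checks from this thread could be written, assuming the goal is the widest per-character width (1, 2, or 4 bytes). The name is_all_bmp is hypothetical, made up here for illustration; it is not from anyone's post.

```python
def char_size(s):
    """Return the widest per-character width needed for s: 1, 2, or 4 bytes."""
    size = 1
    for n in map(ord, s):
        if n > 0xFFFF:
            return 4  # nothing can be wider, so stop scanning here
        if n > 0xFF:
            size = 2  # keep scanning: an astral character may still follow
    return size


def is_all_bmp(s):
    """Return True if every character of s lies in the Basic Multilingual Plane."""
    # all() short-circuits on the first astral character it meets.
    return all(ord(c) <= 0xFFFF for c in s)
```

Both still walk the whole string in the worst case (a pure-ASCII or pure-BMP string), but they stop at the first astral character, which max(map(ord, s)) cannot do.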


