Performance of int/long in Python 3

Roy Smith roy at panix.com
Wed Apr 3 09:43:06 EDT 2013


In article <515be00e$0$29891$c3e8da3$5496439d at news.astraweb.com>,
 Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:

> On Wed, 03 Apr 2013 18:24:25 +1100, Chris Angelico wrote:
> 
> > On Wed, Apr 3, 2013 at 6:06 PM, Ian Kelly <ian.g.kelly at gmail.com> wrote:
> >> On Wed, Apr 3, 2013 at 12:52 AM, Chris Angelico <rosuav at gmail.com>
> >> wrote:
> >>> Hmm. I was about to say "Can you just do a quick collections.Counter()
> >>> of the string widths in 3.3, as an easy way of seeing which ones use
> >>> BMP or higher characters", but I can't find a simple way to query a
> >>> string's width. Can't see it as a method of the string object, nor in
> >>> the string or sys modules. It ought to be easy enough at the C level -
> >>> just look up the two bits representing 'kind' - but I've not found it
> >>> exposed to Python. Is there anything?
> >>
> >> 4 if max(map(ord, s)) > 0xffff else 2 if max(map(ord, s)) > 0xff else 1
> > 
> > Yeah, that's iterating over the whole string (twice, if it isn't width
> > 4). 
> 
> Then don't write it as a one-liner :-P
> 
> n = max(map(ord, s))
> 4 if n > 0xffff else 2 if n > 0xff else 1

This has to inspect the entire string, no?  I posted (essentially) this 
a few days ago:

    if all(ord(c) <= 0xffff for c in s):
        return "it's all bmp"
    else:
        return "it's got astral crap in it"

I'm reasonably sure all() is smart enough to stop at the first False 
value.
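
A quick way to check the short-circuit claim is sketched below. The
counting() helper, the is_all_bmp() name, and the test string are made up
for illustration; the only assumption is Python 3.3 or later, where an
astral character is a single element of the string.

```python
def is_all_bmp(s):
    """Return True if every code point in s is at or below U+FFFF."""
    return all(ord(c) <= 0xFFFF for c in s)


def counting(iterable, counter):
    """Yield items unchanged, bumping counter[0] for each one consumed."""
    for item in iterable:
        counter[0] += 1
        yield item


# One astral character followed by 1000 ASCII characters.
s = "\U0001F600" + "a" * 1000

seen = [0]
result = all(ord(c) <= 0xFFFF for c in counting(s, seen))

# all() returns as soon as it sees a false value, so only the first
# character should have been pulled from the generator.
print(result, seen[0], len(s))
```

If the astral character were at the end of the string instead, all() would
still have to walk everything before it, so the worst case remains a full
scan.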


> (sys.getsizeof(s) - sys.getsizeof(''))/len(s)
> 
I wouldn't trust getsizeof() to return exactly what you're looking for.
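
To see how far the getsizeof() estimate can drift, here is a rough
comparison against the max(ord) approach quoted earlier. The function names
and sample strings are hypothetical, and the sizes printed are CPython
implementation details that can differ between versions and platforms.

```python
import sys


def width_by_ord(s):
    """Bytes per character implied by the largest code point in s."""
    if not s:
        return 1  # assumption: treat the empty string as width 1
    n = max(map(ord, s))
    return 4 if n > 0xFFFF else 2 if n > 0xFF else 1


def width_by_getsizeof(s):
    """The quoted estimate; divides by len(s), so s must be non-empty."""
    return (sys.getsizeof(s) - sys.getsizeof("")) / len(s)


samples = ["abc", "\u00e9", "\u20ac" * 5, "\U0001F600" * 5]
for s in samples:
    # ascii() keeps the output printable on any terminal encoding.
    print(ascii(s), width_by_ord(s), width_by_getsizeof(s))
```

On CPython 3.3 and later, the getsizeof() column should not be expected to
come out as a clean 1, 2, or 4 for short non-ASCII strings, because the
fixed per-object overhead is not the same for ASCII and non-ASCII strings,
and that difference gets divided by a small len(s).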


