Assignment Versus Equality

BartC bc at freeuk.com
Wed Jun 29 09:24:57 EDT 2016


On 29/06/2016 13:36, Steven D'Aprano wrote:
> On Wed, 29 Jun 2016 06:09 pm, Chris Angelico wrote:

>> That's not necessarily fair - you're comparing two quite different
>> Python interpreters, so there might be something entirely different
>> that counteracts the integer performance.

> No, my test doesn't precisely compare performance of boxed native ints
> versus boxed BigNums for the same version, but I don't care about that. I
> care about whether the Python interpreter is slower at int arithmetic since
> unifying int and long, and my test shows that it isn't.

> For int arithmetic, the answer is No. I can make guesses and predictions
> about why there is no performance regression:
>
> - native ints were amazingly fast in Python 2.7, and BigNums in Python 3.3
> are virtually as fast;
>
> - native ints were horribly slow in Python 2.7, and changing to BigNums is
> no slower;
>
> - native ints were amazingly fast in Python 2.7, and BigNums in Python 3.3
> are horribly slow, BUT object creation and disposal was horribly slow in
> 2.7 and is amazingly fast in 3.3, so overall it works out about equal;
>
> - int arithmetic is so fast in Python 2.7, and xrange() so slow, that what I
> actually measured was just the cost of calling xrange, and by mere
> coincidence it happened to be almost exactly the same speed as bignum
> arithmetic in 3.3.
>
> But frankly, I don't really care that much. I'm not so much interested in
> micro-benchmarking individual features of the interpreter as caring about
> the overall performance, and for that, I think my test was reasonable and
> fair.

I think there is too much else going on in CPython, and those overheads 
dominate the timings rather than the actual integer arithmetic.

I used this little benchmark:

def fn():
    n = 0
    for i in range(1000000):
        n += i

for k in range(100):
    fn()

With CPython, Python 2 took 21 seconds (20 with xrange), while Python 3 
was 12.3 seconds (fastest times).
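
For anyone who wants to reproduce the figures, a harness roughly along 
these lines will do (just a timeit sketch that takes the best of a few 
repeats, not necessarily the exact way the numbers above were taken):

import timeit

def fn():
    n = 0
    for i in range(1000000):
        n += i

# Time 3 repeats of 100 calls each and report the fastest.
times = timeit.repeat(fn, number=100, repeat=3)
print(min(times), "seconds")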

I then ran the equivalent code under my own non-Python interpreter (but 
a version using 100% C to keep the test fair), and it was 2.3 seconds.

(That interpreter keeps 64-bit integers and bigints separate. The 64-bit 
integers are also value-types, not reference-counted objects.)

When I tried optimising versions, PyPy took 7 seconds, while mine took 
0.5 seconds.

Testing the same code in C, unoptimised it was 0.4 seconds, and 
optimised, 0.3 seconds (with n declared 'volatile' to stop the loop 
being eliminated completely).

So the actual work involved takes 0.3 seconds. That means Python 3 is 
spending 12.0 seconds on overheads. The extra overhead of dealing with 
bigints would simply get lost in there!

(If I test that same code using an explicit bigint for n, it's a 
different story. It's too complicated to set up in C, but it would 
likely take a lot more than 0.3 seconds. And my bigint library is 
hopelessly slow, taking some 35 seconds.

So from that point of view, Python is doing a good job of managing a 
12-second time using a composite integer/bigint type.

However, the vast majority of integer code /can be done within 64 bits/, 
probably even within 32 bits. But like I said, it's likely that overheads 
other than the bigint handling, which I would imagine is streamlined, are 
what come into play.)
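
As a rough illustration of why the bigint path itself isn't the 
interesting part: in CPython 3 every int is a full heap object, small or 
huge, and adding small ints isn't dramatically faster than adding genuine 
bignums. A quick sketch (exact numbers depend on the build):

import sys, timeit

# Every Python 3 int is a heap object, small or not:
print(sys.getsizeof(1))        # typically 28 bytes on a 64-bit build
print(sys.getsizeof(10**30))   # a genuine bignum is only somewhat larger

# Small-int addition versus bignum addition; the object and dispatch
# overheads tend to dominate, so the times are usually in the same ballpark:
print(timeit.timeit("n + 1", setup="n = 1", number=10**6))
print(timeit.timeit("n + 1", setup="n = 10**30", number=10**6))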

-- 
Bartc


