Assignment Versus Equality

Chris Angelico rosuav at gmail.com
Wed Jun 29 04:09:42 EDT 2016


On Wed, Jun 29, 2016 at 4:45 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> On Wednesday 29 June 2016 15:51, Lawrence D’Oliveiro wrote:
>
>> On Wednesday, June 29, 2016 at 5:26:46 PM UTC+12, Steven D'Aprano wrote:
>>> BUT in Python 3, the distinction between int and long is gone by dropping
>>> int and renaming long as "int". So all Python ints are BIGNUMs.
>>
>> I don’t understand what the problem is with this. Is there supposed to be
>> some issue with performance? Because I can’t see it.
>
> If there is a performance hit, it's probably pretty small. It may have been
> bigger back in Python 3.0 or 3.1.
>
> [steve@ando ~]$ python2.7 -m timeit -s "n = 0" "for i in xrange(10000): n += i"
> 100 loops, best of 3: 1.87 msec per loop
>
> [steve@ando ~]$ python3.3 -m timeit -s "n = 0" "for i in range(10000): n += i"
> 1000 loops, best of 3: 1.89 msec per loop
>
>
> Although setting debugging options does make it pretty slow:
>
> [steve@ando ~]$ python/python-dev/3.6/python -m timeit -s "n = 0" "for i in range(10000): n += i"
> 100 loops, best of 3: 13.7 msec per loop

That's not necessarily fair - you're comparing two quite different
Python interpreters, so there might be something entirely different
that counteracts the integer performance. (For example: You're
creating and disposing of large numbers of objects, so the performance
of object creation could affect things hugely.) To make it somewhat
fairer, add long integer performance to the mix. Starting by redoing
your test:

rosuav@sikorsky:~$ python2.7 -m timeit -s "n = 0" "for i in xrange(10000): n += i"
10000 loops, best of 3: 192 usec per loop
rosuav@sikorsky:~$ python2.7 -m timeit -s "n = 1<<100" "for i in xrange(10000): n += i"
1000 loops, best of 3: 478 usec per loop
rosuav@sikorsky:~$ python3.4 -m timeit -s "n = 0" "for i in range(10000): n += i"
1000 loops, best of 3: 328 usec per loop
rosuav@sikorsky:~$ python3.4 -m timeit -s "n = 1<<100" "for i in range(10000): n += i"
1000 loops, best of 3: 337 usec per loop
rosuav@sikorsky:~$ python3.5 -m timeit -s "n = 0" "for i in range(10000): n += i"
1000 loops, best of 3: 369 usec per loop
rosuav@sikorsky:~$ python3.5 -m timeit -s "n = 1<<100" "for i in range(10000): n += i"
1000 loops, best of 3: 356 usec per loop
rosuav@sikorsky:~$ python3.6 -m timeit -s "n = 0" "for i in range(10000): n += i"
1000 loops, best of 3: 339 usec per loop
rosuav@sikorsky:~$ python3.6 -m timeit -s "n = 1<<100" "for i in range(10000): n += i"
1000 loops, best of 3: 343 usec per loop

(On this system, python3.4 and python3.5 are Debian-shipped builds of
CPython, and python3.6 is one I compiled from hg today. There's no
visible variance between them, but I mention it just in case. I don't
have a python3.3 on here for a fair comparison with your numbers, sorry.)
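
For anyone who would rather run the same comparison from a script than
from the shell, here is a minimal sketch using the standard timeit
module. The helper name best_usec and its number/repeat defaults are my
own choices for illustration and are not what produced the figures
above. It uses range(), so it assumes Python 3 (substitute xrange() on
Python 2).

```python
import timeit

# Same loop body as the command lines above.
STMT = "for i in range(10000): n += i"


def best_usec(setup, number=100, repeat=3):
    """Return the best per-loop time in microseconds for STMT.

    `setup` is run once per repeat, before the timed loops, exactly as
    the -s option does on the command line.
    """
    timings = timeit.repeat(STMT, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    # Machine-word-sized start value versus one that is already a bignum.
    for setup in ("n = 0", "n = 1<<100"):
        print("%-12s %8.1f usec per loop" % (setup, best_usec(setup)))
```

Taking the minimum of the repeats mirrors the "best of 3" that the
command-line tool reports; absolute numbers will of course differ from
machine to machine.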

The way I read this, Python 2.7 is noticeably slower with bignums (478
usec against roughly 340-355 for the Python 3 builds), but visibly
faster with machine words (192 usec against roughly 330-370). Python 3,
on the other hand, has consistent performance whether the numbers fit
within a machine word or not - which is to be expected, since it uses
bignums for all integers. PyPy's performance shows an even more
dramatic gap:

rosuav@sikorsky:~$ pypy -m timeit -s "n = 0" "for i in xrange(10000): n += i"
100000 loops, best of 3: 7.59 usec per loop
rosuav@sikorsky:~$ pypy -m timeit -s "n = 1<<100" "for i in xrange(10000): n += i"
10000 loops, best of 3: 119 usec per loop
rosuav@sikorsky:~$ pypy --version
Python 2.7.10 (5.1.2+dfsg-1, May 17 2016, 18:03:30)
[PyPy 5.1.2 with GCC 5.3.1 20160509]

Sadly, Debian doesn't ship a pypy3 yet, so for consistency, I picked
up the latest available pypy2 and pypy3 from pypy.org.

rosuav@sikorsky:~/tmp$ pypy2-v5.3.1-linux64/bin/pypy -m timeit -s "n = 0" "for i in xrange(10000): n += i"
100000 loops, best of 3: 7.58 usec per loop
rosuav@sikorsky:~/tmp$ pypy2-v5.3.1-linux64/bin/pypy -m timeit -s "n = 1<<100" "for i in xrange(10000): n += i"
10000 loops, best of 3: 115 usec per loop
rosuav@sikorsky:~/tmp$ pypy3.3-v5.2.0-alpha1-linux64/bin/pypy3 -m timeit -s "n = 0" "for i in range(10000): n += i"
100000 loops, best of 3: 7.56 usec per loop
rosuav@sikorsky:~/tmp$ pypy3.3-v5.2.0-alpha1-linux64/bin/pypy3 -m timeit -s "n = 1<<100" "for i in range(10000): n += i"
10000 loops, best of 3: 115 usec per loop

These builds perform comparably to each other (and to the
Debian-shipped one, which is nice - as Adam Savage said, I love
consistent data!), and drastically differently between machine words
and bignums: about 7.6 usec against 115 usec, a factor of roughly 15.
So it looks like PyPy *does* have some sort of optimization going on
here, without ever violating the language spec.
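
To illustrate what staying within the language spec looks like from
the program's side, here is a minimal sketch, assuming Python 3 (the
variable names are just for illustration). Whatever representation an
interpreter picks internally, small and huge integers have the same
type and arithmetic stays exact across the machine-word boundary.

```python
small = 1
big = 1 << 100  # far too large for a 64-bit machine word

# On Python 3 both values are plain ints; there is no separate long type.
same_type = type(small) is type(big)

# Arithmetic stays exact when a result crosses back below the
# machine-word boundary.
round_trip = (big + small) - big

print(same_type, round_trip)
```

On Python 2 the first check would come out differently, since 1 << 100
is a long there.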

ChrisA


