find all multiplicands and multipliers for a number
Chris Angelico
rosuav at gmail.com
Sat Apr 11 17:10:47 EDT 2015
On Sun, Apr 12, 2015 at 3:52 AM, Paul Rubin <no.email at nospam.invalid> wrote:
>> PS Note that you're being "wasteful" by multiplying c*c over and over
>
> Yeah this is a reasonable point, though most of the c's should fit in a
> machine word, at least in my 64-bit system. I think Python still
> separates ints and longs in the implementation.
I don't think it does. Performance doesn't seem to change in Py3 as
the numbers get bigger:
rosuav at sikorsky:~$ cat perftest.py
def naive_sum(top,base=0):
i=0
while i<top:
i+=1
base+=i
return base
# Correctness test
print("Sum of numbers from 1 to 10 is: %d"%naive_sum(10))
print("Sum of numbers from 1 to 20 is: %d"%(naive_sum(20,1000)-1000))
import timeit
for base in (0, 10, 60, 100, 200):
print("Base: 2**%d"%base)
# Simpler than doing the whole iteration-judging thing ourselves
timeit.main([
"-s","from __main__ import naive_sum",
"naive_sum(1000,%d)"%(2**base)
])
rosuav at sikorsky:~$ python2 perftest.py
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
10000 loops, best of 3: 89.6 usec per loop
Base: 2**10
10000 loops, best of 3: 89.9 usec per loop
Base: 2**60
10000 loops, best of 3: 92.3 usec per loop
Base: 2**100
10000 loops, best of 3: 145 usec per loop
Base: 2**200
10000 loops, best of 3: 153 usec per loop
Python 2.7: Clear difference in timing once all the numbers we're
using are above 2**64.
rosuav at sikorsky:~$ python3 perftest.py
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
10000 loops, best of 3: 145 usec per loop
Base: 2**10
10000 loops, best of 3: 144 usec per loop
Base: 2**60
10000 loops, best of 3: 155 usec per loop
Base: 2**100
10000 loops, best of 3: 151 usec per loop
Base: 2**200
10000 loops, best of 3: 156 usec per loop
Python 3.5: Much more consistent timing. Similarly, sticking an
explicit "L" suffix onto the number gives Py2 consistent timings:
rosuav at sikorsky:~$ python perftest.py
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
10000 loops, best of 3: 130 usec per loop
Base: 2**10
10000 loops, best of 3: 129 usec per loop
Base: 2**60
10000 loops, best of 3: 138 usec per loop
Base: 2**100
10000 loops, best of 3: 139 usec per loop
Base: 2**200
10000 loops, best of 3: 140 usec per loop
So it's the data type, not the size of the numbers, that makes the difference.
rosuav at sikorsky:~$ pike perftest.pike
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
100000 loops, best of 3: 18.3 usec per loop
Base: 2**10
100000 loops, best of 3: 18.5 usec per loop
Base: 2**60
100000 loops, best of 3: 18.5 usec per loop
Base: 2**100
10000 loops, best of 3: 398 usec per loop
Base: 2**200
10000 loops, best of 3: 406 usec per loop
Like Python 3, Pike has a single "int" type which stores
arbitrary-precision integers. Like Python 2, it has an optimization
for small ones. (One you can't defeat by using "2L".) Interestingly,
once you defeat Pike's optimizer, it's actually quite a lot slower
than Py3 at repeated bignum arithmetic.
Conclusion: Python 3 has one single integer type with consistent
performance across the board. "Machine word" is a meaningless concept
to Py3.
ChrisA
More information about the Python-list
mailing list