How to make Python run as fast (or faster) than Julia

bartc bc at freeuk.com
Mon Feb 26 10:02:56 EST 2018


On 26/02/2018 14:04, bartc wrote:
> On 26/02/2018 13:42, Ned Batchelder wrote:

>   Well, once you notice that the
>> Python code had N=1e5, and the C code had N=1e9 :)   If you want to 
>> experiment, with N=1e5, the final number should be 5255210926702073855.
> 
> OK, I'll try that.

I have that Python version working now. It's necessary to apply that 
masking function to wherever numbers can get bigger.

I don't know how long a 1-billion loop will take, but a 10-million loop 
took 46 seconds on Python 3.6, and 21 seconds on PyPy 2.7 from a couple 
of years ago. (And on Windows, which has a somewhat slower CPython than 
Linux.)

Result should be x=11240129907685265998.

By comparison, the C version compiled with -O3 took 0.11 seconds.

(The C version I posted will work, if adjusted to a 10000000 loop, but 
you have to change 'signed' to 'unsigned'. Apparently they weren't 
interchangeable after all. I've no idea why I used 'signed' there.

That version is rather cryptic, but it can be better written and without 
the macros, and it will run just as fast. (Marsaglia may have been hot 
with random number routines, but his C could have done with some work...)

My interpreter, using 64-bit numbers, managed 4.8 seconds. But unsigned 
arithmetic, which is uncommon, is not accelerated.)

---------------------------------------------------

Q=0
carry=36243678541
xcng=12367890123456
xs=521288629546311
indx=20632

def i64(x): return x & 0xFFFFFFFFFFFFFFFF

def refill():
     global Q, carry, indx
     for i in range(20632):
         h = carry & 1
         z = i64((  i64((Q[i]<<41))>>1)+(i64((Q[i]<<39))>>1)+(carry>>1))
         carry = i64((Q[i]>>23)+(Q[i]>>25)+(z>>63))
         Q[i] = i64(~(i64(i64(z<<1)+h)))

     indx=1
     return Q[0]

def start():
     global Q, carry, xcng, xs, indx
     Q=[0,]*20632

     for i in range(20632):

         xcng=i64(6906969069 * xcng + 123)

         xs = i64(xs ^ (xs<<13))
         xs = i64(xs ^ (xs>>17))
         xs = i64(xs ^ (xs<<43))

         Q[i] = i64(xcng + xs)

     N = 10000000
     for i in range(N):
         if indx<20632:
             s = Q[indx]
             indx+=1
         else:
             s = refill()
         xcng=i64(6906969069 * xcng + 123)
         xs = i64(xs ^ (xs<<13))
         xs = i64(xs ^ (xs>>17))
         xs = i64(xs ^ (xs<<43))
         x = i64(s+xcng+xs)
     print ("Does x= 4013566000157423768")
     print ("     x=",x)

start()

-- 
bartc



More information about the Python-list mailing list