How to make Python run as fast (or faster) than Julia

bartc bc at freeuk.com
Mon Feb 26 07:13:56 EST 2018


On 26/02/2018 11:40, Chris Angelico wrote:
> On Mon, Feb 26, 2018 at 10:13 PM, bartc <bc at freeuk.com> wrote:
>> Below is the first draft of a Python port of a program to do with random
>> numbers. (Ported from my language, which in turned ported it from a C
>> program by George Marsaglia, the random number guy.)
>>
>> However, running it quickly exhausts the memory in my machine. The reason is
>> that Python unhelpfully widens all results to bignums as needed. The code
>> relies on calculations being modulo 2^64.
>>
>> Note that restricting integer ops to 64 bits probably still won't work, as I
>> believe the numbers need to be unsigned.
> 
> No, it's because the original implementation assumed integer
> wrap-around (at least, I think that's what's happening; I haven't
> analyzed the code in great detail). That means all your integer
> operations are doing two jobs: the one they claim to, and then a
> masking to 64 bits signed. That's two abstract operations that happen,
> due to the nature of the CPU, to work efficiently together. If you
> don't implement both halves of that in your Python port, you have
> failed at porting. What if you were porting a program from a 72-bit
> chip that assumed Binary Coded Decimal? Would you complain that C's
> integers are badly behaved?
> 
> And that's without even asking whether a C program that assumes
> integer wrap-around counts as portable. At least with Python, you have
> a guarantee that integer operations are going to behave the same way
> on all compliant implementations of the language.


A C version is given below. (One I may have messed around with, which 
I'm not sure works properly. For an original, google for Marsaglia and 
KISS64 or SUPRKISS64.)

Most integers are unsigned, which have well-defined overflow in C (they 
just wrap as expected). In C, a mixed signed/unsigned op is performed as 
unsigned.

-----------------------------

/*   SUPRKISS64.c, period 5*2^1320480*(2^64-1)   */
#include <stdio.h>
#include <stdlib.h>
#include "timer.c"
  static signed long long Q[20632],
                  carry=36243678541LL,
                  xcng=12367890123456LL,
                  xs=521288629546311LL,
                  indx=20632;

  #define CNG ( xcng=6906969069LL*xcng+123 )
  #define XS  ( xs^=xs<<13,xs^=xs>>17,xs^=xs<<43 )
  #define SUPR ( indx<20632 ? Q[indx++] : refill() )
  #define KISS SUPR+CNG+XS

  signed long long refill(void) {
   int i; signed long long z,h;
   for(i=0;i<20632;i++) {
     h = (carry&1);
     z = ((Q[i]<<41)>>1)+((Q[i]<<39)>>1)+(carry>>1);
     carry = (Q[i]>>23)+(Q[i]>>25)+(z>>63);
     Q[i] = ~((z<<1)+h);
   }
   indx=1;
   return (Q[0]);
  }

  int main() {
   int i; signed long long x;

   for(i=0;i<20632;i++) Q[i]=CNG+XS;

   for(i=0;i<1000000000;i++) x=KISS;

   printf("Does x=4013566000157423768\n     x=%llu.\n",x);
}

-- 
bartc



More information about the Python-list mailing list