[Python-checkins] python/dist/src/Objects longobject.c,1.127,1.128

tim_one@users.sourceforge.net tim_one@users.sourceforge.net
Mon, 12 Aug 2002 11:25:45 -0700


Update of /cvsroot/python/python/dist/src/Objects
In directory usw-pr-cvs1:/tmp/cvs-serv21116/python/Objects

Modified Files:
	longobject.c 
Log Message:
x_mul():  Made life easier for C optimizers in the "grade school"
algorithm.  MSVC 6 wasn't impressed <wink>.

Something odd:  the x_mul algorithm appears to get substantially worse
than quadratic time as the inputs grow larger:

bits in each input   x_mul time   k_mul time
------------------   ----------   ----------
             15360         0.01         0.00
             30720         0.04         0.01
             61440         0.16         0.04
            122880         0.64         0.14
            245760         2.56         0.40
            491520        10.76         1.23
            983040        71.28         3.69
           1966080       459.31        11.07

That is, x_mul is perfectly quadratic-time until a little burp at
2.56->10.76, and after that goes to hell in a hurry.  Under Karatsuba,
doubling the input size "should take" 3 times longer instead of 4, and
that remains the case throughout this range.  I conclude that my "be nice
to the cache" reworkings of k_mul() are paying.


Index: longobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/longobject.c,v
retrieving revision 1.127
retrieving revision 1.128
diff -C2 -d -r1.127 -r1.128
*** longobject.c	12 Aug 2002 17:36:03 -0000	1.127
--- longobject.c	12 Aug 2002 18:25:43 -0000	1.128
***************
*** 1540,1543 ****
--- 1540,1544 ----
  		twodigits f = a->ob_digit[i];
  		int j;
+ 		digit *pz = z->ob_digit + i;
  
  		SIGCHECK({
***************
*** 1546,1557 ****
  		})
  		for (j = 0; j < size_b; ++j) {
! 			carry += z->ob_digit[i+j] + b->ob_digit[j] * f;
! 			z->ob_digit[i+j] = (digit) (carry & MASK);
  			carry >>= SHIFT;
  		}
  		for (; carry != 0; ++j) {
  			assert(i+j < z->ob_size);
! 			carry += z->ob_digit[i+j];
! 			z->ob_digit[i+j] = (digit) (carry & MASK);
  			carry >>= SHIFT;
  		}
--- 1547,1558 ----
  		})
  		for (j = 0; j < size_b; ++j) {
! 			carry += *pz + b->ob_digit[j] * f;
! 			*pz++ = (digit) (carry & MASK);
  			carry >>= SHIFT;
  		}
  		for (; carry != 0; ++j) {
  			assert(i+j < z->ob_size);
! 			carry += *pz;
! 			*pz++ = (digit) (carry & MASK);
  			carry >>= SHIFT;
  		}