[Python-checkins] python/dist/src/Objects longobject.c,1.134,1.135

tim_one@users.sourceforge.net tim_one@users.sourceforge.net
Tue, 13 Aug 2002 13:37:56 -0700


Update of /cvsroot/python/python/dist/src/Objects
In directory usw-pr-cvs1:/tmp/cvs-serv9254/python/Objects

Modified Files:
	longobject.c 
Log Message:
k_mul():  The fix for (ah+al)*(bh+bl) spilling 1 bit beyond the allocated
space is no longer needed, so removed the code.  It was only possible when
a degenerate (ah->ob_size == 0) split happened, but after that fix went
in I added k_lopsided_mul(), which saves the body of k_mul() from seeing
a degenerate split.  So this removes code, and adds a honking long comment
block explaining why spilling out of bounds isn't possible anymore.  Note:
ff we end up spilling out of bounds anyway <wink>, an assert in v_iadd()
is certain to trigger.


Index: longobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/longobject.c,v
retrieving revision 1.134
retrieving revision 1.135
diff -C2 -d -r1.134 -r1.135
*** longobject.c	13 Aug 2002 00:24:58 -0000	1.134
--- longobject.c	13 Aug 2002 20:37:51 -0000	1.135
***************
*** 1651,1656 ****
--- 1651,1659 ----
  		return k_lopsided_mul(a, b);
  
+ 	/* Split a & b into hi & lo pieces. */
  	shift = bsize >> 1;
  	if (kmul_split(a, shift, &ah, &al) < 0) goto fail;
+ 	assert(ah->ob_size > 0);	/* the split isn't degenerate */
+ 
  	if (kmul_split(b, shift, &bh, &bl) < 0) goto fail;
  
***************
*** 1736,1748 ****
  	assert(t3->ob_size >= 0);
  
! 	/* Add t3.  Caution:  t3 can spill one bit beyond the allocated
! 	 * result space; it's t3-al*bl-ah*bh that always fits.  We have
! 	 * to arrange to ignore the high bit.
  	 */
- 	if (t3->ob_size > i) {
- 		assert(t3->ob_size == i+1);	/* just one digit over */
- 		assert(t3->ob_digit[t3->ob_size - 1] == 1); /* & just one bit */
- 		--t3->ob_size;	/* ignore the overflow bit */
- 	}
  	(void)v_iadd(ret->ob_digit + shift, i, t3->ob_digit, t3->ob_size);
  	Py_DECREF(t3);
--- 1739,1745 ----
  	assert(t3->ob_size >= 0);
  
! 	/* Add t3.  It's not obvious why we can't run out of room here.
! 	 * See the (*) comment after this function.
  	 */
  	(void)v_iadd(ret->ob_digit + shift, i, t3->ob_digit, t3->ob_size);
  	Py_DECREF(t3);
***************
*** 1758,1761 ****
--- 1755,1797 ----
  	return NULL;
  }
+ 
+ /* (*) Why adding t3 can't "run out of room" above.
+ 
+ We allocated space for asize + bsize result digits.  We're adding t3 at an
+ offset of shift digits, so there are asize + bsize - shift allocated digits
+ remaining.  Because degenerate shifts of "a" were weeded out, asize is at
+ least shift + 1.  If bsize is odd then bsize == 2*shift + 1, else bsize ==
+ 2*shift.  Therefore there are at least shift+1 + 2*shift - shift =
+ 
+ 2*shift+1 allocated digits remaining when bsize is even, or at least
+ 2*shift+2 allocated digits remaining when bsize is odd.
+ 
+ Now in bh+bl, if bsize is even bh has at most shift digits, while if bsize
+ is odd bh has at most shift+1 digits.  The sum bh+bl has at most
+ 
+ shift   digits plus 1 bit when bsize is even
+ shift+1 digits plus 1 bit when bsize is odd
+ 
+ The same is true of ah+al, so (ah+al)(bh+bl) has at most
+ 
+ 2*shift   digits + 2 bits when bsize is even
+ 2*shift+2 digits + 2 bits when bsize is odd
+ 
+ If bsize is even, we have at most 2*shift digits + 2 bits to fit into at
+ least 2*shift+1 digits.  Since a digit has SHIFT bits, and SHIFT >= 2,
+ there's always enough room to fit the 2 bits into the "spare" digit.
+ 
+ If bsize is odd, we have at most 2*shift+2 digits + 2 bits to fit into at
+ least 2*shift+2 digits, and there's not obviously enough room for the
+ extra two bits.  We need a sharper analysis in this case.  The major
+ laziness was in the "the same is true of ah+al" clause:  ah+al can't actually
+ have shift+1 digits + 1 bit unless bsize is odd and asize == bsize.  In that
+ case, we actually have (2*shift+1)*2 - shift = 3*shift + 2 allocated digits
+ remaining, and that's obviously plenty to hold 2*shift + 2 digits + 2 bits.
+ Else (bsize is odd and asize < bsize) ah and al each have at most shift digits,
+ so ah+al has at most shift digits + 1 bit, and (ah+al)*(bh+bl) has at most
+ 2*shift+1 digits + 2 bits, and again 2*shift+2 digits + 2 bits is
+ enough to hold it.
+ */
  
  /* b has at least twice the digits of a, and a is big enough that Karatsuba