How to get decimal form of largest known prime?

Tue Jun 15 20:16:24 EDT 2004

>> [Tim Peters]
>> The fastest pure-Python solution I've found consumed about 20 CPU minutes
>> Sorry, I really can't make time to review unfamiliar code today.  Instead
>> I'll just show the code I used.
Thank you _very much_ for the code and for all your postings -
it was a challenge for me as Python newbie to understand it,
but I learned from it much about Python syntax.
(by the way: my machine needs about 25 CPU minutes)
I still haven't got any clear idea why your code runs so much faster than
mine,
but I see, that you are using shifting and own algorithm for pow(), so
that's
probably the core of the improvement (I haven't used shifting and pow(),
only
standard Python multiplication without any special algorithms behind).
As it is often the case, the more I know, the more questions comes up,
as e.g. why there is probably no common base-independent algorithm for
binary -> decimal conversion, or what is the amount of information in
knowing the largest prime? Is it 2**24036583-1  i.e. around 13 byte,
or about 3 MByte after maximum compression ratio which can be
reached on the decimal representation, or these small amount of bytes
after compressing the hexadecimal representation?

To stay on-topic: I did some work rewriting your code and provide
here code of  a bigDecReprOfInt class as a kind of module and
stand-alone app in one. You can print the largest prime with it
(setting False to True) and perform timing comparisons (see comments
within the code for details).

***** REPLY to following question will be welcome: *****
What I can't understand is, why if I write in __main__() e.g. :

bigDecimal = bigDecReprOfInt([2],80)**intExponentToTest

the bigDecimal "digits" still remain as defined in
 def __init__(self, lstOfDigits, intNoOfDecimalDigitsInOneBigDigit = 4096):

i.e.  4096  and not as I would expect it  80 decimal digits long

Haven't I proper understood how to set/overwrite default values
of function parameters? Makes it a difference if it is a function
defined within a class or stand-alone? Have I faced a bug?

***** REPLY to question above will be welcome: *****

Now knowing you [Tim Peters] as a Python and programming
expert, may I ask you even for more support by providing a
reply to my other posting to this newsgroup:
      "Python Scripting in Windows MSIE 6.0"?
Is there an easy way to activate and use it in Python 2.3.4?
I know, that there are much work-arounds possible, but
I would just prefere to write the Python within the
<script...> </script> tags within the HTML page using
it as a kind of GUI to Python code.

Claudio
P.S. here the code for use by everyone interested in
the topic of this discussion path:

class bigDecReprOfInt(object):
  """ "/*(UEdit)
  # The primary purpose of the class  bigDecReprOfInt  is speeding up of
'%i'%i
  # and str(i) conversion for very large integers i.
  # Example of a task where the advantage from usage of  bigDecReprOfInt
class
  #   for conversion of a large integer to string (using decimal form)
becomes
  #   apparent, is printing all digits of the largest known (May 2004) prime
  #                              2**24036583-1
  #   It can be done using
  #
  #                      "print '%i'%(2**24036583-1,)"
  #
  #   but it takes time in order of hours of busy CPU. Interesting is here,
that
  #   printing the hexadecimal form of same prime using
  #
  #                      "print '%X'%(2**24036583-1,)"
  #
  #   takes only seconds.
  #   Usage of the  bigDecReprOfInt  class to do the job:
  #
  #                      bigDecimal = bigDecReprOfInt([2])**24036583
  #                      bigDecimal.digits[0] -= 1
  #                      print bigDecimal
  #
  #   cuts the CPU time down to less than half an hour.
  #
  # Generally:
  #   comp.lang.python, Tim Peters <tim.one at comcast.net> June 13, 2004 16:35
:
  #   "The asymptotic complexity of the elementary arithmetic operations
  #     (+, -, *, /, integer power) is independent of base. ", but practical
  #   comp.lang.python, Tim Peters <tim.one at comcast.net> June 10, 2004 23:56
:
  #   "binary -> hex conversion time increases linear with the number of
bits
  #    (Python longs are stored internally in a form of binary
representation).
  #   binary -> decimal conversion time increases quadratic with the number
  #    of bits, therefore it is a very slow process for large numbers."
  # The principle:
  #   The core of the idea behind the approches to lower the CPU time
necessary
  #   for printing is to avoid binary -> hex conversion of very large
integers
  #   applying this conversion only to smaller ones, with low conversion
time.
  # The specific approach choosen for this class:
  #   Because the large integer must be splitted into smaller integers
  #   representing a given range of decimal digits of the integer before
  #   printing, this job can be done already while the large integer is
  #   created (using * or ** operators).
  #   This class inherited from class 'object' overwrites the __init__(),
  #   __imul__(), __mul__(), __pow__() and __str__() methods of 'object'
  #   exposing this way own * and ** operators and conversion to string.
  #   The code of this class is based on code of the BigDec class
  #   provided by "Tim Peters" <tim.one at comcast.net> in his contribution to
  #   newsgroup comp.lang.python (python-list at python.org) June 13, 2004
21:49.
  #   Packing all of the for BigDec necesarry code to one class and writing
  #   of introductory information and __main__() was done by "Claudio
Grondi"
  #   <claudio.grondi at freenet.de> June 15, 2004
  " """ #(UEdit)*/

  def __init__(self, lstOfDigits, intNoOfDecimalDigitsInOneBigDigit = 4096):
    # digits is a list of base-BASE digits, least-significant first.
    self.digits = lstOfDigits
    self.EXP    = intNoOfDecimalDigitsInOneBigDigit
    self.BASE   = (5L ** self.EXP) << self.EXP      # == 10**EXP
  #:def __init__(self, digits):

  def __mul__(self, other):
    # Force 'other' to have the fewer # of digits.
    if len(self.digits) < len(other.digits):
        self, other = other, self
    # Straightforward cross-product.  Don't worry about digits
    # getting out of range for the duration.
    result = [0] * (len(self.digits) + len(other.digits) - 1)
    for baseindex, otherdigit in enumerate(other.digits):
      for i, selfdigit in enumerate(self.digits):
        result[baseindex + i] += selfdigit * otherdigit
      #:for
    #:for
    self.normalize(result)
    return bigDecReprOfInt(result)
  #:def __mul__(self, other):

  __imul__ = __mul__

  def __pow__(self, intExponent):
    # "The usual" left-to-right efficient exponentiation algorithm, but
    # intExponent must be here of type integer, not bigDecReprOfInt.
    mask = 1L
    while mask <= intExponent:
      mask <<= 1
    #:while mask:
    mask >>= 1  # position of intExponent's most-significant bit

    result = bigDecReprOfInt([1])
    while mask:
      result = result.square()
      if intExponent & mask:
        result *= self
      #:if
      mask >>= 1
    #:while mask:
    return result
  #:def __pow__(self, intExponent):

  def square(self):
    # Efficient divide-and-conquer squaring, based on
    #   (a*x+b)**2 = a**2*x**2 + 2*a*b*x + b**2 :
    intNoOfDigits = len(self.digits)
    if intNoOfDigits <= 5: return self * self #:intNoOfDigits <= 5: RETURN
    intSplitAt = intNoOfDigits // 2
    lo, hi =                                      \
        bigDecReprOfInt(self.digits[:intSplitAt]) \
      , bigDecReprOfInt(self.digits[intSplitAt:])
    result = lo.square().digits
    result += [0]*(2*intSplitAt - len(result))  # list multiplication
    result += hi.square().digits
    for intDigit in (lo * hi).digits:
      result[intSplitAt] += intDigit << 1
      intSplitAt += 1
    #:for
    self.normalize(result)
    return bigDecReprOfInt(result)
  #: def square(self):

  def __str__(self):
    result = map(str, self.digits)
    # Pad all digits except for the MSD with leading zeroes.
    for i in xrange(len(result) - 1):
      digit = result[i]
      if(len(digit) < self.EXP):
        result[i] = "0"*(self.EXP - len(digit)) + digit
      #:if
    #:for
    result.reverse()
    return "".join(result)
  #:def __str__(self):

  def normalize(self, lstOfDigits):
    # Force the digits in self.digits into range(BASE), by propagating
carries.
    # Function has no return value - self.digits is modified in-place.
    intCarry = 0
    # first loop over all of the existing digits ...
    for intIndexOfDigitInListOfDigits, intValueOfDigitInListOfDigits in
enumerate(lstOfDigits):
      intCarry += intValueOfDigitInListOfDigits
      intCarry, lstOfDigits[intIndexOfDigitInListOfDigits] =
divmod(intCarry, self.BASE)
    # ... then loop as long as it is necessary to append leading digits and
...
    while(intCarry >= self.BASE):
      intCarry, leftover = divmod(intCarry, self.BASE)
      lstOfDigits.append(leftover)
    # ... check if it necessary to append a last leading digit and ...
    if intCarry: lstOfDigits.append(intCarry) #:if intCarry:
    # strip leading zeroes ( 0 digits ):
    while lstOfDigits and lstOfDigits[-1] == 0: lstOfDigits.pop() #:while
  #:normalize(lstOfDigits)
#:class bigDecReprOfInt(object)

if __name__ == '__main__':
  # For getting decimal form of currently(May 2004) largest known prime
  #   i.e. 2**24036583-1  set blnPrintLargestKnownPrime to true:
  blnPrintLargestKnownPrime = False # True

  if(not blnPrintLargestKnownPrime):
    intExponentToTest =   500000
  else:
    intExponentToTest = 24036583
  #:if/else

  fileFOut = open('bigDecReprOfInt.out','w')
  import time
  print
  print '
============================================================================
=== '
  print
' --------------------------------------------------------------------------
----- '
  print
  print '     *** Usage of bigDecReprOfInt class for fast prints of large
integers ***'
  print
  print '     Print of first ... last 77 digits of 2**%i:
'%(intExponentToTest,)
  floatBegTime_bigDecimal = time.clock()
  bigDecimal = bigDecReprOfInt([2])**intExponentToTest
  if(blnPrintLargestKnownPrime):
    # bigDecReprOfInt does not(yet) support + and - operators and we want to
    # subtract one(1), so let it do in a generalized way:
    assert bigDecimal.digits[0] != 0
    # i.e. subtract one(1) only if last digit is non-zero (is the case
here):
    bigDecimal.digits[0] -= 1
  #:if
  strDecimal = str(bigDecimal)
  floatEndTime_bigDecimal = time.clock()
  print '   ' + strDecimal[0:77]
  print '   ' + '...'
  print '   ' + strDecimal[-77:]
  if(intExponentToTest <= 500000):
    print '     -------- started calculating of strDecimal =
str(2**%i)'%intExponentToTest
    floatBegTime_stdDecimal = time.clock()
    strDecimal = str(2**intExponentToTest)
    floatEndTime_stdDecimal = time.clock()
    print '   ' + strDecimal[0:77]
    print '   ' + '...'
    print '   ' + strDecimal[-77:]
    print '   str(bigDecReprOfInt([2])**%i) is  %4.2f  times faster than
str(2**%i)'%( \
        intExponentToTest, (floatEndTime_stdDecimal -
floatBegTime_stdDecimal)           \
      / (floatEndTime_bigDecimal - floatBegTime_bigDecimal),
intExponentToTest           \
    )
    # On Pentium III, 900 MHz :
    # str(bigDecReprOfInt([2])**500000) is  7.81  times faster than
str(2**500000)
  else:
    print
    print '     --------- skipping str(2**EXP) for EXP > 500000 ---------'
    print
    print                                                               \
      '  runtime of str(bigDecReprOfInt([2])**%i) is  %6.2f  minutes'%( \
      intExponentToTest,                                                \
      (floatEndTime_bigDecimal - floatBegTime_bigDecimal)/60            \
    )
  #:if/else
  fileFOut.write(strDecimal)
  print
  print '     See "bigDecReprOfInt.out" in curr. dir for value of
2**%i'%(intExponentToTest,)
  print
  fileFOut.close()

  # print                             # TEST
  # print bigDecimal.digits[0]        # TEST
  print
  print '     Print of first ... last 77 digits of (2**24036583-1) in
hexadecimal form: '
  floatBegTime_bigHexadecimal = time.clock()
  strDecimal = '%X'%(2**24036583-1)
  floatEndTime_bigHexadecimal = time.clock()
  print '   ' + strDecimal[0:77]
  print '   ' + '...'
  print '   ' + strDecimal[-77:]
  print                                                                   \
    '     runtime of strDecimal = \'%%X\'%%(2**24036583-1) is  %6.4f
sec'%( \
    (floatEndTime_bigDecimal - floatBegTime_bigDecimal),                  \
  )
  print
  print
' --------------------------------------------------------------------------
----- '
  print '
============================================================================
=== '

  raw_input('exit with ENTER: #\>  ')

#:if __name__ == '__main__'