[ python-Bugs-1746088 ] long.__str__ is quadratic time

SourceForge.net noreply at sourceforge.net
Tue Jul 3 16:22:31 CEST 2007


Bugs item #1746088, was opened at 2007-07-01 15:42
Message generated for change (Comment added) made by marketdickinson
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1746088&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: David Benbennick (dbenbenn)
Assigned to: Nobody/Anonymous (nobody)
Summary: long.__str__ is quadratic time

Initial Comment:
In the 2.5.1 source code, Objects/longobject.c:long_format() is used to convert a long int to a string.  When base is not a power of 2, it is quadratic time in the length of the output string.  Of course, this is fine on small numbers, but is catastrophic on huge numbers.

Suppose base is 10.  The problem is that the algorithm basically does the following: take the number mod 10 to get a digit, stick that digit on the output, divide the number by 10, and repeat.

To print an n digit number, there is an O(n log n) algorithm, using divide-and-conquer.  You break the number into 2 pieces, each n/2 digits long, and iterate on both pieces.

Converting string to long has the same quadratic time problem, in PyLong_FromString().  The solution is the same, in reverse: break the string in half, convert each piece to a long, and combine the two longs into one.


Alternatively, Python could just use GMP (GNU MP Bignum Library, http://gmplib.org/) to provide long integers.  That would make other operations, such as * and /, more efficient, too.  But it would require a much bigger change.

----------------------------------------------------------------------

Comment By: Mark Dickinson (marketdickinson)
Date: 2007-07-03 14:22

Message:
Logged In: YES 
user_id=703403
Originator: NO

I'd call this a feature request rather than a bug.

If I understand correctly, an O(n^(1+epsilon)) printing algorithm would
rely on having an FFT-based fast multiplication algorithm, together with
some form of divide-and-conquer division---is this right?  These algorithms
are nontrivial to implement efficiently, and even then the crossover point
(where the FFT-based method becomes faster than the quadratic method) is
likely to be in the thousands of digits.  So I can't imagine there's much
demand for this---even a 4096-bit RSA key is only 1233 (or 1234) digits
long.  If you just want subquadratic printing (O(n^1.585) or so) then you'd
still need a subquadratic division (Python already has Karatsuba
multiplication for large integers); here I guess the crossover would be
smaller.  A subquadratic division for Python might well be of interest to
the developers, if someone could be persuaded to write and test one, and
demonstrate a significant positive impact on performance.

What's your use-case for printing huge integers fast?  It doesn't seem
like a very common need.

Regarding GMP, I believe there are licensing issues:  it's not legal to
include GMP in core Python and release Python under its current non-GPL
license, or something like that---I don't know anything about the details. 
But have you encountered Martelli's gmpy? 

http://code.google.com/p/gmpy/



----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1746088&group_id=5470


More information about the Python-bugs-list mailing list