str(bigint) is slow

Bengt Richter bokr at oz.net
Sat Jul 10 14:43:54 EDT 2004


On Fri, 9 Jul 2004 16:34:03 +1000, "Delaney, Timothy C (Timothy)" <tdelaney at avaya.com> wrote:

>Bryan wrote:
>
>> does anyone know how to make this faster?  it seems that str(x) is
>> the slow part.
>>
>>  >>> def foo():
>> ...     t1 = time.time()
>> ...     x = 19 ** 314159
>> ...     t2 = time.time()
>> ...     y = str(x)
>> ...     t3 = time.time()
>> ...     print y
>> ...     t4 = time.time()
>> ...     print t2-t1, t3-t2, t4-t3
>> ...
>>  >>> foo()
>> <snip a print out of a billion numbers>
>> 3.78499984741 230.490999937 0.0700001716614
>
>401732 digits actually ... that's not even half a million ...
>
>There was a recent thread titled "How to get decimal form of largest
>known prime?" ...
>
>http://tinyurl.com/3b2no
>
>http://groups.google.com.au/groups?hl=en&lr=&ie=UTF-8&threadm=2is27mFqeen8U1%40uni-berlin.de&rnum=1&prev=/groups%3Fhl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D2is27mFqeen8U1%2540uni-berlin.de
>
A couple of years ago I wrote a recursive decimal conversion routine that seemed to
improve on str() quite a bit:

http://groups.google.com/groups?as_ugroup=comp.lang.python&as_umsgid=3b4bab70.1586151915%40wa.news.verio.net
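
As an aside, the core idea can be sketched in a few lines (a minimal
illustration of the same divide-and-conquer approach, not the actual strL
code; the names str_big and _pow10 are made up here): the win over str()
comes from splitting the number in half with divmod by a cached power of
ten, instead of peeling off one digit at a time.

 _pow10_cache = {}   # exponent -> 10**exponent, reused across calls

 def _pow10(k):
     p = _pow10_cache.get(k)
     if p is None:
         p = _pow10_cache[k] = 10 ** k
     return p

 def str_big(n, cutoff=40):
     """Decimal string of a non-negative integer, by recursive halving."""
     if n < _pow10(cutoff):
         return str(n)                 # small enough: builtin is fine
     k = cutoff
     while _pow10(2 * k) <= n:         # grow k until n has at most 2k digits
         k *= 2
     hi, lo = divmod(n, _pow10(k))     # split into high and low halves
     # the low half must be zero-padded to exactly k digits
     return str_big(hi, cutoff) + str_big(lo, cutoff).zfill(k)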

This is on the same old 300MHz P2 with 320MB RAM, so a modern computer might go an
order of magnitude faster:

[10:07] C:\pywk\fib>strl.py strL 19**314159
strL took 279.600129 seconds

(Note that the eval of the command-line expression is not secure, so the timing hack
should not be allowed to run in untrusted contexts.)
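
For anyone rerunning the measurement, here is a fixed-expression harness
(a sketch, not the actual strl.py driver) that avoids eval entirely. Note
that recent Pythons (3.11+) refuse str() on integers beyond 4300 digits
unless you raise the limit first:

 import sys, time

 # Python 3.11+ caps str(int) at 4300 digits by default; lift the cap
 # so the conversion below is allowed to run at all.
 if hasattr(sys, 'set_int_max_str_digits'):
     sys.set_int_max_str_digits(1000000)

 def time_conversion(base=19, exponent=314159):
     t0 = time.time()
     x = base ** exponent
     t1 = time.time()
     s = str(x)                        # the conversion being measured
     t2 = time.time()
     print('pow: %.2fs  str: %.2fs  digits: %d' % (t1 - t0, t2 - t1, len(s)))

 if __name__ == '__main__':
     time_conversion()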

You can use it like this (though note that it has had no significant testing):

 [10:22] C:\pywk\fib>python
 Python 2.3.2 (#49, Oct  2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on win32
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import strL
 >>> from time import clock
 >>> if 1:
 ...     t0=clock()
 ...     bigno = 19**314159
 ...     print clock()-t0
 ...
 12.7305390125
 >>> if 1:
 ...     t0=clock()
 ...     bigstr = strL.strL(bigno)
 ...     print clock()-t0
 ...
 279.613874497
 >>> len(bigstr)
 401732
 >>> print bigstr[:20],'...',bigstr[-20:]
 89644105862678522181 ... 98160931303747570779

(Does that match other test results?)
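
The digit count, at least, can be checked independently: b**e has
floor(e * log10(b)) + 1 decimal digits, which for 19**314159 gives 401732,
matching len(bigstr) above. A one-off check:

 import math
 print(int(math.floor(314159 * math.log10(19))) + 1)   # -> 401732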

I don't know why the first timing (which included computing 19**314159) did not go
down in the interactive test above; maybe there was a gc cycle, or the OS did something.
The cached powers of ten should benefit another trial, however. Let's see:

 >>> if 1:
 ...     t0=clock()
 ...     bigstr = strL.strL(bigno)
 ...     print clock()-t0
 ...
 271.413380356
 >>> len(bigstr)
 401732
 >>> print bigstr[:20],'...',bigstr[-20:]
 89644105862678522181 ... 98160931303747570779
 >>> if 1:
 ...     t0=clock()
 ...     bigstr = strL.strL(bigno)
 ...     print clock()-t0
 ...
 271.252725904

Hm, not a lot of variation in two tries ...
BTW, to compare algorithms more or less platform-independently, perhaps
we ought to use bogomips*seconds ("bogomi"?). Maybe timeit could import/cache the
appropriate scale factor from a module (created if the import fails), and
bogomi-normalized results could be an option? A rough sketch of the idea follows.
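
Something like this, say (hypothetical throughout: the calibrate module
name, the workload, the units), using a timed fixed workload in place of
real bogomips:

 import timeit

 def machine_factor():
     """Return a cached machine-speed factor, calibrating on first use."""
     try:
         import calibrate              # module written by a previous run
         return calibrate.FACTOR
     except ImportError:
         # time a fixed pure-Python workload; slower machines give
         # bigger factors
         factor = timeit.timeit('sum(range(1000))', number=10000)
         f = open('calibrate.py', 'w')
         f.write('FACTOR = %r\n' % factor)
         f.close()
         return factor

 def normalize(seconds):
     """Scale a raw timing by machine speed for cross-machine comparison."""
     return seconds / machine_factor()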

(Hope to be back more once I catch up...)

Regards,
Bengt Richter


