strings with moreover of one million of characters

Bengt Richter bokr at oz.net
Sun Feb 22 23:22:29 EST 2004


On Thu, 19 Feb 2004 08:13:29 +1100, "Delaney, Timothy C (Timothy)" <tdelaney at avaya.com> wrote:

>> From: Jos=E9 Carlos
>>=20
>> Is possible use strings with moreover of one million of characters?
>
>Yes. If you have enough memory, which shouldn't be a problem.
>
>> I try using int() but the program keep blocked.
>
>This is a completely separate issue. Presumably you are trying to =
>convert a string which specifies a *very* large integer into a python =
>long (which in 2.3 is done automatically by int).
>
>If you wait a very long time, and you have enough memory, then this =
>should work. But it will take a *very* long time.
>
>After you've done that, try multiplying or dividing by another very =
>large number ...
>
>Tim Delaney
>
If you have to do conversions, you can improve on brute force:

 >>> def pow10(p, cache={}):
 ...     try: return cache[p]
 ...     except KeyError: cache[p] = tmp = 10**p; return tmp
 ...
 >>> def bigs2int(s, small=10):
 ...     if len(s)<=small: return(int(s))
 ...     half = len(s)//2
 ...     return pow10(half)*bigs2int(s[:-half]) + bigs2int(s[-half:])
 ...

 Sanity check...

 >>> for s in [('1234567890'*3)[:x] for x in xrange(1,25)]:
 ...     print '%30s %30s' %(s, bigs2int(s))
 ...
                              1                              1
                             12                             12
                            123                            123
                           1234                           1234
                          12345                          12345
                         123456                         123456
                        1234567                        1234567
                       12345678                       12345678
                      123456789                      123456789
                     1234567890                     1234567890
                    12345678901                    12345678901
                   123456789012                   123456789012
                  1234567890123                  1234567890123
                 12345678901234                 12345678901234
                123456789012345                123456789012345
               1234567890123456               1234567890123456
              12345678901234567              12345678901234567
             123456789012345678             123456789012345678
            1234567890123456789            1234567890123456789
           12345678901234567890           12345678901234567890
          123456789012345678901          123456789012345678901
         1234567890123456789012         1234567890123456789012
        12345678901234567890123        12345678901234567890123
       123456789012345678901234       123456789012345678901234

 >>> import time
 >>> for lognc in xrange(11):
 ...     nc = 10**lognc
 ...     s = '1'+'0'*nc
 ...     t0 = time.clock()
 ...     ibig = bigs2int(s)
 ...     t1 = time.clock()
 ...     iint = int(s)
 ...     t2 = time.clock()
 ...     assert ibig==iint
 ...     print '%10s: %30.9f  %30.9f'%(nc, t1-t0, t2-t1)
 ...
          1:                    0.000078781                     0.000020952
         10:                    0.000436648                     0.000218743
        100:                    0.001541257                     0.000284114
       1000:                    0.007652646                     0.005485332
      10000:                    0.075087455                     0.488497983
     100000:                    1.986088459                    54.656944205
    1000000:                   59.682104658                  6587.106510993

(I took a long walk to let that last int conversion finish ;-)

I'll leave it as an exercise to optimize the setting for small ;-)
Anyway, you can make a similar approach to conversion in the other direction,
beating the builtin decimal print for really large numbers.

And you could do smarter things than I did above. Or at least I'm sure the Timbot could ;-)

Regards,
Bengt Richter



More information about the Python-list mailing list