Python 3.3 vs. MSDOS Basic

Nick Mellor thebalancepro at gmail.com
Mon Feb 18 23:17:51 EST 2013


Hi John,

Thanks for the problem. I've been writing Python for about 4 years now and am beginning to feel like I'm writing much better Python code.

Python does fine on this problem if you play to its strengths. The following uses dictionary lookups to store previously computed sequence lengths, thus saving a lot of work. The problem is very "sparse", i.e. there are huge gaps between numbers that are actually used in the solution, making dictionaries a better fit than lists.

This code crosses the line in under 3s on a 64-bit laptop. MS-DOS BASIC anyone? :-)

I tried precomputing powers of 2 and multiples of 2, but to my surprise it made very little difference to timings. Even though precomputing n//2 is fast, I think again this is because the problem is sparse and the time the computer saves is not offset by the cost of precomputing many multiples of 2 that are never needed.

Best wishes,

Nick

And the winner is 837799 with sequence length 524
Time (s):  2.924168109893799
Sequence is:
[837799, 2513398, 1256699, 3770098, 1885049, 5655148, 2827574, 1413787, 4241362, 2120681, 6362044, 3181022, 1590511, 4771534, 2385767, 7157302, 3578651, 10735954, 5367977, 16103932, 8051966, 4025983, 12077950, 6038975, 18116926, 9058463, 27175390, 13587695, 40763086, 20381543, 61144630, 30572315, 91716946, 45858473, 137575420, 68787710, 34393855, 103181566, 51590783, 154772350, 77386175, 232158526, 116079263, 348237790, 174118895, 522356686, 261178343, 783535030, 391767515, 1175302546, 587651273, 1762953820, 881476910, 440738455, 1322215366, 661107683, 1983323050, 991661525, 2974984576, 1487492288, 743746144, 371873072, 185936536, 92968268, 46484134, 23242067, 69726202, 34863101, 104589304, 52294652, 26147326, 13073663, 39220990, 19610495, 58831486, 29415743, 88247230, 44123615, 132370846, 66185423, 198556270, 99278135, 297834406, 148917203, 446751610, 223375805, 670127416, 335063708, 167531854, 83765927, 251297782, 125648891, 376946674, 188473337, 565420012, 282710006, 141355003, 424065010, 212032505, 636097516, 318048758, 159024379, 477073138, 238536569, 715609708, 357804854, 178902427, 536707282, 268353641, 805060924, 402530462, 201265231, 603795694, 301897847, 905693542, 452846771, 1358540314, 679270157, 2037810472, 1018905236, 509452618, 254726309, 764178928, 382089464, 191044732, 95522366, 47761183, 143283550, 71641775, 214925326, 107462663, 322387990, 161193995, 483581986, 241790993, 725372980, 362686490, 181343245, 544029736, 272014868, 136007434, 68003717, 204011152, 102005576, 51002788, 25501394, 12750697, 38252092, 19126046, 9563023, 28689070, 14344535, 43033606, 21516803, 64550410, 32275205, 96825616, 48412808, 24206404, 12103202, 6051601, 18154804, 9077402, 4538701, 13616104, 6808052, 3404026, 1702013, 5106040, 2553020, 1276510, 638255, 1914766, 957383, 2872150, 1436075, 4308226, 2154113, 6462340, 3231170, 1615585, 4846756, 2423378, 1211689, 3635068, 1817534, 908767, 2726302, 1363151, 4089454, 2044727, 6134182, 3067091, 9201274, 4600637, 13801912, 6900956, 3450478, 1725239, 5175718, 2587859, 7763578, 3881789, 11645368, 5822684, 2911342, 1455671, 4367014, 2183507, 6550522, 3275261, 9825784, 4912892, 2456446, 1228223, 3684670, 1842335, 5527006, 2763503, 8290510, 4145255, 12435766, 6217883, 18653650, 9326825, 27980476, 13990238, 6995119, 20985358, 10492679, 31478038, 15739019, 47217058, 23608529, 70825588, 35412794, 17706397, 53119192, 26559596, 13279798, 6639899, 19919698, 9959849, 29879548, 14939774, 7469887, 22409662, 11204831, 33614494, 16807247, 50421742, 25210871, 75632614, 37816307, 113448922, 56724461, 170173384, 85086692, 42543346, 21271673, 63815020, 31907510, 15953755, 47861266, 23930633, 71791900, 35895950, 17947975, 53843926, 26921963, 80765890, 40382945, 121148836, 60574418, 30287209, 90861628, 45430814, 22715407, 68146222, 34073111, 102219334, 51109667, 153329002, 76664501, 229993504, 114996752, 57498376, 28749188, 14374594, 7187297, 21561892, 10780946, 5390473, 16171420, 8085710, 4042855, 12128566, 6064283, 18192850, 9096425, 27289276, 13644638, 6822319, 20466958, 10233479, 30700438, 15350219, 46050658, 23025329, 69075988, 34537994, 17268997, 51806992, 25903496, 12951748, 6475874, 3237937, 9713812, 4856906, 2428453, 7285360, 3642680, 1821340, 910670, 455335, 1366006, 683003, 2049010, 1024505, 3073516, 1536758, 768379, 2305138, 1152569, 3457708, 1728854, 864427, 2593282, 1296641, 3889924, 1944962, 972481, 2917444, 1458722, 729361, 2188084, 1094042, 547021, 1641064, 820532, 410266, 205133, 615400, 307700, 153850, 76925, 230776, 115388, 57694, 28847, 86542, 43271, 129814, 64907, 194722, 97361, 292084, 146042, 73021, 219064, 109532, 54766, 27383, 82150, 41075, 123226, 61613, 184840, 92420, 46210, 23105, 69316, 34658, 17329, 51988, 25994, 12997, 38992, 19496, 9748, 4874, 2437, 7312, 3656, 1828, 914, 457, 1372, 686, 343, 1030, 515, 1546, 773, 2320, 1160, 580, 290, 145, 436, 218, 109, 328, 164, 82, 41, 124, 62, 31, 94, 47, 142, 71, 214, 107, 322, 161, 484, 242, 121, 364, 182, 91, 274, 137, 412, 206, 103, 310, 155, 466, 233, 700, 350, 175, 526, 263, 790, 395, 1186, 593, 1780, 890, 445, 1336, 668, 334, 167, 502, 251, 754, 377, 1132, 566, 283, 850, 425, 1276, 638, 319, 958, 479, 1438, 719, 2158, 1079, 3238, 1619, 4858, 2429, 7288, 3644, 1822, 911, 2734, 1367, 4102, 2051, 6154, 3077, 9232, 4616, 2308, 1154, 577, 1732, 866, 433, 1300, 650, 325, 976, 488, 244, 122, 61, 184, 92, 46, 23, 70, 35, 106, 53, 160, 80, 40, 20, 10, 5, 16, 8, 4, 2, 1]
Sparsity calculations...
Computed sequence lengths 2168611
Largest term:  56991483520
Test range:  1 1000000
Biggest gap:  4508198208
Sparsity: 0.00175%


# If True, will precompute powers of 2 and multiples of 2
# in practice this made little difference on 64-bit hardware
OPTIMISE = True

def build_sequence(n):
    """return sequence as a list given the starting number
    Uses the trail of data left by compute_sequence"""
    tmp = compute_sequence(n)
    sequence = []
    while n:
        sequence.append(n)
        n = next_num[n]
    return sequence

def compute_sequence(n):
    """lazily compute sequences for Collatz problem"""
    if n in seqlength:
        return seqlength[n]
    if n not in next_num:
        # NOTE: (some) evens are pre-computed
        next_num[n] = 3 * n + 1 if n % 2 else n // 2
    seqlength[n] = 1 + compute_sequence(next_num[n])
    return seqlength[n]

import time
start = time.time()

highest_number = int(1000000)
highest_term = highest_number * 3 + 1
highest_term += 1 if highest_term % 2 else 0

next_num = {2:1}
if OPTIMISE:
    # quickly pre-compute (some of) the evens (used for n = n//2 if n is even)
    # how many should we precompute? Any mathematicians?
    doubles = range(2, highest_term, 2)
    numbers = range(1, highest_term//2)
    next_num = dict(zip(doubles, numbers))
# mark 1 as the end-point of any sequence
next_num[1] = 0

# initialise the sequence lengths
seqlength = {}
seqlength[1] = 0
seqlength[2] = 1
if OPTIMISE:
    # powers of 2 are trivial: 2**n has sequence length n
    n = 2
    pwr = 4
    while pwr < highest_term:
        seqlength[pwr] = n
        pwr = pwr * 2
        n += 1
max_length = 0
for n in range(3, highest_number + 1):
    length = compute_sequence(n)
    if length > max_length:
        max_length = length
        winning_number = n
print ("And the winner is {0} with sequence length {1}".format(winning_number, max_length))
end = time.time()
print ("Time (s): ", (end-start))

print ("Sequence is:")
print (build_sequence(winning_number))

# Sparsity calculation
sorted_seqlengths = sorted(seqlength.keys())
print ("Sparsity calculations...")
print ("Computed sequence lengths", len(seqlength))
largest_term = sorted_seqlengths[-1]
print ("Largest term: ", largest_term)
print ("Test range: ", 1, highest_number)
gaps = (second - first for first, second in zip(sorted_seqlengths[0:-1], sorted_seqlengths[1:]))
biggest_gap = 0
for n in gaps:
    if biggest_gap < n:
        biggest_gap = n
print ("Biggest gap: ", n)
print ("Sparsity: {0:.5f}%".format(highest_number / largest_term * 100))


On Tuesday, 19 February 2013 14:01:31 UTC+11, Chris Angelico  wrote:
> On Tue, Feb 19, 2013 at 12:39 PM, John Immarino <johimm at gmail.com> wrote:
> 
> > On Monday, February 18, 2013 2:58:57 PM UTC-7, Chris Angelico wrote:
> 
> >> On Tue, Feb 19, 2013 at 8:56 AM, Chris Angelico <rosuav at gmail.com> wrote:
> 
> >>
> 
> >> > On Tue, Feb 19, 2013 at 8:55 AM, Chris Angelico <rosuav at gmail.com> wrote:
> 
> >>
> 
> >> >> How long did your BASIC version take, and how long did the Python
> 
> >>
> 
> >> >> version on the same hardware?
> 
> >>
> 
> >> >
> 
> >>
> 
> >> > Oops, my bad, you already posted the figures :) And I forgot to ask:
> 
> >>
> 
> >> > Which Python version didyou use?
> 
> >>
> 
> >> >
> 
> >>
> 
> >> > ChrisA
> 
> >>
> 
> >>
> 
> >>
> 
> >> Doh. I'm having a great day of not reading properly, today. (I blame
> 
> >>
> 
> >> checking mail on the bus, it took me over an hour to read this one
> 
> >>
> 
> >> message and I'd forgotten the subject line by the time I got to the
> 
> >>
> 
> >> end.) Python 3.3, right there in the header. Disregard me!
> 
> >>
> 
> >>
> 
> >>
> 
> >> ChrisA
> 
> >
> 
> > Thanks,Chris. I'm a newbie to Python and didn't realize that it's not as good at number crunching as some of the others. It does seem to do better than Basic with numbers in lists as opposed to arrays in Basic.
> 
> 
> 
> Yes, Python is excellent at data handling. I'll cheerfully use Python
> 
> to manipulate huge lists or arrays, and its performance at that is
> 
> usually well within the "good enough" range (for instance, anything
> 
> that manipulates the file system will be waiting on my disks, not on
> 
> Python). It's an excellent tool in the toolkit, just not the one
> 
> solution to everything. (Nothing's that!)
> 
> 
> 
> ChrisA



More information about the Python-list mailing list