Python 3.3 vs. MSDOS Basic

Mon Feb 18 16:55:21 EST 2013

On Tue, Feb 19, 2013 at 6:13 AM, John Immarino <johimm at gmail.com> wrote:
> I coded a Python solution for Problem #14 on the Project Euler website. I was very surprised to find that it took 107 sec. to run even though it's a pretty simple program.  I also coded an equivalent solution for the problem in the old MSDOS basic. (That's the 16 bit app of 1980s vintage.)  It ran in 56 sec. Is there a flaw in my coding, or is Python really this slow in this particular application. MSDOS Basic usually runs at a snails pace compared to Python.

BASIC does a lot less. If you wrote an 8086 assembly language
interpreter in Python, it'd run fairly slowly too :) Python isn't
really the world's best language for number crunching inside a machine
word; though if this were a major project, I would recommend looking
into Cython, as it lets you translate a few critical portions of your
code to C while leaving the rest in Python.

In order to get some useful stats, I added a little timing code to
your original; on my Windows XP laptop, running Python 3.3, your
version took 212.64 seconds to get to a result (namely, 837799 with a
count of 524).

Here's how I'd code it:

import time
start=time.time()
max=0
for m in range(1,1000001):
	n=m
	count=0
	while n>1:
		if n%2: n=3*n+1
		else: n//=2
		count+=1
	if count>max: max,num=count,m
	if not m&16383: print("->",m,count)
print(num,max)
print(time.time()-start)

(You'll see the same timing information that I added to yours. It adds
immeasurably to the run-time, and gives some early idea of how it's
going.)

Running under Python 2.6, both your version and mine take about 90
seconds to run. But under Python 3.3, where (among other things)
range() yields values lazily, my version is significantly faster than
yours. BUT! Both versions, under 3.3, are significantly *slower* than
under 2.6. My first thought is that it's because Py2 has different
types for 'int' and 'long', and Py3 doesn't (effectively, everything's
a long), so I added an L suffix to every number and ran each of them
under 2.6 again. Seems that was the bulk of the difference, though not
all.

Pythonistas, does this count as a regression, or is Python
sufficiently "not a number crunching language" that we don't care?

(range = my code, as above; while = original version with a C-style
loop counter)
range py3: 171.07846403121948
while py3: 212.64104509353638
range py2: 87.859000206
while py2: 86.4059998989
range py2 longs: 190.530999899
while py2 longs: 176.125999928

For comparison purposes, I also coded up the equivalent in Pike.
Pike's a very similar language to Python, but with a C-like syntax,
and certain optimizations - including, significantly to this exercise,
an integer type that sits within a machine word if it can (though
it'll happily go arbitrary precision when it's needed to). It pretends
to the programmer that it's a Py3-style "everything's an int", but
underneath, functions more like Py2 with separate short and long
types. The result: 22.649 seconds to reach the same conclusion.

How long did your BASIC version take, and how long did the Python
version on the same hardware?

This sort of pure number crunching isn't really where a modern high
level language shines. You'll come to *really* appreciate Python as
soon as you start working with huge arrays, dictionaries, etc. This is
a job for C, really.

ChrisA