[Tutor] Python and Speed

Mon, 16 Apr 2001 18:48:16 -0400

On Tue, Apr 17, 2001 at 08:11:15AM +1000, Arthur Watts wrote:
| Guys,
| 
|          I noticed that the Advocacy thread contained reference to the
| question of Perl's assumed speed advantage over Python. As someone who
| championed Python's acceptance for use in implementing a replacement for the
| shell scripts which support our bread-and-butter application, I was keen to
| quantify the difference. I was particularly keen to see how well Python went
| with a common shell operation - reading in a file, sorting it and then
| writing the new file.
...
| 	3. I tried to use the same algorithmic approach for each language,
| and relied on the default 'quicksort' approach. 

I think that this is a 'defect' in your benchmark.  I have seen
several examples of perl (or other) programmers on c.l.py saying that
Python is horribly slow in comparison to the other language.

Then the gurus on c.l.py reworked their python code using more
pythonic idioms and data structures and the speed became comparable.
I think that for the benchmark to be "better" (by some definition of
better <wink>) the most natural approach (algorithm, data structures)
for each language must be used, even though those approaches are almost
certainly quite different.
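
To make that concrete, here is a minimal sketch of what I'd call the
natural Python approach to the read/sort/write task.  It is not your
benchmark code (I'm only guessing at its shape), the name sort_stream
is made up for illustration, and it assumes a current Python where the
io module is available for the in-memory demo:

```python
import io


def sort_stream(src, dst):
    """Read all lines from src, sort them, and write them to dst."""
    lines = src.readlines()   # whole file in memory, newlines kept
    lines.sort()              # built-in sort, plain string comparison
    dst.writelines(lines)     # lines already carry their newlines


# Demo with in-memory files standing in for real ones.
src = io.StringIO("pear\napple\nfig\n")
dst = io.StringIO()
sort_stream(src, dst)
```

With real files you would pass the objects returned by open() instead.
The point is only that the built-in list sort does the heavy lifting;
there is no hand-written sort loop to compare against perl's.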

...
| 	For me, the big win was not in any perceived speed advantage : it
| was the fact that I achieved my goal with 32 lines of Python (over 45 lines
| of Perl and way too many lines of C...). Here are the comparisons between
| the Perl sort code and the Python sort code :

<grin>  That's the goal and the win.

...
| 	Finally, I would like this to provide the catalyst for the Perl and
| Python communities to each agree on a set of common tasks which we could use
| for benchmark testing each new release of these interpreters. I know that
| Guido has never denied that Python is slower, but I'd like to be able to
| quantify it against something a little less trivial than my own manufactured
| example.

I don't really agree that a "standard" benchmark would be a real
advantage, because developers then tend to optimize for the
benchmark.  Sure, the numbers on the benchmark may look great,
but what about the rest of the uses?

The following perl is faster than the python below it :

(barring any careless mistakes;  yes, I know the python version will
add an extra newline to every line;  I'm not sure about the perl
version)

for (<STDIN>)
{
    print ;
}

import sys

for line in sys.stdin.readlines():
    print line

because (according to Tim Peters, and others) perl heavily
optimizes file io at the cost of portability (supposedly they toy with
the C FILE structs directly rather than use the C API) while python
focuses more on being portable and thread-safe.  Another difference is
that readlines sucks everything into RAM at once;  xreadlines (in
>=2.1) would be better.
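
A rough sketch of the line-at-a-time version, for comparison.  This
assumes a Python newer than the ones discussed here, where iterating
over the file object directly reads lazily in the way xreadlines does;
the function name copy_lines and its default arguments are made up for
illustration:

```python
import io
import sys


def copy_lines(src=None, dst=None):
    """Copy src to dst one line at a time, without loading it all."""
    src = sys.stdin if src is None else src
    dst = sys.stdout if dst is None else dst
    for line in src:        # lazy: one line in memory at a time
        dst.write(line)     # write() adds no extra newline


# Demo with in-memory files; call copy_lines() with no arguments
# to filter stdin to stdout instead.
demo_in = io.StringIO("first\nsecond\n")
demo_out = io.StringIO()
copy_lines(demo_in, demo_out)
```

Using write() rather than print also takes care of the doubled
newline in my example above.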

For another example, see the archives regarding counting words in a
file.  I posted a perl sample (didn't make it to the list) that I
wrote for a class,  then later translated to python.  The perl version
was (insignificantly, IMO) faster than the python version.  Both
produced identical results.  I posted several observations about the
testbed and the python code.  The python version is on the Useless
pages.  I did have ~3 bugs in the perl version, but I don't remember
what they were (the prof said so when I got my grade).  The python
version probably has the same 3 errors <wink>.

I rewrote that script and measured the times in response to someone
having trouble with performance of his solution to a similar problem.
The original post used several classes and nested linear traversals of
lists.  He said it took over 15 minutes (or so) for his version to
count a 2M (or something) file.  The version I posted used different
data structures and took less than 2 minutes (at most, I don't
remember exactly now).
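
To illustrate the kind of difference I mean (this is neither his code
nor the version I posted, just a sketch with made-up function names):
counting with a linear search through a list does work proportional to
the number of distinct words for every word read, while a dictionary
lookup takes roughly constant time.

```python
def count_words_slow(words):
    """Count words with a list of [word, count] pairs (linear search)."""
    counts = []
    for word in words:
        for entry in counts:          # scans every distinct word so far
            if entry[0] == word:
                entry[1] += 1
                break
        else:                         # no break: word not seen before
            counts.append([word, 1])
    return counts


def count_words(words):
    """Count words with a dictionary (one hash lookup per word)."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


sample = "the cat and the hat and the bat".split()
fast = count_words(sample)
slow = count_words_slow(sample)
```

Both give the same counts; only the data structure changes, and on a
large file that is where the minutes go.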

-D