python vs perl performance test

Chris Mellon arkanes at gmail.com
Thu Dec 13 13:54:44 EST 2007


On Dec 13, 2007 12:11 PM,  <igor.tatarinov at gmail.com> wrote:
> First, let me admit that the test is pretty dumb (someone else
> suggested it :) but since I am new to Python, I am using it to learn
> how to write efficient code.
>
> my $sum = 0;
> foreach (1..100000) {
>     my $str = chr(rand(128)) x 1024;
>     foreach (1..100) {
>         my $substr = substr($str, rand(900), rand(100));
>         $sum += ord($substr);
>     }
> }
> print "Sum is $sum\n";
>
> Basically, the script creates random strings, then extracts random
> substrings, and adds up the first characters in each substring. This
> perl script takes 8.3 secs on my box and it can probably be improved.
>
> When I first translated it to Python verbatim, the Python script took
> almost 30 secs to run.
> So far, the best I can do is 11.2 secs using this:
>
> from random import randrange
> from itertools import imap, repeat
> from operator import getitem, add, getslice
>
> result = 0
> zeros = [0]*100
> for i in xrange (100000):
>     s = [chr(randrange(128))] * 1024
>     starts = repeat(randrange(900), 100)
>     ends = imap(add, starts, repeat(randrange(1,100), 100))
>     substrs = imap(getslice, repeat(s, 100), starts, ends)
>     result += sum(imap(ord, imap(getitem, substrs, zeros)))
>
> print "Sum is ", result
>
> There's got to be a simpler and more efficient way to do this.
> Can you help?

Benchmarking is usually done to test the speed of an operation. What
are you trying to measure with this test? String slicing? Numerical
operations? Looping? You're doing all sorts of bogus work for no
reason. The use of randomness is also totally arbitrary and doesn't do
anything except make it harder to confirm the correctness of the test.
Is the ability to generate 0-length substrings (for which Perl's ord
returns 0) intentional?
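
To make the zero-length case concrete: Perl's ord returns 0 for an
empty string, while Python's ord raises TypeError, so a verbatim
translation that can produce empty slices needs an explicit guard. A
minimal sketch follows; perl_ord is a hypothetical helper name, not
something from either script.

```python
def perl_ord(s):
    """Return the ordinal of the first character of s, or 0 if s is empty.

    Hypothetical helper that mimics Perl's ord() on an empty string.
    """
    return ord(s[0]) if s else 0


# Python's built-in ord() rejects an empty string outright.
try:
    ord("")
    empty_is_error = False
except TypeError:
    empty_is_error = True
```

The original poster's Python version sidesteps this by drawing the
length from randrange(1, 100), which means it is not measuring quite
the same thing as the Perl script.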

For the record, taking the randomness out brings this down to 4
seconds for Perl and 6 seconds for a literal translation in Python.
Moving the Python code into a function drops it to 3.5 seconds, since
names inside a function are locals, which CPython looks up faster than
module-level globals.
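
For illustration, here is a rough sketch of what a deterministic,
function-wrapped version might look like. It is not the code that was
actually timed: the function name, the fixed character, and the fixed
start and length are all assumptions, and it is written for Python 3
(range rather than xrange).

```python
import time


def substring_sum(outer=100000, inner=100, start=450, length=50):
    """Sum the ordinal of the first character of a fixed slice.

    All parameter values are illustrative assumptions; only the loop
    counts (100000 and 100) come from the original scripts.
    """
    total = 0
    for _ in range(outer):
        s = chr(64) * 1024  # fixed character instead of a random one
        for _ in range(inner):
            sub = s[start:start + length]  # fixed, never-empty slice
            total += ord(sub[0])
    return total


if __name__ == "__main__":
    # Reduced outer count so the demo finishes quickly.
    t0 = time.perf_counter()
    result = substring_sum(outer=1000)
    print("Sum is", result, "in %.3f s" % (time.perf_counter() - t0))
```

Because the slice bounds are fixed, the result is known in advance
(outer * inner * 64 here), which makes it easy to confirm that both
language versions compute the same thing before comparing timings.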


