code for Computer Language Shootout

Wed Mar 16 21:07:09 EST 2005

On Wed, 16 Mar 2005 16:45:53 -0800, bearophileHUGS wrote:

> Michael Spencer's version is nice; this is a slightly shortened version. The
> main() isn't useful for such a short loop, and you can use shorter
> variable names to make lines shorter (this code isn't very readable,
> it's just for the Shootout; "production quality" code probably has to
> be more readable. Code produced by a lot of people in a newsgroup isn't
> the normal code usually produced by a single programmer in a limited
> amount of time).
> I've used file(sys.argv[1]) instead of sys.stdin.
> 

I don't see what advantage shorter variable names give you. IIRC, the
Shootout counts logical lines of code rather than physical lines.
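
To illustrate the difference between logical and physical lines, here is a
rough sketch in present-day Python 3 using the standard tokenize module. I
have not checked how the Shootout actually counts lines, so treat this as an
assumption about what "logical" means; count_lines is a name made up for the
example.

```python
import io
import tokenize

def count_lines(source):
    """Return (physical, logical) line counts for a Python source string."""
    # Physical lines: non-blank lines as they appear in the file.
    physical = sum(1 for line in source.splitlines() if line.strip())
    # Logical lines: the tokenizer emits one NEWLINE token per completed
    # statement. Line breaks inside brackets produce NL tokens instead,
    # which are not counted here.
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    logical = sum(1 for tok in tokens if tok.type == tokenize.NEWLINE)
    return physical, logical

# One statement wrapped over two physical lines, plus one more statement.
wrapped = "total = (1 +\n         2)\nprint(total)\n"
print(count_lines(wrapped))
```

Under that definition, a statement wrapped inside parentheses still counts as
one logical line, so shortening names to avoid a wrap would not change the
count.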

> 
> . import string, itertools, sys
> .
> . t = string.maketrans('ACBDGHKMNSRUTWVYacbdghkmnsrutwvy',
> .                      'TGVHCDMKNSYAAWBRTGVHCDMKNSYAAWBR')
> .
> . for h,b in itertools.groupby( file(sys.argv[1]),
> .                               lambda x: x[0] in ">;" ):
> .     if h:
> .         print "".join(b),
> .     else:
> .         b = "".join(b).translate(t, "\n\r")
> .         print "\n".join( b[-i:-i-60:-1]
> .                          for i in xrange(1, len(b), 60) )
> 

I benchmarked this, btw - it ran in the same amount of time as the other
solution. It does have the advantage of being significantly fewer lines of
code; I suppose that itertools.groupby, while unexpected to someone from a
language without such niceties in the standard library =), is a better
solution than duplicating the code (or the function call) to translate,
reverse, and format the string.
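
For anyone unfamiliar with itertools.groupby, here is a slightly expanded
sketch of the same idea in present-day Python 3 (the quoted code is Python 2).
The translation table is copied from the quoted code; the function name,
the width parameter, and the generator structure are my own choices for
illustration, not what would be submitted to the Shootout.

```python
import itertools

# Complement table copied from the quoted code (IUPAC codes, both cases).
TABLE = str.maketrans('ACBDGHKMNSRUTWVYacbdghkmnsrutwvy',
                      'TGVHCDMKNSYAAWBRTGVHCDMKNSYAAWBR')

def reverse_complement(lines, width=60):
    """Yield output lines (without newlines) for FASTA-style input lines."""
    def is_header(line):
        return line.startswith((">", ";"))

    # groupby splits the input into alternating runs of header lines and
    # sequence lines, so the translate/reverse/format step appears once.
    for header, group in itertools.groupby(lines, is_header):
        if header:
            for line in group:
                yield line.rstrip("\r\n")
        else:
            seq = "".join(line.strip() for line in group)
            rc = seq.translate(TABLE)[::-1]
            for i in range(0, len(rc), width):
                yield rc[i:i + width]

if __name__ == "__main__":
    sample = [">seq1\n", "ACGT\n", "AAC\n"]
    print("\n".join(reverse_complement(sample, width=4)))
```

Because groupby hands each run of sequence lines over as one group, there is
no need for a separate "flush the last sequence" call after the loop.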

> ----------------------
> 
> The Python Mandelbrot program seems to produce a wrong image:
> 
> http://shootout.alioth.debian.org/benchmark.php?test=mandelbrot&lang=python&id=0&sort=fullcpu
> 

It's my understanding that they run an automated diff against the expected
output. So presumably it's generating correct output, or it would be listed
as "Error". I haven't actually checked this, so who knows.

> ----------------------
> 
> This is a shorter and faster version of wordfreq:
> http://shootout.alioth.debian.org/benchmark.php?test=wordfreq&lang=python&id=0&sort=fullcpu
> 
> . import string, sys
> .
> . def main():
> .     f = {}
> .     t = (" "*65 + string.ascii_lowercase + " "*6 +
> .          string.ascii_lowercase + " "*133)
> .
> .     afilerl = file(sys.argv[1]).readlines
> .     lines = afilerl(4095)
> .     while lines:
> .         for line in lines:
> .             for w in line.translate(t).split():
> .                 if w in f: f[w] += 1
> .                 else: f[w] = 1
> .         lines = afilerl(4095)
> .
> .     l = sorted( zip(f.itervalues(), f.iterkeys()), reverse=True)
> .     print "\n".join("%7s %s" % (f,w) for f,w in l)
> .
> . main()
> 

Cool. I haven't looked at this one, but why don't you test it against
their sample data, diff it to make sure the output is identical, and send
it to their mailing list :-).
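
If you don't have diff(1) handy, something like the following Python 3
sketch with the standard difflib module would do for the comparison.
output_diff is a made-up helper name, and in practice the two strings would
be read from the Shootout's expected-output file and from your program's
captured output.

```python
import difflib

def output_diff(expected_text, actual_text):
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(difflib.unified_diff(
        expected_text.splitlines(keepends=True),
        actual_text.splitlines(keepends=True),
        fromfile="expected", tofile="actual"))

if __name__ == "__main__":
    # Tiny made-up strings standing in for the two output files.
    diff = output_diff("      2 the\n      1 cat\n",
                       "      2 the\n      1 dog\n")
    print("".join(diff) if diff else "outputs are identical")
```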

> ----------------------
> 
> This is my shorter and faster version of Harmonic (I hope the use of
> sum instead of the for is okay for the Shootout rules):
> http://shootout.alioth.debian.org/benchmark.php?test=harmonic&lang=python&id=0&sort=fullcpu
> 
> import sys
> print sum( 1.0/i for i in xrange(1, 1+int(sys.argv[1]) ) )
> 

Yes, the current Python version is an embarrassment. I was already
planning to send in the one-character patch (s/range/xrange/) when
submitting the reverse-complement code. This version is probably more
efficient than an explicit loop, although I doubt the difference is by
much. I suppose an ounce of profiling is worth a pound of speculation...
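
For the record, the ounce of profiling could look roughly like this in
present-day Python 3 with the standard timeit module (range here plays the
role of Python 2's xrange). The function names and the value of n are
arbitrary choices for the sketch, and I make no claim about which version
wins on any given machine.

```python
import timeit

def harmonic_sum(n):
    # Generator expression fed to the built-in sum(), as in the quoted code.
    return sum(1.0 / i for i in range(1, n + 1))

def harmonic_loop(n):
    # Explicit accumulation loop, for comparison.
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / i
    return total

if __name__ == "__main__":
    n = 100000
    for fn in (harmonic_sum, harmonic_loop):
        seconds = timeit.timeit(lambda: fn(n), number=10)
        print("%-14s %.3f s" % (fn.__name__, seconds))
```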

-- 
Jacob Lee
jelee2 at uiuc.edu | www.nearestneighbor.net