Perl and Python, a practical side-by-side example.

Shawn Milo Shawn at Milochik.com
Fri Mar 2 17:44:46 EST 2007


I'm new to Python and fairly experienced in Perl, although that
experience is limited to the things I use daily.

I wrote the same script in both Perl and Python, and the output is
identical. The run speed is similar (very fast) and the line count is
similar.

Now that they're both working, I was looking at the code and wondering
what Perl-specific and Python-specific improvements to the code would
look like, as judged by others more knowledgeable in the individual
languages.

I am not looking for the smallest number of lines, or anything else
that would make the code more difficult to read in six months. Just
any instances where I'm doing something inefficiently or in a "bad"
way.

I'm attaching both the Perl and Python versions, and I'm open to
comments on either. The script reads tab-delimited records from
standard input and finds the best record for each unique ID (piid,
field 0). The best record is the one with the newest expiration date
(field 5) among the records whose state (field 1) matches the desired
state (field 6). If no record matches the desired state, it is simply
the record with the newest expiration date.
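
To make the rule concrete, here is a minimal sketch of the selection
written as a sort key. It is an illustration only and is not part of
either attached script: the names `rank` and `best_record` and the
sample rows are hypothetical, and it assumes every row has at least
seven tab-separated fields and that expiration dates compare correctly
as strings (for example YYYYMMDD).

```python
def rank(row):
    """Sort key: a state match outranks any date; the date breaks ties."""
    fields = row.split("\t")
    state, expires, wanted = fields[1], fields[5], fields[6]
    # Tuples compare element by element, and False < True, so any row
    # whose state matches the desired state beats every row that does not.
    return (state == wanted, expires)


def best_record(rows):
    """Return the highest-ranked row; on a tie, max() keeps the first one."""
    return max(rows, key=rank)


# Hypothetical sample data for a single piid.
rows = [
    "42\tNY\ta\tb\tc\t20070101\tNJ",  # newest date, wrong state
    "42\tNJ\ta\tb\tc\t20060101\tNJ",  # right state, older date
    "42\tNJ\ta\tb\tc\t20061231\tNJ",  # right state, newest among matches
]
print(best_record(rows))  # prints the third row
```

If no row matches the desired state, the first element of every key is
False and the comparison falls through to the expiration date alone.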

Thanks for taking the time to look at these.

Shawn

##########################################################################
Perl code:
##########################################################################
#! /usr/bin/env perl

use warnings;
use strict;

my $piid;
my $row;
my %input;
my $best;
my $curr;

foreach $row (<>){

	chomp($row);
	$piid = (split(/\t/, $row))[0];

	push ( @{$input{$piid}}, $row );
}

for $piid (keys(%input)){

	$best = "";

	for $curr (@{$input{$piid}}){
		if ($best eq ""){
			$best = $curr;
		}else{
			#If the current record is the correct state

			if ((split(/\t/, $curr))[1] eq (split(/\t/, $curr))[6]){
				#If existing record is the correct state
				if ((split(/\t/, $best))[1] eq (split(/\t/, $curr))[6]){
					if ((split(/\t/, $curr))[5] gt (split(/\t/, $best))[5]){
						$best = $curr;
					}
				}else{
					$best = $curr;
				}
			}else{
				#if the existing record does not have the correct state
				#and the new one has a newer expiration date
				if (((split(/\t/, $best))[1] ne (split(/\t/, $curr))[6]) and
					((split(/\t/, $curr))[5] gt (split(/\t/, $best))[5])){
					$best = $curr;
				}
			}
		}


	}
	print "$best\n";
}

##########################################################################
End Perl code
##########################################################################






##########################################################################
Python code
##########################################################################

#! /usr/bin/env python

import sys

input = sys.stdin

recs = {}

for row in input:
	row = row.rstrip('\n')
	piid = row.split('\t')[0]
	if piid not in recs:
		recs[piid] = []
	recs[piid].append(row)

for piid in recs.keys():
	best = ""
	for current in recs[piid]:
		if best == "":
			best = current
		else:
			#If the current record is the correct state
			if current.split("\t")[1] == current.split("\t")[6]:
				#If the existing record is the correct state
				if best.split("\t")[1] == best.split("\t")[6]:
					#If the new record has a newer exp. date
					if current.split("\t")[5] > best.split("\t")[5]:
						best = current
				else:
					best = current
			else:
				#If the existing record does not have the correct state
				#and the new record has a newer exp. date
				if (best.split("\t")[1] != best.split("\t")[6] and
						current.split("\t")[5] > best.split("\t")[5]):
					best = current
			
	print(best)


##########################################################################
End Python code
##########################################################################


