I need help speeding up an app that reads football scores and generates rankings

Terry Reedy tjreedy at udel.edu
Wed May 2 12:38:17 EDT 2007


"jocknerd" <jeff.self at gmail.com> wrote in message 
news:1178118022.865173.266300 at h2g2000hsg.googlegroups.com...
| About 10 years ago, I wrote a C app that would read scores from
| football games and calculate rankings based on the outcome of the
| games.  In fact, I still use this app.  You can view my rankings at
| http://members.cox.net/jocknerd/football.
|
| A couple of years ago, I got interested in Python and decided to
| rewrite my app in Python.  I got it to work but it's painfully slow
| compared to the C app.  I have a file containing scores of over 1500
| high school football games for last season.  With my Python app, it
| takes about 3 minutes to process the rankings.  With my C app, it
| processes the rankings in less than 15 seconds.

A ratio of 12 to 1 is not bad.  However....

| The biggest difference in my two apps is the C app uses linked lists.
| I feel my Python app is doing too many lookups  which is causing the
| bottleneck.

You have to do as many lookups as you have to do, but looking up teams by 
name in a linear scan of a list is about the slowest way possible.  Replace 
'teamlist' with a dict 'teams' keyed by team name.  Replace 
'lookupTeam(team)' with 'if team not in teams: addTeam(team)' and delete the 
lookupTeam function.  Similarly, 'lookupTeamRate(team)' becomes 
'teams[team]['rate']' (and delete the function), and 
'updateTeamRate(team,rate)' becomes 'teams[team]['rate'] = rate' (and delete 
the function).  Do the same for updateTeamRating and anything else using 
teamlist.  In many places, multiple lookups in teams could be eliminated. 
For instance, 'team1 = teams[g['team1']]'.  Then use 'team1' to manipulate 
its rating and other attributes.
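
To make the dict suggestion concrete, here is a minimal sketch in current 
Python.  It is not taken from fbratings.py: the record fields ('wins', 
'losses', 'score1', 'score2'), the sample games, and the tally() helper 
are all made up for illustration.  Only 'teams', 'addTeam' and 
g['team1'] follow the names used above.

```python
# Sketch only: field names and sample data are assumptions, not the
# contents of the original fbratings.py.

def addTeam(teams, name):
    """Create a fresh record for a team the first time it is seen."""
    teams[name] = {"rate": 0.0, "wins": 0, "losses": 0}

def tally(games):
    """Build the 'teams' dict (keyed by team name) and count wins/losses."""
    teams = {}
    for g in games:
        # Membership test on a dict is a hash lookup, not a linear scan.
        for key in ("team1", "team2"):
            if g[key] not in teams:
                addTeam(teams, g[key])
        # Look each team up once, then work with the record directly.
        team1 = teams[g["team1"]]
        team2 = teams[g["team2"]]
        if g["score1"] > g["score2"]:
            team1["wins"] += 1
            team2["losses"] += 1
        elif g["score2"] > g["score1"]:
            team2["wins"] += 1
            team1["losses"] += 1
    return teams

# Hypothetical sample data.
GAMES = [
    {"team1": "Oakton", "score1": 21, "team2": "Salem", "score2": 14},
    {"team1": "Salem", "score1": 28, "team2": "Hampton", "score2": 7},
    {"team1": "Hampton", "score1": 10, "team2": "Oakton", "score2": 17},
]

if __name__ == "__main__":
    for name, rec in tally(GAMES).items():
        print(name, rec["wins"], rec["losses"])
```

With a list, each lookup costs time proportional to the number of teams; 
with a dict it is roughly constant, which matters when the lookup sits 
inside a loop over all the games.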

| You can download the source code from
| http://members.cox.net/jocknerd/downloads/fbratings.py
| and the data file from
| http://members.cox.net/jocknerd/downloads/vhsf2006.txt

Minor point.  Multiple functions do 'localvar = <expression>; return 
localvar'.   The simpler 'return <expression>' will be slightly faster. 
Your comments and function name eliminate any documentary need for the 
otherwise useless local var.

Function calls are relatively slow in Python.  So calling
def totalPtsGame (score1, score2): return score1 + score2
is slower than simply adding the scores 'in place'.
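
If you want to see the call overhead for yourself, something like the 
following should do.  It is only a sketch: time_both() and the sample 
scores are made up, and the numbers you get depend on your machine and 
Python version, so treat them as relative, not absolute.

```python
import timeit

def totalPtsGame(score1, score2):
    return score1 + score2

def time_both(number=200000):
    """Time the function call against the same addition done in place."""
    # Names the timed statements are allowed to see (sample values assumed).
    env = {"totalPtsGame": totalPtsGame, "score1": 21, "score2": 14}
    called = timeit.timeit("totalPtsGame(score1, score2)",
                           globals=env, number=number)
    inline = timeit.timeit("score1 + score2",
                           globals=env, number=number)
    return called, inline

if __name__ == "__main__":
    called, inline = time_both()
    print("function call: %.4f s" % called)
    print("in place:      %.4f s" % inline)
```

Whether the difference matters depends on how often the function is 
called; for a one-line helper called inside the innermost loop it can 
add up.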

Terry Jan Reedy


You can also use the profiler to find where the time is going. 
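
A minimal sketch of doing that with the standard cProfile and pstats 
modules follows.  The lookup_team() and process() functions are 
stand-ins I made up to imitate a linear scan over a team list; they are 
not the code from fbratings.py, and profile_report() is just a 
convenience wrapper.

```python
import cProfile
import io
import pstats

def lookup_team(teamlist, name):
    """Stand-in for a linear scan over a list of team records."""
    for team in teamlist:
        if team["name"] == name:
            return team
    return None

def process(n_teams=300, n_lookups=2000):
    """Made-up workload: many lookups by name in a list of teams."""
    teamlist = [{"name": "team%d" % i, "rate": 0.0} for i in range(n_teams)]
    found = 0
    for i in range(n_lookups):
        if lookup_team(teamlist, "team%d" % (i % n_teams)) is not None:
            found += 1
    return found

def profile_report(func, top=10):
    """Run func under the profiler and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()

if __name__ == "__main__":
    print(profile_report(process))
```

For a whole script, 'python -m cProfile -s cumulative fbratings.py' gives 
the same kind of report without touching the code.  Functions with a 
large call count and a large cumulative time are the ones worth 
rewriting first.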