I need help speeding up an app that reads football scores and generates rankings

Arnaud Delobelle arnodel at googlemail.com
Wed May 2 12:49:38 EDT 2007


On May 2, 4:00 pm, jocknerd <jeff.s... at gmail.com> wrote:
> About 10 years ago, I wrote a C app that would read scores from
> football games and calculate rankings based on the outcome of the
> games.  In fact, I still use this app.  You can view my rankings at
> http://members.cox.net/jocknerd/football.
>
> A couple of years ago, I got interested in Python and decided to
> rewrite my app in Python.  I got it to work but it's painfully slow
> compared to the C app.  I have a file containing scores of over 1500
> high school football games for last season.  With my Python app, it
> takes about 3 minutes to process the rankings.  With my C app, it
> processes the rankings in less than 15 seconds.
>
> The biggest difference in my two apps is the C app uses linked lists.
> I feel my Python app is doing too many lookups, which is causing the
> bottleneck.
>
> I'd love some feedback regarding how I can improve the app.  I'd like
> to drop the C app eventually.  It's really ugly.  My goal is to
> eventually get the data stored in PostgreSQL and then have a Django
> powered site to process and display my rankings.
>
> You can download the source code from http://members.cox.net/jocknerd/downloads/fbratings.py
> and the data file from http://members.cox.net/jocknerd/downloads/vhsf2006.txt
>
> Thanks!

A simple improvement is to change your list of teams ('teamlist') to a
dictionary of teams (call it, say, 'teamdict') mapping team names to
teams.

You have lots of loops like this:
    # Some code
    for row in teamlist:
        if teamname == row['name']:
            # Do something with row

These can all be replaced with:
    # Some code
    row = teamdict[teamname]
    # Do something with row

(Although I wouldn't call it 'row' but rather 'team'.)

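That may speed up your code significantly: each of those loops scans
the whole list, whereas a dictionary lookup takes roughly constant time
however many teams there are.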
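
To make the list-versus-dictionary change concrete, here is a rough
sketch.  The team records, the 'rating' field and the two helper
functions are made up for illustration; I am assuming each team is a
dictionary with a 'name' key, as the loop above suggests, but the real
structures in fbratings.py may well differ.

```python
# Hypothetical team records; the real fields in fbratings.py may differ.
teamlist = [
    {'name': 'Lions', 'rating': 50.0},
    {'name': 'Tigers', 'rating': 48.0},
    {'name': 'Bears', 'rating': 52.0},
]

# Build the dictionary once: team name -> team record.
teamdict = dict((team['name'], team) for team in teamlist)


def find_team_linear(teamname):
    """The list pattern: scan every team until the name matches."""
    for row in teamlist:
        if teamname == row['name']:
            return row
    return None


def find_team_dict(teamname):
    """The dictionary pattern: a single hash lookup."""
    return teamdict[teamname]


# Both structures hold the same record objects, so an update made
# through the dictionary is visible through the list as well.
team = find_team_dict('Tigers')
team['rating'] += 1.5
```

Note that teamdict[teamname] raises KeyError for an unknown name,
whereas the loop silently does nothing; teamdict.get(teamname) returns
None instead if that is the behaviour you want.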

Moreover you can make the main loop (in calcTeamRatings) faster by
avoiding looking up a team each time you need some info on it.
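
One way to do that, sketched under assumptions: resolve each team name
to its team record once, before the rating loop starts, so that the
loop body works on direct references.  The names raw_schedule,
calc_team_ratings, iterations and step are hypothetical, and the update
rule below is only a placeholder, not the formula your code uses.

```python
# Hypothetical data; the real records in fbratings.py may differ.
teamdict = {
    'Lions': {'name': 'Lions', 'rating': 50.0},
    'Tigers': {'name': 'Tigers', 'rating': 48.0},
}

# Schedule as it might be read from the file: team names only.
raw_schedule = [('Lions', 'Tigers', 0.6)]

# Resolve every name to its team record once, up front.
schedule = [(teamdict[name1], teamdict[name2], ratio)
            for name1, name2, ratio in raw_schedule]


def calc_team_ratings(schedule, iterations=10, step=0.1):
    """Repeatedly adjust ratings; no name lookups inside the loop."""
    for _ in range(iterations):
        for team1, team2, ratio in schedule:
            # Placeholder update rule, NOT the one from fbratings.py.
            adjustment = step * (ratio - 0.5)
            team1['rating'] += adjustment
            team2['rating'] -= adjustment


calc_team_ratings(schedule)
```

The lookups then happen once per game rather than once per game per
iteration.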

Finally, I would change your schedule list to a list of tuples rather
than a list of dictionaries: each game in the schedule would be a
tuple (team1, team2, ratio) and wouldn't include the actual team
scores, as you don't seem to use them in your calcTeamRatings function
(that means moving the ratio calculation into the loop that creates
the schedule).
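
A minimal sketch of building such a schedule follows.  The layout of
the parsed games and the helper score_ratio are hypothetical, and the
definition of 'ratio' used here (team1's share of the total points) is
only an assumption; substitute whatever your code actually computes.

```python
# Hypothetical parsed games: (team1, score1, team2, score2).
games = [
    ('Lions', 21, 'Tigers', 14),
    ('Bears', 0, 'Lions', 0),
]


def score_ratio(score1, score2):
    """Assumed ratio: team1's share of all points scored in the game."""
    total = score1 + score2
    if total == 0:
        return 0.5  # scoreless tie: treat the teams as even
    return score1 / float(total)


# Compute the ratio once per game, while the schedule is being built;
# the raw scores are not stored.
schedule = [(team1, team2, score_ratio(score1, score2))
            for team1, score1, team2, score2 in games]

# The rating loop can then unpack each game directly.
for team1, team2, ratio in schedule:
    print(team1, team2, ratio)
```

Unpacking a tuple in the for statement avoids several dictionary
lookups per game, and the ratio is computed once instead of on every
pass.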

Disclaimer: I only looked at your code superficially and I don't claim
to understand it!

HTH

--
Arnaud



