[Tutor] Top posters to tutor list for 2008

Kent Johnson kent37 at tds.net
Thu Jan 1 15:34:24 CET 2009


For several years I have been using a simple script to find the top 20
posters to the tutor list by web-scraping the archive pages. I thought
others might be interested so here is the list for 2008 and the script
that generates it. The lists for previous years (back to 2003) are at
the end so everyone on the list doesn't hit the archives to find out
:-)

The script gives a simple example of datetime, urllib2 and
BeautifulSoup. It consolidates names that vary by case but other
variations are not detected.
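
For anyone who wants to try the case consolidation on its own, here is a minimal sketch in Python 3 (the script below is Python 2). The function name consolidate_by_case and the sample counts are made up for illustration; they are not taken from the archives. It groups names by their lower-case form and keeps the most frequent spelling.

```python
from collections import defaultdict

def consolidate_by_case(counts):
    """Merge names that differ only by case under the most frequent spelling."""
    # Map each lower-cased name to the spellings seen and their counts
    groups = defaultdict(dict)
    for name, count in counts.items():
        groups[name.lower()][name] = count

    merged = {}
    for variants in groups.values():
        # The spelling with the highest count wins and gets the combined total
        best = max(variants, key=variants.get)
        merged[best] = sum(variants.values())
    return merged

# Hypothetical sample data, not real post counts
sample = {'Bob Gailer': 3, 'bob gailer': 5, 'Alan Gauld': 4}
merged = consolidate_by_case(sample)
```

Names that differ in other ways (an email address instead of a name, an abbreviated name) still end up as separate entries, just as in the script.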

Alan, I thought you might have passed me this year but we are both off
a little :-) Somehow I have posted an average of 2.8 times per day for
the last four years...
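
The average can be checked from the yearly totals listed at the end of this message. A quick sketch in Python 3, assuming "the last four years" means 2005 through 2008:

```python
from datetime import date

# Kent Johnson's yearly totals, copied from the result lists in this message
yearly_posts = {2005: 1189, 2006: 913, 2007: 1052, 2008: 931}

total_posts = sum(yearly_posts.values())
# Number of days from 1 Jan 2005 up to and including 31 Dec 2008
days = (date(2009, 1, 1) - date(2005, 1, 1)).days
posts_per_day = total_posts / days
```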

Happy New Year everyone!

Kent

2008
====
Kent Johnson 931
Alan Gauld 820
bob gailer 247
Dick Moores 191
W W 142
Wayne Watson 106
John Fouhy 97
Steve Willoughby 91
Lie Ryan 88
bhaaluu 85
Marc Tompkins 83
Michael Langford 71
Tiger12506 70
Andreas Kostyrka 64
Dinesh B Vadhia 64
wesley chun 58
Tim Golden 57
Chris Fuller 54
Ricardo Aráoz 53
spir 53

#####################################

''' Counts all posts to Python-tutor by author'''
# -*- coding: latin-1 -*-
from datetime import date, timedelta
import operator, urllib2
from BeautifulSoup import BeautifulSoup

today = date.today()

for year in [2008]:
    startDate = date(year, 1, 1)
    endDate = date(year, 12, 31)
    thirtyOne = timedelta(days=31)
    counts = {}

    # Collect all the counts for a year by scraping the monthly author
    # archive pages
    while startDate < endDate and startDate < today:
        dateString = startDate.strftime('%Y-%B')

        url = 'http://mail.python.org/pipermail/tutor/%s/author.html' % dateString
        data = urllib2.urlopen(url).read()
        soup = BeautifulSoup(data)

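        # Skip the first two and last two <li> items, which are presumably
        # navigation links rather than messages
        li = soup.findAll('li')[2:-2]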

        for l in li:
            name = l.i.string.strip()
            counts[name] = counts.get(name, 0) + 1

        startDate += thirtyOne

    # Consolidate names that vary by case under the most popular spelling
    nameMap = dict() # Map lower-case name to most popular name
    for name, count in sorted(counts.iteritems(),
                              key=operator.itemgetter(1), reverse=True):
        lower = name.lower()
        if lower in nameMap:
            # Add counts for a name we have seen already, then drop the
            # less popular spelling so it is not listed a second time
            counts[nameMap[lower]] += count
            del counts[name]
        else:
            nameMap[lower] = name

    print
    print year
    print '===='
    for name, count in sorted(counts.iteritems(),
                              key=operator.itemgetter(1), reverse=True)[:20]:
        print name.encode('latin-1', 'xmlcharrefreplace'), count
    print


# Results as of 12/31/2008:
'''
2003
====
Danny Yoo 617
Alan Gauld 421
Jeff Shannon 283
Magnus Lycka 242
Bob Gailer 195
Magnus =?iso-8859-1?Q?Lyck=E5?= 166
alan.gauld at bt.com 161
Kirk Bailey 155
Gregor Lingl 152
Lloyd Kvam 142
Andrei 118
Sean 'Shaleh' Perry 117
Magnus Lyckå 113
Michael Janssen 113
Erik Price 100
Lee Harr 88
Terry Carroll 87
Daniel Ehrenberg 78
Abel Daniel 76
Charlie Clark 74


2004
====
Alan Gauld 699
Danny Yoo 530
Kent Johnson 451
Lloyd Kvam 146
Dick Moores 145
Liam Clarke 140
Brian van den Broek 122
Karl Pflästerer 109
Jacob S. 101
Andrei 99
Chad Crabtree 93
Bob Gailer 91
Magnus Lycka 91
Terry Carroll 88
Marilyn Davis 84
Gregor Lingl 73
Dave S 73
Bill Mill 71
Isr Gish 71
Lee Harr 67


2005
====
Kent Johnson 1189
Danny Yoo 767
Alan Gauld 565
Alan G 317
Liam Clarke 298
Max Noel 203
Nathan Pinno 197
Brian van den Broek 190
Jacob S. 154
jfouhy at paradise.net.nz 135
Alberto Troiano 128
Bernard Lebel 119
Joseph Quigley 101
Terry Carroll 93
Andrei 79
D. Hartley 77
John Fouhy 73
bob 73
Hugo González Monteverde 72
Orri Ganel 69


2006
====
Kent Johnson 913
Alan Gauld 815
Danny Yoo 448
Luke Paireepinart 242
John Fouhy 187
Chris Hengge 166
Bob Gailer 134
Dick Moores 129
Asrarahmed Kadri 119
Terry Carroll 111
Python 94
Mike Hansen 74
Liam Clarke 72
Carroll, Barry 67
Kermit Rose 66
anil maran 66
Hugo González Monteverde 65
wesley chun 63
Christopher Spears 53
Michael Lange 51

2007
====
Kent Johnson 1052
Alan Gauld 938
Luke Paireepinart 260
Dick Moores 203
Eric Brunson 164
Terry Carroll 128
Tiger12506 112
John Fouhy 105
Bob Gailer 97
Ricardo Aráoz 93
Rikard Bosnjakovic 93
bhaaluu 88
elis aeris 83
Andreas Kostyrka 77
Michael Langford 68
shawn bright 63
Tim Golden 62
Dave Kuhlman 62
wormwood_3 53
wesley chun 53

2008
====
Kent Johnson 931
Alan Gauld 820
bob gailer 247
Dick Moores 191
W W 142
Wayne Watson 106
John Fouhy 97
Steve Willoughby 91
Lie Ryan 88
bhaaluu 85
Marc Tompkins 83
Michael Langford 71
Tiger12506 70
Andreas Kostyrka 64
Dinesh B Vadhia 64
wesley chun 58
Tim Golden 57
Chris Fuller 54
Ricardo Aráoz 53
spir 53

'''
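
For anyone without BeautifulSoup or running Python 3, the name-extraction step of the script can be sketched with only the standard library. This swaps in html.parser for BeautifulSoup and is not what the script above does. The class name AuthorParser and the SAMPLE_PAGE text are made up for illustration; the sample is a simplified guess at the shape of an author index page, where each message entry carries the poster's name in an <i> element. Because only <i> text is collected, no slicing of the <li> list is needed.

```python
from collections import Counter
from html.parser import HTMLParser

class AuthorParser(HTMLParser):
    """Collect the stripped text of every <i> element on a page."""

    def __init__(self):
        super().__init__()
        self._in_italic = False
        self._chunks = []
        self.authors = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case
        if tag == 'i':
            self._in_italic = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_italic:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == 'i' and self._in_italic:
            self._in_italic = False
            name = ''.join(self._chunks).strip()
            if name:
                self.authors.append(name)

# Hypothetical, simplified stand-in for one monthly author page
SAMPLE_PAGE = """
<ul>
<LI><A HREF="000001.html">[Tutor] First subject
</A><A NAME="1">&nbsp;</A>
<I>Kent Johnson
</I>
<LI><A HREF="000002.html">[Tutor] Second subject
</A><A NAME="2">&nbsp;</A>
<I>Alan Gauld
</I>
<LI><A HREF="000003.html">[Tutor] Re: First subject
</A><A NAME="3">&nbsp;</A>
<I>Kent Johnson
</I>
</ul>
"""

parser = AuthorParser()
parser.feed(SAMPLE_PAGE)
parser.close()

# One post per name occurrence, as in the script's counts dictionary
counts = Counter(parser.authors)
```

To use this on a real page, fetch the text with urllib.request.urlopen(url).read() and decode it before calling feed().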

