Using the nntplib module to count Google Groups users

Zero Piraeus z at etiol.net
Sun Oct 27 02:37:29 EDT 2013


On Sun, Oct 27, 2013 at 03:35:40PM +1100, Chris Angelico wrote:
> On Sun, Oct 27, 2013 at 2:32 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
> > If anyone wants to modify the script to determine the ratio of posters,
> > rather than posts, using GG, be my guest.
> 
> And if anyone does, do please post the result on-list.

Taking a different tack, since I happen to have a complete[1] local
archive of python-list going back a few years ... here's a quick and
dirty script to count unique senders and Google Groups users for this
year:

 - - -

import os
from email.parser import HeaderParser

LIST = "python-list@python.org"
MAILDIR = "/path/to/mail/archive/cur"
YEAR = "2013"

parser = HeaderParser()

found = set()   # distinct From headers seen so far
gg_users = 0    # senders whose first counted post came via Google Groups

for filename in os.listdir(MAILDIR):
    with open(os.path.join(MAILDIR, filename)) as message:
        headers = parser.parse(message)
        sender = headers.get("from", "")
        dest = headers.get("to", "")
        date = headers.get("date", "")
        # Skip mail not sent to the list, mail from other years, and
        # senders that have already been counted.
        if (LIST not in dest) or (YEAR not in date) or (sender in found):
            continue
        found.add(sender)
        # Google Groups posts carry this abuse address in Complaints-To.
        if "groups-abuse@google.com" in headers.get("complaints-to", ""):
            gg_users += 1
            print("GG user:")
        # Running totals, printed once per new sender.
        print(sender)
        print("Senders: %d" % len(found))
        print("GG users: %d" % gg_users)
        print("---")

 - - -

It's obviously not very robust, but I reckon it's good enough to get an
idea of what's going on.
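
One source of fragility is that senders are matched on the whole From
header, so the same address under two display names is counted twice.
A minimal sketch of one way to tighten that, keying on the bare address
via email.utils.parseaddr, might look like the following. The
count_senders helper and the example.com/example.org sample messages
are hypothetical, the list and year filters are left out for brevity,
and unlike the script above it counts a sender as a GG user if any of
their posts came through Google Groups, not just the first one seen.

```python
from email.parser import HeaderParser
from email.utils import parseaddr

# Abuse address assumed to appear in the Complaints-To header of GG posts.
GG_MARKER = "groups-abuse@google.com"


def count_senders(raw_messages):
    """Return (unique_senders, gg_users) for an iterable of raw messages.

    Senders are keyed on the lower-cased address from the From header,
    so one address under several display names is counted once.
    """
    parser = HeaderParser()
    found = set()
    gg_users = set()
    for raw in raw_messages:
        headers = parser.parsestr(raw)
        address = parseaddr(headers.get("from", ""))[1].lower()
        if not address:
            continue  # no usable From header
        found.add(address)
        if GG_MARKER in headers.get("complaints-to", ""):
            gg_users.add(address)
    return len(found), len(gg_users)


# Hypothetical sample data, not taken from the real archive.
sample = [
    "From: Alice <alice@example.com>\n"
    "Complaints-To: groups-abuse@google.com\n\nhi\n",
    'From: "A. Smith" <ALICE@example.com>\n\nhello\n',
    "From: bob@example.org\n\nhey\n",
]
print(count_senders(sample))  # prints (2, 1)
```

Using sets for both counts keeps the result independent of the order in
which os.listdir happens to return the files.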

The results:

Senders: 1701
GG users: 879

... so just over 50% (879 of 1701, about 51.7%).

If anyone wants the complete output, just let me know and I'll email it
privately.

 -[]z.

[1] except for spam filtered out by Gmail.

-- 
Zero Piraeus: ad referendum
http://etiol.net/pubkey.asc


