[Tutor] Python NNTP Scripts?

Remco Gerlich scarblac@pino.selwerd.nl
Thu, 19 Apr 2001 20:18:27 +0200


Curtis Larsen <curtis.larsen@Covance.Com> wrote:
> Does anyone know where I could find some Python NNTP scripts to study?

I'll add to this post part of a late-night hack that I did about a year ago.
I was having a few beers and wanted to know which newsreaders people were
using in nl.*.

(Outlook 49.7%, Forte Agent 15.2%, Netscape 12.3%, Forte Free Agent 4.8%,
33 other newsreaders).

I must say I'm rather surprised how easy to understand the code is now - I
don't want to see the Perl I produce when slightly drunk...

This is the function for downloading the posts; the bit for getting the
bodies is commented out, and the posts are stored by Message-ID in one shelf
per group, in a directory per day:

import nntplib, string, time, shelve, os

def get_posts(grouplistfile, server='news.rug.nl'):
    # Connect to the server that was passed in, not a hard-coded one
    nntp = nntplib.NNTP(server)

    # One shelf per group, in a directory named after today's date
    today = time.strftime("%Y%m%d", time.localtime(time.time()))
    dirname = "/home/scarblac/python/nntphack/groups/%s" % today
    if not os.path.isdir(dirname):
        os.makedirs(dirname)

    groups = map(string.strip, grouplistfile.readlines())
    for group in groups:
        filename = os.path.join(dirname, "%s.dat" % group)
        if os.path.exists(filename):
            # This group was already downloaded today
            continue
        try:
            # This gets info about the list of articles in this group
            resp, count, first, last, name = nntp.group(group)
        except nntplib.NNTPError:
            continue
        db = shelve.open(filename)

        for i in xrange(int(first), int(last)+1):
            article = Article()
            try:
                resp, nr, msgid, lines = nntp.head(str(i))
                article.set_headers(lines)
                # resp, nr, msgid, lines = nntp.body(str(i))
                # article.set_body(lines)
                db[article.get_header("Message-ID")] = article
            except nntplib.NNTPError:
                # Article is gone (expired or cancelled); skip it
                pass
        db.close()

    nntp.quit()


And the Article class:

class Article:
    def __init__(self, headers=None, body=None):
        self.headers = {}
        # Always set the body, so get_body() works on an empty Article too
        self.body = body
        if headers:
            self.set_headers(headers)

    def set_headers(self, header_list):
        """Internal. Header_list is a list of header strings from nntplib."""
        for line in header_list:
            i = string.find(line, ": ")
            if i == -1:
                continue
            self.headers[line[:i]] = string.strip(line[i+1:])
            
    def set_body(self, body):
        self.body = body

    def get_headers(self):
        return self.headers
    def get_header(self, header):
        return self.headers.get(header, "")
    def get_body(self):
        return self.body
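
The counting step behind the newsreader percentages is not part of the code
above. A minimal sketch of how it could be done follows. The function names
and the choice of headers (X-Newsreader, User-Agent, X-Mailer) are
assumptions for illustration, not the original hack, and no attempt is made
to strip version numbers from the header values.

```python
def newsreader_of(article):
    """Return the client name an article claims, or "unknown"."""
    # Assumption: clients usually identify themselves in one of these headers.
    for name in ("X-Newsreader", "User-Agent", "X-Mailer"):
        value = article.get_header(name)
        if value:
            return value
    return "unknown"


def newsreader_shares(articles):
    """Return a list of (newsreader, percentage) pairs, most common first."""
    counts = {}
    total = 0
    for article in articles:
        reader = newsreader_of(article)
        counts[reader] = counts.get(reader, 0) + 1
        total += 1
    shares = [(reader, 100.0 * n / total) for reader, n in counts.items()]
    # Highest share first; ties are broken alphabetically
    shares.sort(key=lambda pair: (-pair[1], pair[0]))
    return shares
```

With one of the shelves written by get_posts() open as db, something like
newsreader_shares(db.values()) would give the shares for that group.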

> I'm especially interested in a Python-based news-searcher: how one
> would grab a collection of headers, then look through them using regex,
> then getting more if what you're looking for isn't there.  If it is
> there, then downloading the messages) for later perusal.  (What does the
> NNTP communications ebb and flow look like, etc.)

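What you see above should be enough. The sad thing is that in order to get
all the headers in NNTP, you have to fetch each article's head separately,
one HEAD command per article as in the loop above, and that is slow on a
busy group. There are but a few fields that you can get for a whole range of
articles at once with an XOVER command (Subject:, From:, Date:, Message-ID:,
References:, plus the size and line count); in Python's nntplib that is the
xover() method.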
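
The "grab some headers, search them with a regex, get more if nothing
matched" flow from the question could look roughly like the sketch below.
The function name search_subjects and the batch parameter are made up for
illustration. The sketch assumes that nntp.group() returns its numbers as
strings and that nntp.xover(start, end) returns (resp, list) with one
8-tuple per article (number, subject, poster, date, id, references, size,
lines), as in the nntplib of this Python; other versions may return a
different shape.

```python
import re


def search_subjects(nntp, group, pattern, batch=200):
    """Return (number, subject, message-id) for subjects matching pattern.

    Walks the group from the newest article to the oldest in batches and
    stops at the first batch that contains at least one match.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    resp, count, first, last, name = nntp.group(group)
    first, last = int(first), int(last)

    high = last
    while high >= first:
        low = max(first, high - batch + 1)
        # Assumption: xover() takes string article numbers and returns
        # (resp, list of 8-tuples), one tuple per article in the range.
        resp, overviews = nntp.xover(str(low), str(high))
        hits = []
        for nr, subject, poster, date, msgid, refs, size, lines in overviews:
            if regex.search(subject):
                hits.append((nr, subject, msgid))
        if hits:
            return hits
        # Nothing in this batch; ask for the next, older one
        high = low - 1
    return []
```

Each hit carries the article number, so the full message can then be
fetched with nntp.article(nr) and stored for later reading. On the wire
this is one GROUP command, then one XOVER per batch, then one ARTICLE per
message you decide to keep.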

-- 
Remco Gerlich