[Tutor] pickle to save time of access

Prahlad Vaidyanathan slime@vsnl.net
Tue, 15 Jan 2002 10:24:32 GMT


Hi,

I have been trying to implement a script to retrieve news for my home
system. I have attached the entire script below for your perusal, but I
am primarily interested in these two parts:

<code>
# Retrieves date from conffile 
f = open(conffile,'r')
f.seek(0)
oldtime = pickle.load(f)
f.close()
</code>

<code>
# Write new date to (the start of) conffile
f = open(conffile,'w')
f.seek(0)
pickle.dump(time.time(),f)
f.close()
</code>

As you can see, I pickle the value of time.time() to record the last
time the script ran successfully. I then compare the 'oldtime' variable
with the time of each news post. If the post's time (newtime) is
_after_ oldtime, I retrieve it; otherwise I ignore it.
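
The comparison described above could be sketched roughly as follows.
This is only an illustration: the helper name post_is_newer is
hypothetical, and it uses parsedate_tz and mktime_tz from email.utils
(the rfc822 module offers functions with the same names) so that the
zone offset in the Date header is taken into account. The script below
uses parsedate with time.mktime instead, which treats the header as
local time.

```python
from email.utils import parsedate_tz, mktime_tz


def post_is_newer(date_header, oldtime):
    """Return True if the post's Date header is later than oldtime.

    date_header is the raw value of the Date header; oldtime is seconds
    since the epoch, as returned by time.time().
    """
    parsed = parsedate_tz(date_header)
    if parsed is None:
        # Assumption: fetch a post whose date cannot be parsed rather
        # than silently dropping it.
        return True
    # mktime_tz applies the zone offset, so the result is a UTC timestamp.
    return mktime_tz(parsed) > oldtime
```

Two headers that name the same instant in different zones, such as
"Tue, 15 Jan 2002 10:24:32 GMT" and "Tue, 15 Jan 2002 15:54:32 +0530",
compare identically with this helper.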

Now, I was wondering if there is a better way of doing this. In
particular, could I:

* Abandon 'conffile' completely in favour of a better way of saving the
  time of the last successful run?
* Add error-checking to ensure that 'conffile' has not been written to
  since the last successful run?
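
One possible direction for both points, offered as a sketch rather than
a definitive answer: keep the file, but store the timestamp as plain
text and set the file's modification time to the same value. On the
next run, a modification time that no longer matches the stored value
suggests the file was written to in between. The names save_last_run,
load_last_run and TOLERANCE are hypothetical, and the code assumes a
Python version that has the 'with' statement.

```python
import os
import time

# Assumption: two seconds of slack covers filesystems that store
# modification times with coarse resolution.
TOLERANCE = 2.0


def save_last_run(path, when=None):
    """Write the timestamp as text and stamp the file's mtime with it."""
    if when is None:
        when = time.time()
    with open(path, "w") as f:
        f.write("%r\n" % when)
    # Make the file's access and modification times equal the stored value.
    os.utime(path, (when, when))


def load_last_run(path, default=0.0):
    """Return (timestamp, untouched).

    untouched is False if the file is missing, unreadable, or has a
    modification time that differs from the stored timestamp.
    """
    try:
        with open(path) as f:
            stored = float(f.read().strip())
    except (IOError, OSError, ValueError):
        return default, False
    untouched = abs(os.path.getmtime(path) - stored) <= TOLERANCE
    return stored, untouched
```

Passing a time recorded at the start of the run as 'when' would also
avoid skipping posts that arrive while the script is still fetching.
Note that this check only catches ordinary writes; anything that resets
the modification time afterwards would go unnoticed.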

Thanks.

pv.

The entire script is here:

#!/usr/bin/python -S

# TODO
# * Handle exceptions better
# * check for new groups every month or so
# * check if conffile is there/has been modified since last pickle.dump
# * try and make rfc822.Message from this - readline() ?

import smtplib, nntplib, pickle, time
from string import split
from sys import exit
from rfc822 import parsedate

# Edit these variables before proceeding
newshost = 'news.vsnl.net.in'
smtphost = 'marvin.clone'
newsgroups = ['alt.humor.best-of-usenet', 'comp.os.linux.announce', 
'comp.os.linux.answers']
to_address = 'Prahlad V <prahlad@marvin.clone>'
conffile = '/home/prahlad/.fetchnewsrc'

# Variables you needn't edit
debug_level = 0
agent_name = 'fetchnews'

# Drum roll
print agent_name, 'started'

# Open nntp connection
try:
    ns = nntplib.NNTP(newshost)
except:
    print 'Unable to connect to', newshost
    exit(1)
ns.set_debuglevel(debug_level)

# Open smtp connection (sans exceptions ;)
server = smtplib.SMTP(smtphost)

# Retrieves date from conffile
f = open(conffile,'r')
f.seek(0)
oldtime = pickle.load(f)
f.close()

# Start retrieving
for group in newsgroups:
    # Get group info
    resp, count, first, last, name = ns.group(group)
    print 'Group', name, 'has', count, 'articles, range', first, 'to', last
    resp, subs = ns.xhdr('subject', first + '-' + last)
    # Process each mail
    for id, sub in subs[-10:]:
        # Construct mail
        msg = ""
        # Headers
        flag = 0
        # Initialise once per article, so later headers cannot reset it
        fromaddr = ""
        for line in ns.head(id)[3]:
            try:
                head, value = split(line,": ",1)
            except ValueError:
                continue
            if head == "From":
                fromaddr = value
            if head == "Date":
                newtime = parsedate(value)
                if time.mktime(newtime) < oldtime:
                    flag = 1
                    break
                value = time.strftime('%a, %d %b %Y %H:%M:%S GMT', newtime)
            msg = msg + head + ": " + value + "\n"
        if flag:
            print "Ignoring : ", id, sub
            continue
        # Fallback
        if fromaddr == "":
            fromaddr = to_address
        # Put in the To header
        msg = msg + "To: " + to_address + "\n"
        # Put in a custom header for my procmail
        msg = msg + "X-NewsAgent: " + agent_name + "\n"
        # The body
        for line in ns.body(id)[3]:
            msg = msg + line + "\n"
        # Send the mail
        print 'Retrieving:', id, sub
        server.sendmail(fromaddr,to_address,msg)
    print 'Retrieved news from', group

# Write new date to (the start of) conffile
f = open(conffile,'w')
f.seek(0)
pickle.dump(time.time(),f)
f.close()

# Clean up
ns.quit()
server.quit()
print agent_name, 'done.'

-- 
Prahlad Vaidyanathan <slime@vsnl.net>

Buck-passing usually turns out to be a boomerang.