[Spambayes] Upgrade problem
Just van Rossum
just@letterror.com
Wed Nov 6 21:42:27 2002
Tim Stone - Four Stones Expressions wrote:
> This is why you keep a corpus. This is pre-alpha code, and anything that
> anyone does at any time can screw the world up. You should simply delete your
> database and retrain it. If you don't have a corpus, go ahead and make one
> now... <wink>
Alright, this triggered a feature request in me, which resulted in some hacking
activity <wink>. The patch below appends each training message to one of two mbox
files ('_pop3proxyspam.mbox' for spam, '_pop3proxyham.mbox' for ham), making it
easier to rebuild the database from scratch later, while still allowing ad hoc
training through the web interface of pop3proxy.py. Good idea?
Just
Index: pop3proxy.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/pop3proxy.py,v
retrieving revision 1.10
diff -c -r1.10 pop3proxy.py
*** pop3proxy.py 5 Nov 2002 22:18:56 -0000 1.10
--- pop3proxy.py 6 Nov 2002 21:37:03 -0000
***************
*** 608,615 ****
          raise SystemExit
  
      def onUpload(self, params):
!         message = params.get('file') or params.get('text')
          isSpam = (params['which'] == 'spam')
          self.bayes.learn(tokenizer.tokenize(message), isSpam, True)
          self.push("""<p>Trained on your message. Saving database...</p>""")
          self.push(" ")  # Flush... must find out how to do this properly...
--- 608,626 ----
          raise SystemExit
  
      def onUpload(self, params):
!         message = params.get('file') or params.get('text')
          isSpam = (params['which'] == 'spam')
+         # Append the message to a file, to make it easier to rebuild
+         # the database later.
+         message = message.replace('\r\n', '\n').replace('\r', '\n')
+         if isSpam:
+             f = open("_pop3proxyspam.mbox", "a")
+         else:
+             f = open("_pop3proxyham.mbox", "a")
+         f.write("From ???@???\n")  # fake From line (XXX good enough?)
+         f.write(message)
+         f.write("\n")
+         f.close()
          self.bayes.learn(tokenizer.tokenize(message), isSpam, True)
          self.push("""<p>Trained on your message. Saving database...</p>""")
          self.push(" ")  # Flush... must find out how to do this properly...
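
On the "XXX good enough?" question: a hand-written "From " separator works as long as no line in the message body itself starts with "From ", because mbox readers treat such lines as the start of a new message. What follows is only a rough sketch of one way around that, not part of the patch. It assumes a modern Python 3 and its standard-library mailbox module, which writes the separator line and escapes body lines beginning with "From " (as ">From "). The helper names append_training_message and iter_training_messages are made up for illustration, and encoding the text as UTF-8 is likewise an assumption.

```python
import mailbox
import os


def append_training_message(raw_message, is_spam, directory="."):
    """Append raw_message (a str) to the spam or ham mbox; return its path.

    Hypothetical helper: file names follow the patch, the rest is a sketch.
    """
    name = "_pop3proxyspam.mbox" if is_spam else "_pop3proxyham.mbox"
    path = os.path.join(directory, name)
    # Normalise line endings the same way the patch does.
    text = raw_message.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: UTF-8 is an acceptable encoding for the stored bytes.
    data = text.encode("utf-8")
    box = mailbox.mbox(path)  # the file is created if it does not exist
    try:
        box.lock()
        # add() writes a "From " separator line and escapes body lines
        # that begin with "From ", so they cannot split the message.
        box.add(data)
    finally:
        box.close()  # flushes, releases the lock, closes the file
    return path


def iter_training_messages(path):
    """Yield every message stored in the mbox file at path."""
    box = mailbox.mbox(path, create=False)
    try:
        for message in box:
            yield message
    finally:
        box.close()
```

Rebuilding the database would then amount to looping over iter_training_messages() for each of the two files and feeding every message back to the classifier. The escaped lines stay escaped when read back, which should not matter much for training purposes.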