[spambayes-dev] Dibbler.py error in training

Kenny Pitt kennypitt at hotmail.com
Tue Apr 6 15:55:46 EDT 2004


sean darcy wrote:
>> In looking more closely, though, something seems a little odd here. 
>> The offending object that is coming back None appears to be the
>> msg[header] reference.  If I'm not mistaken, that means that either
>> the Subject: or To: header is missing entirely from the message,
>> which is very unusual. 
> 
> It's not that unusual for the Subject header to be missing. Looking
> over past emails, I've found some "ham" posts that had no subject. In
> any event, some of the posts to be trained do have no Subject - all
> spam.

Well, it's certainly not unusual for the Subject: header to be empty, but
I didn't realize that it was legal to leave out the header entirely.
Guess I'll have to go back and re-read the spec! <wink>

Anyway, I checked in a new fix (Corpus.py 1.19) to guard against missing
headers, so give that a try when it comes through and let us know the
results.
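
For anyone following along, here is a minimal sketch of the kind of guard
involved. It is not the actual Corpus.py change; the helper name
header_or_default is hypothetical. It relies only on the documented behavior
of Python's email package, where msg[header] returns None when the header is
absent rather than raising an exception.

```python
from email import message_from_string


def header_or_default(msg, name, default=""):
    """Return the named header, or `default` if the header is missing.

    Hypothetical helper for illustration; not taken from Corpus.py.
    """
    value = msg[name]  # None when the header is absent
    return default if value is None else value


# A message with no Subject: header at all.
raw = "From: a@example.com\nTo: b@example.com\n\nbody\n"
msg = message_from_string(raw)

subject = header_or_default(msg, "Subject")  # empty string, not None
to_addr = header_or_default(msg, "To")
```

Code that goes on to call string methods on the result then sees an empty
string instead of None.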

>> Could you, by chance, attach a copy of the message that is causing
>> the error?
> 
> The untrained message page has about 60 messages. How do I know which
> one is the problem? 

Click the "Defer" heading to make sure that is the default for all
messages, then select a classification for only one message at a time to
see which one dies.  You can then go back to Review Messages and click
the subject of that message to display the message source.

>> A copy of it should appear as a file in one of the cache
>> directories below the directory containing your training database, or
>> you could just view the message source from Review Messages and
>> copy-and-paste it.
> 
> You've lost me. Here's my spambayes data directory:
> 
> ls
> bayescustomize.ini      _pop3proxy.log            pop3proxy-spam-cache
> bayescustomize.ini~     pop3proxy.log-1           pop3proxy-unknown-cache
> bayescustomize.ini.bak  pop3proxy.log-evolution   spambayes.messageinfo.db
> hammie.db               pop3proxy.log-evolution~  start.info
> pop3proxy-ham-cache     pop3proxy.log-mozilla

The pop3proxy-unknown-cache subdirectory contains copies of e-mails that
haven't been trained yet, up to the expiration age, which I believe
defaults to 7 days.  No worries, though.  The message source you
included in your message was what I was interested in.
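
If you ever want to find such a message without clicking through the UI,
something like the following rough sketch could scan the cache directory. It
assumes each file in the directory is a single plain-text message, which I
haven't verified for every configuration. The function name
find_messages_missing_headers is made up for this example and is not part of
SpamBayes.

```python
import os
from email import message_from_binary_file


def find_messages_missing_headers(cache_dir, required=("Subject", "To")):
    """Map each filename in cache_dir to the required headers it lacks.

    Hypothetical helper; assumes one plain-text message per file.
    Files that have every required header are left out of the result.
    """
    problems = {}
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        with open(path, "rb") as fh:
            msg = message_from_binary_file(fh)
        # msg[header] is None when the header is absent entirely.
        missing = [h for h in required if msg[h] is None]
        if missing:
            problems[name] = missing
    return problems


# Example call (path is relative to the spambayes data directory):
# find_messages_missing_headers("pop3proxy-unknown-cache")
```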

-- 
Kenny Pitt



