[Spambayes] language problems

Meyer, Tony T.A.Meyer at massey.ac.nz
Fri Jul 11 12:31:20 EDT 2003


> what kind of problems a spambayes user has when using a language 
> different of English?

Do you mean problems that stop the software from running, or problems
that make the classification less accurate?

For the first: apart from the Outlook plugin, everything *should* be
fine.  If anything doesn't work, please open a bug report and we'll do
what we can to fix it.  The Outlook plugin has some known problems with
non-English locales (there is an open bug report about it), particularly
those that use ',' as a decimal separator, rather than '.'.  We're
(slowly) working on it.
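
To illustrate the general kind of problem (this is only a sketch, not
the plugin's actual code, and parse_decimal is a hypothetical helper,
not part of SpamBayes): Python's float() accepts only '.' as the decimal
separator, so a value written in a ',' locale fails to parse unless it
is normalised first.

```python
def parse_decimal(text):
    """Parse a decimal written with either '.' or ',' as the separator.

    Hypothetical helper for illustration only; not part of SpamBayes.
    It does not handle thousands separators such as '1.234,5'.
    """
    return float(text.strip().replace(",", "."))


# float() only understands '.', whatever the user's locale is.
try:
    float("0,9")
    comma_accepted = True
except ValueError:
    comma_accepted = False

print(comma_accepted)
print(parse_decimal("0,9"))
print(parse_decimal("0.9"))
```

Whether this matches the exact cause of the open bug is an assumption on
my part; the bug report is the place to check.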

For the second: if the language is separated into words the way English
is (French, German, and probably Spanish, which might be the one you're
after, plus others that I don't know :), then things should be fine.
If not (as with many Asian languages), I don't think anyone really
knows what will happen ;)
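
A small sketch of why word separation matters, assuming a tokenizer that
simply splits on whitespace (a simplification; the real SpamBayes
tokenizer does more than this, and whitespace_tokens is a made-up name):

```python
def whitespace_tokens(text):
    """Split text on whitespace; a stand-in, not SpamBayes' tokenizer."""
    return text.split()


english = "cheap pills available now"
japanese = "今すぐ安い薬を購入"  # no spaces between words

english_tokens = whitespace_tokens(english)
japanese_tokens = whitespace_tokens(japanese)

# The English text yields one token per word; the Japanese text comes
# back as a single token, so there is little for a classifier to learn.
print(len(english_tokens))
print(len(japanese_tokens))
```

With a splitter like this, the whole unsegmented sentence is one clue,
and that clue is unlikely to turn up again in another message.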

SpamBayes will learn whatever you give it.  This does mean that if most
of your spam is in one language, and most of your ham in another, you
may get classification errors when you get ham in the normal spam
language (and vice-versa).  If all your mail is in the same language,
then you should be ok.  With careful training (as many examples of each
type, in each language), you should be able to get around this.  The
list archives have some (limited) discussion of this, including my idea
about translating words, which still interests me, but not enough to
spend time on it.
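
Here is a toy illustration of the language-skew effect.  It is a
deliberately crude scorer (an average of per-token spam ratios), not
the combining scheme SpamBayes actually uses, and the function names
and sample messages are invented for the example.

```python
from collections import Counter


def train(messages):
    """Count, per token, how many messages contain it."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts


def spam_score(text, spam_msgs, ham_msgs):
    """Average per-token spam ratio; unseen tokens count as 0.5.

    Crude stand-in for illustration, not SpamBayes' real scoring.
    """
    spam_counts, ham_counts = train(spam_msgs), train(ham_msgs)
    probs = []
    for token in set(text.lower().split()):
        s = spam_counts[token] / len(spam_msgs)
        h = ham_counts[token] / len(ham_msgs)
        probs.append(0.5 if s + h == 0 else s / (s + h))
    return sum(probs) / len(probs) if probs else 0.5


spam = ["buy cheap pills now", "cheap loans now"]          # all English
ham = ["nos vemos en la reunion", "gracias por la ayuda"]  # all Spanish

# A legitimate English message only shares tokens with the spam.
english_ham = "see you at the meeting now"
score_before = spam_score(english_ham, spam, ham)

# Train on some English ham as well and the score drops.
score_after = spam_score(english_ham, spam, ham + ["are you free now"])

print(score_before, score_after)
```

The English ham leans spammy only because the classifier has never seen
English in ham; adding even one English ham example pulls it back down.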

HTH.

=Tony Meyer


