[Spambayes] How do you classify text?

Tim Peters tim.one at comcast.net
Wed Apr 23 12:19:12 EDT 2003


[Miguel Sevillano]
>    I'm working on a project that must classify a paragraph as one of
> N subjects. I would like to know exactly how you take a paragraph and
> classify it, and how you train the filter.
>
>    I would like to apply Bayesian rules to distinguish among the N
> different subjects that a paragraph might be about.
>
>    I hope that you can help, because it's an important project for me.

The spambayes project doesn't (despite its name <wink>) do Bayesian
classification, or N-way classification:  it only scores a message as spam
versus non-spam ("ham").  A good paper on a good system that does both is
Jason Rennie's

    "ifile: An Application of Machine Learning to E-Mail Filtering"

The paper summarizes the classic Bayesian classification approach.  Do learn
how to use citeseer:  it's a great way to find papers on tech subjects!  The
citeseer record for the paper above is:

    http://citeseer.nj.nec.com/11099.html
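
To make "classic Bayesian classification" concrete, here is a minimal
sketch of an N-way multinomial naive Bayes classifier with add-one
smoothing.  It is a textbook illustration only:  it is not spambayes code,
and it is not claimed to match ifile's implementation.  The class name,
the tokenizer, and the toy training sentences are all made up for this
example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and pull out runs of letters/apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # subject -> word -> count
        self.doc_counts = Counter()              # subject -> # of training paragraphs
        self.vocab = set()                       # every word seen in training

    def train(self, subject, text):
        """Record one training paragraph known to be about `subject`."""
        tokens = tokenize(text)
        self.word_counts[subject].update(tokens)
        self.doc_counts[subject] += 1
        self.vocab.update(tokens)

    def scores(self, text):
        """Return log P(subject) + sum of log P(word | subject), per subject."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        result = {}
        for subject, n_docs in self.doc_counts.items():
            counts = self.word_counts[subject]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # prior
            for tok in tokens:
                # Add-one smoothing keeps unseen words from zeroing the product.
                score += math.log((counts[tok] + 1) / (total_words + vocab_size))
            result[subject] = score
        return result

    def classify(self, text):
        """Return the subject with the highest log score."""
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented for illustration.
nb = NaiveBayes()
nb.train("sports", "the team won the match with a late goal")
nb.train("cooking", "simmer the sauce and add garlic and salt")
nb.train("finance", "the bank raised interest rates on loans")

print(nb.classify("add salt to the sauce"))
```

Training is nothing more than counting words per subject; classifying adds
up log probabilities and picks the largest.  Working in logs avoids
floating-point underflow on long paragraphs.  A real system needs far more
training text per subject than this, and a better tokenizer.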



