Any Neural Net code in Python? I want to filter out spam email

s713221 at student.gu.edu.au s713221 at student.gu.edu.au
Fri Apr 20 08:08:17 EDT 2001


Romuald Texier wrote:
> 
> Ken Seehof wrote:
> 
> > Excellent idea, Dan.  That's conveniently sidesteps the most difficult
> > issue: getting the neural network to actually come up with linguistic
> > rules.  Once an intelligent human specifies the set of rules, the neural
> > net should have no difficulty coming up with an optimal non-linear
> > function of pre-processed features (i.e. the "rules") to identify spam.
> > Analysis of the weights after training will help remove rules that turn
> > out to be irrelevant.
> 
> Wouldn't decision tree or other rule inference algorithms be more accurate
> than neural networks for that kind of machine learning ? Moreover, neural
> nets are "black box" : you do not get a logical rule set (that may be
> edited by humans or exchanged) but an ugly matrix of floats...
> 
> --
> Romuald Texier

Not for all neural net algorithms. There are a variety of neural net
algorithms that, when backpropagation is applied to a node, it's closest
neighbour also recieves a little bit of this weighting adjustment as
well. What you end up with is a grouping of similarly behaved nodes,
which should be easier to extract rule sets from.

However, if a simple weighting of rules are all that one is after, maybe
a genetic algorithm is all that is required, which does give the benefit
of simple rule extraction by humans. Give it an ~ 20 sized population of
weightings, apply stociastic propagation with all the mutation,
cross-over functions etc., and let the population climb to an optimal
peak. Of course, neural nets have the advantage that they include the
posibility of non-linear behaviour. To include non-linearity for the
genetic algorithm, you'd need to do things like this:

f_n is the n'th filtering rule
w_n is the linear weighting associated with this rule
w_nm is the bilinear weighting associated with rules n and m
And run the genetic algoritm for this, with penalties for using w_nm's.
Only hassle is, for a rule set of size N, a population that tries to
solve a complete solution has to solve for N + N^2 (allowing for w_nn
values) weightings. 

Maybe a genetic algorithm which allows incomplete but expandable
solutions, that penalises for computational size? Everything I wrote
beyond this point was too vague to include so I'm shutting up now.
____________snip___________________

Joal Heagney/AncientHart



More information about the Python-list mailing list