Bayesian kids content filtering in Python?

Tue Sep 2 17:46:50 EDT 2003

ed at digitallumber.net (Ed Stoner) writes:

> I've been looking at this sort of thing for a while now.  I initially
> tried what you did about a year and a half ago and ran into some
> problems.  I ended up writing my own http proxy, html parser, and
> bayes filtering code.  It is being used now by Woodland Hills School
> District in Pittsburgh, PA.  The school has 2300 computers, 6000
> students, and 3 T1 lines to the Internet.  Most of the problems I ran
[...]

Interesting.  How closely has the success of Willow at the school been
monitored?  Has somebody taken a statistical sample of unlucky kids
and watched their surfing closely, for instance?  If so, how does it
fail?  I think the ways that it fails are probably more interesting than
the frequency of failure at a particular site.  I suppose one problem
is that the frequency of abuse is likely to be low once you start
cracking down!

What have been your experiences with attempting to subvert your own
system?

Of course, the fundamental problem with testing these things is that a
successful block doesn't tell you anything about all the cases where
blocking would fail...
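
To make that concrete, here is a rough Python sketch, assuming
somebody hand-reviews a random sample of page views and records, for
each one, whether the proxy blocked it and whether a human judged it
objectionable.  The function name, the tuple layout and the numbers
are all made up for illustration; none of this comes from Willow.

```python
from collections import Counter


def audit_summary(sample):
    """Summarise a hand-reviewed sample of page views.

    `sample` is an iterable of (blocked, objectionable) booleans:
    `blocked` is what the filter did, `objectionable` is what the
    human reviewer decided.  (Hypothetical layout, for illustration.)
    """
    counts = Counter()
    for blocked, objectionable in sample:
        if blocked and objectionable:
            counts["true_block"] += 1
        elif blocked:
            counts["false_block"] += 1
        elif objectionable:
            counts["missed"] += 1          # the failures a block log never shows
        else:
            counts["correctly_allowed"] += 1

    # Fraction of objectionable pages that got through the filter.
    bad = counts["true_block"] + counts["missed"]
    miss_rate = counts["missed"] / bad if bad else None
    return counts, miss_rate


# Invented numbers: 100 reviewed page views.
sample = (
    [(True, True)] * 8       # blocked, and rightly so
    + [(True, False)] * 1    # blocked, but harmless
    + [(False, True)] * 2    # objectionable, but let through
    + [(False, False)] * 89  # harmless and allowed
)
counts, miss_rate = audit_summary(sample)
print(counts["true_block"] + counts["false_block"], "blocks logged")
print("miss rate among objectionable pages:", miss_rate)
```

The block log on its own would only give you the first number.  The
"missed" count only exists if somebody looks at the traffic that was
allowed through, which is why I'm asking about sampling.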

John