[spambayes-dev] SpamBayes core_server.py and related bits merged to CVS HEAD

skip at pobox.com
Sun Jun 10 19:06:13 CEST 2007


    skip> For Reimar and Marian (the MoinMoin gurus), I did a very little
    skip> bit of performance testing.  Roundtrip performance on my laptop
    skip> (Mac PowerBook G4 - 800MHz) with both the server and client
    skip> running on the same machine ranged anywhere from 10-50 bytes/ms.
    skip> When I added a large payload (a MIME encoded JPEG file of 9.5MB)
    skip> performance in terms of bytes/ms shot way up, but as you would
    skip> imagine overall time did as well.  Here are some figures:

    skip>     attachment     time          bytes/ms
    skip>        size
    skip>     9587824        30.7 sec      312
    skip>      975978         3.7 sec      259
    skip>      114794         0.5 sec      252
    skip>       28675         0.2 sec      142

I probably should have drawn some inferences from this.  First, if you
really try to score 100MB payloads (Reimar & Marian suggested that some
people routinely attach 100MB Word (I think) files to wikis), you're going
to be disappointed: at the roughly 300 bytes/ms I measured for the largest
attachment, a single request would take more than five minutes.  Second,
although attachments of that size would be problematic, SpamBayes doesn't
examine the guts of binary data, so there's probably nothing wrong with
trimming the binary file to a reasonable size (< 1MB?) and including that
trimmed version in the score request.
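
A minimal sketch of that trimming idea, in present-day Python using only
the standard email package.  The names trim_binary, build_score_message
and MAX_BINARY_BYTES are made up for illustration, the 1,000,000 byte
cutoff is just the "< 1MB?" guess above, and nothing here reflects how
core_server.py actually accepts requests:

```python
from email.message import EmailMessage

# Assumed cutoff, taken from the "< 1MB?" guess; tune to taste.
MAX_BINARY_BYTES = 1_000_000


def trim_binary(data: bytes, limit: int = MAX_BINARY_BYTES) -> bytes:
    """Return at most `limit` leading bytes of a binary attachment."""
    return data if len(data) <= limit else data[:limit]


def build_score_message(text: str, attachment: bytes,
                        filename: str) -> EmailMessage:
    """Wrap page text plus a trimmed attachment in one MIME message.

    Hypothetical helper: the client is assumed to submit a MIME
    message for scoring, so the attachment is cut down before it is
    ever encoded and sent.
    """
    msg = EmailMessage()
    msg.set_content(text)
    msg.add_attachment(trim_binary(attachment),
                       maintype="application",
                       subtype="octet-stream",
                       filename=filename)
    return msg


if __name__ == "__main__":
    big = b"\x00" * 3_000_000          # stand-in for a large upload
    msg = build_score_message("wiki page text", big, "big.doc")
    part = next(msg.iter_attachments())
    print(len(part.get_content()))     # size of the trimmed attachment
```

Trimming before the MIME encoding step means the client never pays to
base64-encode or transmit the bytes that get thrown away.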

Also, note that I've really done nothing with non-ASCII data to this point.
I suspect people more familiar with that will see a clear path to sanity
fairly easily.

Skip


