[spambayes-dev] Dev environment setup

Tony Meyer tony.meyer at gmail.com
Sat Dec 2 03:20:30 EST 2023


Hi,

For what it's worth, of the storage backends in the sourceforge version:

* pickle: probably still best for the research tools since it's basically
just an in-memory dict that is dumped to disk
* dbm: this could be bsddb3, but also gdbm and others. Out of date.
* Postgres: should still be fine - could be worth using if you already have
a Postgres server
* MySQL: same as Postgres (MariaDB also works with this)
* CDB: out of date
* ZODB: no idea if this would still work
* Zeo: probably doesn't work

I believe Skip's copy on GitHub already drops a bunch of these.

If you're not already using MySQL/MariaDB/Postgres and aren't doing
training, then my guess would be that sqlite3 would likely be the best
storage backend to add. It's fast, has stdlib support, does everything
needed if you don't need concurrent access or replication, and should be
able to mostly use the existing SQLClassifier base class.

I could help/write a sqlite3 storage class if anyone needed that.
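
As a rough illustration of the shape such a class might take (a minimal
sketch only: the class name, table name, and column names are assumptions
made up for this example, and it does not use the existing SQLClassifier
base class or any other SpamBayes API), per-word spam and ham counts can be
kept in a single sqlite3 table:

```python
import sqlite3


class SqliteWordStore:
    """Hypothetical word-count store; illustrative, not part of SpamBayes."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the sketch self-contained; pass a file path to persist.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS bayes ("
            " word TEXT PRIMARY KEY,"
            " nspam INTEGER NOT NULL DEFAULT 0,"
            " nham INTEGER NOT NULL DEFAULT 0)"
        )
        self.conn.commit()

    def learn(self, words, is_spam):
        """Add one to the spam or ham count of each distinct word."""
        # The column name comes from this fixed pair, never from user input.
        column = "nspam" if is_spam else "nham"
        with self.conn:  # commits on success, rolls back on error
            for word in set(words):
                self.conn.execute(
                    "INSERT OR IGNORE INTO bayes (word) VALUES (?)", (word,)
                )
                self.conn.execute(
                    f"UPDATE bayes SET {column} = {column} + 1 WHERE word = ?",
                    (word,),
                )

    def counts(self, word):
        """Return (nspam, nham) for a word, or (0, 0) if it is unseen."""
        row = self.conn.execute(
            "SELECT nspam, nham FROM bayes WHERE word = ?", (word,)
        ).fetchone()
        return row if row is not None else (0, 0)
```

A real storage class would also need to track the total number of spam and
ham messages trained, which this sketch leaves out.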

Factoring in the language can definitely help accuracy. I probably can't go
into too much detail on this, but I have used SpamBayes with language
detection additions in very large scale deployments with good results.
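
One simple way to factor in language is to prefix every token with a
language code, so that the same spelling in two languages is counted as two
separate tokens. A minimal sketch, assuming the language code comes from
some detector passed in as a callable (the function names here are
hypothetical and not part of SpamBayes, and the whitespace tokenizer is a
stand-in for the real one):

```python
def tag_tokens(tokens, language):
    """Prefix each token with a language code, e.g. 'de:gift'."""
    return [f"{language}:{token}" for token in tokens]


def tokenize_with_language(text, detect_language):
    """Split text into lowercase words and tag them with the detected language.

    detect_language is any callable that takes the message text and returns
    a short language code such as 'en' or 'de'.
    """
    tokens = text.lower().split()
    return tag_tokens(tokens, detect_language(text))
```

With this, "gift" in a German message and "gift" in an English message are
trained and scored independently. A detector such as the detect() function
from the langdetect package could be passed as detect_language.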

Ngā mihi,
Tony

On Sat, 2 Dec 2023 at 2:04 AM, Marko von Oppen <marko at von-oppen.com> wrote:

> Hello John,
>
> in the last few days I did some additional work based on the code from
> here:
>
> https://github.com/mpwillson/spambayes3
>
> Here at least the Spambayes core functionality is already running under
> Python3 (There are other similar repositories available too).
>
> It also supports Berkeley DB. As I understand it, Berkeley DB is no longer
> supported by modern Linux distributions such as Debian, so it's not an
> option for me.
>
> The code in the repository seems to support standard Python pickle and
> ZODB pickle, which are available at least on modern Linux. From the
> comments in the code I assume that ZODB is the best option when you do not
> want to use a real SQL database backend, but I have not used Zope/ZODB on
> Windows for nearly 20 years, so I don't know if ZODB is an option for you.
> Maybe ZODB is easily available via pip.
>
> As I'm planning to use sb_filter.py, I re-activated MySQL as DB in the
> code there and it is running well against my old Spambayes database.
> PostgreSQL should work too.
>
> Currently I'm working on some modernization of this code, using argparse instead
> of getopt, the logging module instead of printf(), modern string formatting
> and so on. Also I'll do some SQL optimizations ("insert ... on duplicate
> key ...", specify charset on table creation and so on).
>
>
> I also want to implement a new feature by adding support for "langdetect".
> As I get emails in different languages with overlapping words and very
> different Ham/Spam ratios, I suppose tagging all words with the email's
> language could reasonably improve the results.
>
> I'll share my changes here https://github.com/mvoppen/spambayes3 soon.
>
>
> BR
>
> Marko
>
>
>
> Am 01.12.2023 um 02:39 schrieb John Cherney:
>
> I think I’ve gotten most of the code converted to Python 3.12. I’m at the
> point now where I have to get the database working. It looks like bsddb is
> only supported up to 3.9. Its replacement is berkeleydb from Oracle, but
> I’d need to set up an account to download the installer. I’m assuming
> berkeleydb is freeware and redistributable, but I’d have to look into that.
> Before I go that far, though, is there a different database (maybe one
> already built into python, or a python friendly db) that you would
> recommend? Ideally, the db would be just a drop-in replacement for bsddb.
>
>
>
> Thanks,
>
> jwc
>
>
>
> *From:* John Cherney <jwcherney at hotmail.com>
> *Sent:* Saturday, November 18, 2023 6:07 PM
> *To:* Marko von Oppen <marko at von-oppen.com>; Skip Montanaro
> <skip.montanaro at gmail.com>; mhammond at skippinet.com.au
> *Cc:* spambayes-dev at python.org
> *Subject:* RE: [spambayes-dev] Dev environment setup
>
>
>
> Thank you, Marko! That did help. I found DebugView at
> https://learn.microsoft.com/en-us/sysinternals/downloads/debugview. When
> I re-enable the spambayes add-in, I see the errors pop up in the debugview
> window. At this point, I haven’t had to install anything new. I’m playing
> around with the PATH and PYTHONPATH so the modules are found, and then
> adjusting code to python3. Fortunately, the changes fall within my limited
> knowledge of python and ability to google. It isn’t fast progress, but it’s
> progress.
>
>
>
> For your other question: Spambayes with Python3, there is none that I
> could find. I’m hoping it’s just syntax changes that are required to get
> this running on a 64-bit MS Office install with Python3.
>
>
>
> Thanks!
>
> jwc
>
>
>
>
>
> *From:* Marko von Oppen <marko at von-oppen.com>
> *Sent:* Thursday, November 16, 2023 7:38 PM
> *To:* John Cherney <jwcherney at hotmail.com>; Skip Montanaro <
> skip.montanaro at gmail.com>; mhammond at skippinet.com.au
> *Cc:* spambayes-dev at python.org
> *Subject:* Re: [spambayes-dev] Dev environment setup
>
>
>
> Hello jwc,
>
> some 15 years ago I was a little bit involved in the development of the
> Outlook plugin. I do not remember much because I also switched to
> Thunderbird in 2007 and have not used Outlook since then, nor have I
> developed anything for Windows since then except some Python console
> scripting.
>
> One thing I do remember, since you write that you see no output: in the
> development tools at that time there was a Windows function
> "OutputDebugMsg()" and an application from Microsoft named "Debug Monitor"
> that caught all debug output. If I remember correctly, things like Python
> exceptions were redirected to that application if no other debugger was
> running.
>
> I do not know if this mechanism still exists in modern Windows but maybe
> it could be a hint where to start searching.
>
> BR
>
> Marko
>
>
>
> Am 15.11.2023 um 04:48 schrieb John Cherney:
>
> Thank you for the info and support!
>
>
>
> Do either of you use the Outlook client, and if so, can you suggest an
> alternate spam filter? I agree gmail does a great job keeping out spam. I
> want to still use the Outlook client for my Microsoft email accounts.
> (Although I suppose I could be convinced to use their web client, but I
> think its spam filtering is just like the Outlook client)
>
>
>
> With the help of Google and StackOverflow, I was able to get the local
> code running with Python3.12 and get the Add In registered. But it has
> the same issue as with the original 1.1a6 install. Outlook complains that
> there is a runtime error on loading of the Add In, and the add-in is
> disabled. Any ideas on where I could go to see that exception? I was hoping
> something would go in the EventViewer, or in a log file somewhere, but I
> haven’t found anything yet.
>
>
>
> Thanks,
>
> jwc
>
>
>
> *From:* Skip Montanaro <skip.montanaro at gmail.com>
> *Sent:* Monday, November 13, 2023 2:57 PM
> *To:* mhammond at skippinet.com.au
> *Cc:* John Cherney <jwcherney at hotmail.com>; spambayes-dev at python.org
> *Subject:* Re: [spambayes-dev] Dev environment setup
>
>
>
> Thanks for responding Mark. As you indicated, SpamBayes has been on
> long-term hiatus. The biggest impediments for me are a) Gmail does a good
> job, and b) I've so far been unable to convince anyone with Windows
> packaging experience to update that side of things.
>
>
>
> That said, porting most of it to Python 3 isn't likely to be all that
> difficult. A couple of us have taken partial cracks at it.
>
>
>
> Skip
>
>
>
> On Mon, Nov 13, 2023, 10:13 AM Mark Hammond <skippy.hammond at gmail.com>
> wrote:
>
> I don't think SpamBayes has any current developers and most work dried up
> even before Python 3 was ready. Modern Outlook also hasn't been tested. So
> I suspect you are probably on your own here, but those of us left holding
> the keys would be happy to arrange for any doc etc. changes you might make
> to get committed if possible.
>
> Cheers,
>
> Mark
>
> On 2023-11-12 10:57 p.m., John Cherney wrote:
>
> Is there a recommended set of versions of tools and libraries for
> Spambayes development? (In particular, is there a recommended setup for
> python 3.x?)
>
>
>
> I would like to get Spambayes to work on a Windows 10 64-bit machine. The
> current recommendation works on my machine: Use the 32-bit version of
> Office 365 Outlook and install Spambayes into a non-standard directory like
> C:/Spambayes. Given that the software works under those conditions on a
> 64-bit machine, it makes me think that there is some interaction (registry
> key, etc) missing for Outlook 64-bit to recognize Spambayes. Maybe this is
> something as simple as rebuilding the installer on a 64-bit machine? (Ok,
> realistically, I’m sure someone has already tried that.)
>
>
>
> Additionally, is there any way to see the runtime error being generated?
> Outlook 64-bit recognizes that Spambayes is there but generates this
> message in the Add-ins window: “Load Behavior: Not loaded. A runtime error
> occurred during the loading of the COM Add-in.” Where can I see that
> runtime error? Maybe it’s a problem that’s already been solved? I remember
> seeing some DEP related issues with Outlook and/or the Spambayes plugin. I
> didn’t have any luck trying to disable DEP on the machine or disable it for
> Outlook (per the FAQ). Any other suggestions?
>
>
>
> Thanks,
>
> jwc
>
>
>
>
>
>
>
> _______________________________________________
>
> spambayes-dev mailing list
>
> spambayes-dev at python.org
>
> https://mail.python.org/mailman/listinfo/spambayes-dev
>
> --
>
> *Marko von Oppen - Technische Software*
> Nürnberger Str. 43, 01187 Dresden
> fon +49 (0)351 21118511, fax +49 (0)351 27229679
> e-mail marko at von-oppen.com web www.von-oppen.com