[Mailman-Users] Archive merge and search
Hal
my_list_address at yahoo.no
Tue Nov 18 15:35:46 CET 2014
On 07/11/2014 19:42, Mark Sapiro wrote:
> On 11/07/2014 03:52 AM, Hal wrote:
>> For not allowing text/html
[snip]
>> But does the above apply to *all* archived postings, or does it only
>> filter anything that comes in from now on?
>
>
> It applies to all posts that arrive after you make those settings, both
> in the archive and in the messages delivered to the list. It won't
> affect messages already archived or in the mylist.mbox file, even if you
> rebuild the archive.
That's fine, but good to know for later.
So for any new messages from now on I want my list to work this way:
1) HTML formatted postings should be converted to plain text before
reaching other members.
2) HTML formatted postings can retain their formatting for the archive
(I believe the archive is in the HTML format anyway?), but if it only
archives whatever is sent to list members I don't mind. The important
thing is that members receive plain text messages.
3) Since many people have their email programs set by default to send in
HTML these days I just want Mailman to do its filtering, then continue
by sending the posting as plain text without any moderator request or
alerting the sender.
4) I'd like to block all attachments (list members should only receive
plain text messages).
40kb is already set for Max_message_size (in "General options" within
the list administration web interface) which seems to have worked fine
(as far as I know).
Furthermore I understand that Filter_filename_extensions (in the
"Content filtering" section) additionally removes any attachments based
on specific filename *extensions*, regardless of their file size?
I see exe, bat, cmd and a bunch of other filetypes I've never heard of
(geared towards Windows/DOS users I suppose -I'm a Mac user) are listed,
but I suppose I could block .zip and those pesky .vcf/.vcard and
"winmail.dat" files the same way.
When such extensions are encountered, are they just removed from the
messages while the message posting itself is passed on to list members,
or is the whole posting stopped for approval first?
I'm thinking out loud here, so feel free to chime in with better ideas,
but I see two kinds of attachments which need different actions to be
taken:
Deliberate attachments: zip files, gif/jpg images etc. which a poster
wants to share. The message/attachment should be stopped from reaching
the list and an email sent to the poster with a "your message has been
blocked. Please resend your message, this time without an attachment"
type of message.
Accidental attachments: winmail.dat, .vcf or .vcard and so on. Many
users don't know (as with HTML postings) that their email program is set
up to send this stuff. IMHO those attachments don't have anything to do
with the actual content of their postings, so Mailman should just remove
the attachment(s), then pass on the rest of the message to the list.
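
To make the two-group idea concrete, here is a minimal sketch of that policy using only Python's standard `email` package. It is not Mailman code and does not claim to mirror how Mailman's own content filter behaves; the names `REJECT_EXTS`, `STRIP_EXTS`, `extension` and `triage` are made up for illustration, and only top-level parts are stripped.

```python
import os
from email.message import EmailMessage

# Hypothetical extension sets, chosen to match the two groups above.
REJECT_EXTS = {"zip", "gif", "jpg", "jpeg"}   # deliberate: send back to poster
STRIP_EXTS = {"dat", "vcf", "vcard"}          # accidental: remove quietly


def extension(filename):
    """Return the lowercased extension without the dot, or ''."""
    return os.path.splitext(filename or "")[1].lstrip(".").lower()


def triage(msg):
    """Classify a message and strip accidental attachments in place.

    Returns ("reject", names) if any deliberate attachment is present,
    otherwise ("pass", names_of_removed_parts).
    """
    names = [p.get_filename() for p in msg.walk() if p.get_filename()]
    deliberate = [n for n in names if extension(n) in REJECT_EXTS]
    if deliberate:
        return "reject", deliberate

    stripped = []
    if msg.is_multipart():
        kept = []
        for part in msg.get_payload():
            name = part.get_filename()
            if name and extension(name) in STRIP_EXTS:
                stripped.append(name)
            else:
                kept.append(part)
        msg.set_payload(kept)  # keep only the parts that survived
    return "pass", stripped
```

A post carrying only winmail.dat would come back as "pass" with that part removed, while a post carrying a zip file would come back as "reject" and be left untouched so it can be returned to the sender.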
Having said that, have I understood things correctly by setting up my
"Content filtering" options as follows? (based on what you've said and
what I've read here:
http://wiki.list.org/pages/viewpage.action?pageId=4030684):
Edit_filter_content:     YES
Filter_mime_types:       (left blank)
Pass_mime_types:         multipart
                         message/rfc822
                         text/plain
                         text/html
filter_filename_ext.:    exe
                         bat
                         cmd
                         com
                         pif
                         scr
                         vbs
                         cpl
                         zip
                         dat
                         vcf
                         vcard
pass_filename_ext.:      (left blank)
Collapse_alternatives:   YES
conv_html_to_plaintext:  YES
Filter_action:           DISCARD
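
For anyone who prefers the command line to the web interface, the same settings could be written as an input file for Mailman 2.1's bin/config_list, which reads its input as Python. This is only a sketch: the variable names below are the Mailman 2.1 list attribute names as I understand them, and the triple-quoted, one-entry-per-line layout is an assumption about what config_list accepts, so it is worth dumping the current settings with `config_list -o` first and comparing before loading anything.

```python
# Sketch of a config_list input file for the settings listed above.
# Assumption: names and value formats match what `config_list -o` prints
# on your installation; verify against that output before using.

filter_content = True             # enable content filtering

filter_mime_types = ""            # left blank

pass_mime_types = """multipart
message/rfc822
text/plain
text/html"""

filter_filename_extensions = """exe
bat
cmd
com
pif
scr
vbs
cpl
zip
dat
vcf
vcard"""

pass_filename_extensions = ""     # left blank

collapse_alternatives = True
convert_html_to_plaintext = True

# 0 = Discard, 1 = Reject, 2 = Forward to list owner, 3 = Preserve
filter_action = 0
```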
>>> A different obfuscation for email addresses would require source code
>>> modification. I.e., there's no 'plugin' for it.
>>
>> Is this a feature that could be suggested for the upcoming Mailman 3?
>> Perhaps an optional user-configuration through the web admin interface?
>
>
> Mailman 3 uses different and 'pluggable' archiving. The archiver that is
> bundled with MM 3 is called Hyperkitty. I'm not sure what address
> obfuscation it does.
Thanks. I'll look into it.
>> Failing that, is there a way I could have the (currently private)
>> archive have a filter before HTTP access?
>
> You could create your own CGI or other web process to access the
> archives and present them any way you want.
Being ignorant on the subject, what kind of pre-written CGI script
should I try to find (e.g. a "search engine to web archive gateway" or
something like that)?
You previously suggested htdig (http://www.htdig.org/) with your patches
for allowing my visitors to search through both the Mailman archives and
my website. Assuming this is a more ready-to-use solution than the other
search engines out there, are there features I will be missing out on
(e.g. the ability to use CSS and Ajax for making its search results
appear more in line with the rest of my website) and is it still secure?
I've read that malicious code can sometimes be entered as search phrases
and damage the database if the search engine isn't using "parametrized
queries".
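
For reference, this is roughly what a "parametrized query" looks like, sketched with Python's built-in sqlite3 module. The `posts` table and the `search` function are invented for the example and have nothing to do with how htdig or any of the other engines store their index; whether a given search engine uses an SQL database at all is a separate question.

```python
import sqlite3

# Hypothetical in-memory table standing in for a search index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (subject TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [
        ("Archive merge and search",
         "Has anyone tried htdig with the Mailman archives?"),
        ("Content filtering",
         "How do I strip winmail.dat attachments?"),
    ],
)


def search(conn, phrase):
    """Return subjects of posts whose body contains the phrase.

    The phrase is passed as a bound parameter (the '?' placeholder),
    so it is treated purely as data and never parsed as SQL.
    """
    pattern = "%" + phrase + "%"
    cur = conn.execute(
        "SELECT subject FROM posts WHERE body LIKE ? ORDER BY subject",
        (pattern,),
    )
    return cur.fetchall()
```

With this approach a search phrase such as `x'; DROP TABLE posts; --` is simply looked for as literal text and finds nothing; the risky pattern is building the SQL string by pasting the visitor's input into it.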
I've found other search engines (Nutch, Lucene, Solr, Tipue search,
Xapian and Ajax live search) but I have no idea if they're suitable for
my use and how well they work or how difficult they are to set up.
Opinions from anyone are highly appreciated.
Thanks.
Hal