Mailergate (was: python docs search for 'print')
Thomas 'PointedEars' Lahn
PointedEars at web.de
Wed Sep 5 14:42:18 EDT 2012
Stephen D'Aprano wrote:
> On Tue, 04 Sep 2012 20:27:38 +0200, Thomas 'PointedEars' Lahn wrote:
>> ¹ The other mess they created (or allowed to be created) is this mashup
>> of newsgroup and mailing list, neither of which works properly,
>
> In what way do they not work properly?
Most prominently, threads are completely and utterly borken.
>> because the underlying protocols are not compatible.
>
> What?
>
> That is rather like saying that you can't read email via a web interface
> because the http protocol is not compatible with the smtp protocol.
Apples and oranges. The problem is gating messages from a mail server to a
news server and vice-versa without regard to the differences between the
underlying protocols.
Netnews User Agents (NUAs, newsreaders) are currently based on [RFC3977]
and [RFC5536].
In a Netnews article, a References header field is mandatory for a posting
that is a follow-up. (Threading by Subject and Date works poorly, if at
all, so the Specification does not suggest that.) The last element of the
References header field value has to be a Message-ID specifying the
article's precursor. That Message-ID has to match the Message-ID header
field value of an existing posting, unless it has expired on the target
newsserver or was canceled (with Supersedes being a special case). The
In-Reply-To header field (see below) is not allowed there, but it is set by
some hybrid MUA/NUAs like Mozilla Thunderbird anyway¹.
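To illustrate the rule above, here is a minimal sketch of how a newsreader
could pick the precursor from a References header field value. The function
name and the example.invalid Message-IDs are made up for illustration; real
newsreaders do considerably more.

```python
import re

# One Message-ID in angle brackets, e.g. "<abc@example.invalid>".
MSG_ID = re.compile(r"<[^<>\s]+>")


def precursor_id(references):
    """Return the last Message-ID of a References value, or None."""
    ids = MSG_ID.findall(references)
    return ids[-1] if ids else None


# A folded References value as it might appear in a second-level follow-up
# (hypothetical IDs).
refs = "<root@example.invalid>\n <reply1@example.invalid>"
print(precursor_id(refs))
```

The last element names the direct precursor; the elements before it only
describe the ancestry of the thread.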
Mail User Agents (MUAs, mailreaders), on the other hand, are currently based
on [RFC5321], [RFC1939], IMAP4 (various RFCs, starting with [RFC1730]), and
last but not least [RFC5322].
There are two possible header fields to build a thread of e-mail messages:
In-Reply-To and References. The first header field's value is supposed to be
a Message-ID; the second one's is as described in [RFC5536]. Few MUAs set
both, some set the first one, and many set neither, because there is no
absolute requirement to set either of them (see [RFC5322], section 3.6.4).
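A rough sketch of what that means for a mail client that wants to thread
messages: it has to cope with all three cases (both fields, one field, no
field). The function name and the sample messages below are assumptions for
illustration, not taken from any particular MUA.

```python
import re
from email import message_from_string

# One Message-ID in angle brackets.
MSG_ID = re.compile(r"<[^<>\s]+>")


def mail_parent(raw_message):
    """Guess the parent Message-ID of an e-mail, or return None.

    In-Reply-To is preferred if present; otherwise the last element of
    References is used; if neither is set, there is no metadata to go on.
    """
    msg = message_from_string(raw_message)
    in_reply_to = MSG_ID.findall(msg.get("In-Reply-To") or "")
    if in_reply_to:
        return in_reply_to[0]
    references = MSG_ID.findall(msg.get("References") or "")
    return references[-1] if references else None


# Hypothetical messages covering the three cases.
both = (
    "In-Reply-To: <a@example.invalid>\n"
    "References: <root@example.invalid> <a@example.invalid>\n"
    "Subject: Re: test\n\nbody\n"
)
refs_only = (
    "References: <root@example.invalid>\n <b@example.invalid>\n"
    "Subject: Re: test\n\nbody\n"
)
neither = "Subject: Re: test\n\nbody\n"

for raw in (both, refs_only, neither):
    print(mail_parent(raw))
```

In the last case only heuristics such as Subject and Date remain, which is
exactly what works poorly.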
And then there is utterly borken software – or shall we say utterly borken
approaches? Consider for example the recent thread with Subject "simple
client data base" started by Mark R Rivet. The original posting has:
| User-Agent: ForteAgent/7.00.32.1200
(posted using a newsreader)
| […]
| Message-ID: <lae9489ct99mp704um93sdqlatofb2i8gq at 4ax.com>
Chris Angelico's follow-up to that has
| In-Reply-To: <lae9489ct99mp704um93sdqlatofb2i8gq at 4ax.com>
| References: <lae9489ct99mp704um93sdqlatofb2i8gq at 4ax.com>
| […]
| Message-ID: <mailman.142.1346682533.27098.python-list at python.org>
| […]
| X-Mailman-Version: 2.1.15
(apparently posted using a mailreader, gated by python.org's mail software)
So far, so good. But Peter Otten's follow-up to Chris Angelico's posting
has
| References: <lae9489ct99mp704um93sdqlatofb2i8gq at 4ax.com>
| <CAPTjJmpHPE=SdE_XJtdi4DMFVeWa8Exo3Arsu13Hd8fgSuZ5bw at mail.gmail.com>
| […]
| User-Agent: KNode/4.7.3
(posted using a newsreader)
| […]
| Message-ID: <mailman.145.1346683813.27098.python-list at python.org>
As you can see, the Message-ID of Chris' posting does not occur in the
References header field value of Peter's posting. This is caused by
python.org's SMTP-to-NNTP gating program setting its own Message-ID and
ignoring the Message-ID of the server where the message was injected. Therefore,
although it is a followup to Chris' posting, Peter's posting has no
*technical* (metadata) relation to Chris' posting.
Instead, it should have
| References: <lae9489ct99mp704um93sdqlatofb2i8gq at 4ax.com>
| <mailman.142.1346682533.27098.python-list at python.org>
| […]
or, better: Chris' posting should have had the original
| […]
| Message-ID:
| <CAPTjJmpHPE=SdE_XJtdi4DMFVeWa8Exo3Arsu13Hd8fgSuZ5bw at mail.gmail.com>
| […]
(no word-wrap), then the header fields of Peter's posting can stay as they
are.
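The second option can be sketched as follows. This is only an illustration of
the idea (keep an existing Message-ID, generate one only if it is missing); it
is not how python.org's gating software is implemented, and the function name,
the default domain and the sample message are hypothetical.

```python
from email import message_from_string
from email.utils import make_msgid


def gate_to_news(raw_message, newsgroup, domain="gateway.example.invalid"):
    """Prepare a mail message for injection into a newsgroup.

    An existing Message-ID is left untouched, so follow-ups that refer
    to it keep working on both sides of the gateway. Only a message
    without any Message-ID gets a newly generated one.
    """
    msg = message_from_string(raw_message)
    if msg["Message-ID"] is None:
        msg["Message-ID"] = make_msgid(domain=domain)
    if msg["Newsgroups"] is None:
        msg["Newsgroups"] = newsgroup
    return msg


# Hypothetical incoming mail that already carries a Message-ID.
incoming = (
    "Message-ID: <original@mail.example.invalid>\n"
    "Subject: Re: test\n\nbody\n"
)
gated = gate_to_news(incoming, "comp.lang.python")
print(gated["Message-ID"])
```

With such an approach the References header field set by the replying user
agent would point to a Message-ID that actually exists on the newsserver.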
My newsreader (KNode/4.4.11) tries its best to resolve this (short of
threading by Subject and Date, which does not work; see above) which causes
Peter's posting to end up as a follow-up to *Mark's* posting instead
(specified by the only valid Message-ID in the References header). Only
when you read Peter's posting do you realize that it is not a follow-up to
Mark's at all. Confusion ensues.
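The fallback described above can be approximated like this. It is a sketch of
the observable behavior, not KNode's actual code; the function name is made
up, and the Message-IDs are copied as they appear in the quoted header fields
above (including the " at " in place of the at sign).

```python
import re

# One Message-ID in angle brackets; spaces are allowed inside here only
# because the IDs quoted above contain " at ".
MSG_ID = re.compile(r"<[^<>]+>")


def resolve_parent(references, known_ids):
    """Return the last referenced Message-ID that is actually known.

    Walks the References value from the end, so the closest available
    ancestor wins. Returns None if no referenced article is known, in
    which case the posting looks like the start of a new thread.
    """
    for message_id in reversed(MSG_ID.findall(references)):
        if message_id in known_ids:
            return message_id
    return None


mark = "<lae9489ct99mp704um93sdqlatofb2i8gq at 4ax.com>"
chris_gated = "<mailman.142.1346682533.27098.python-list at python.org>"
chris_original = (
    "<CAPTjJmpHPE=SdE_XJtdi4DMFVeWa8Exo3Arsu13Hd8fgSuZ5bw at mail.gmail.com>"
)

# The newsserver only knows the rewritten Message-ID of Chris' posting.
known_ids = {mark, chris_gated}

# References of Peter's posting, as quoted above.
peter_references = mark + "\n " + chris_original

print(resolve_parent(peter_references, known_ids) == mark)
```

The only referenced Message-ID that exists on the newsserver is Mark's, so
that is where Peter's posting is attached.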
There are a lot of similar examples here. As a result of the Message-ID
rewriting, in several cases a follow-up even appears as if it were an
original posting, without any technical (and therefore without any obvious
visual) relation to the thread it actually belongs to at all, even though
the precursor has not expired. For example,
| […]
| X-Original-To: python-list at python.org
| Delivered-To: python-list at mail.python.org
| […]
| In-Reply-To: <50464153.5090402 at gmail.com>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| References: <50464153.5090402 at gmail.com>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| Date: Tue, 4 Sep 2012 14:27:35 -0400
| Subject: Re: python docs search for 'print'
| From: Joel Goldstick <joel.goldstick at gmail.com>
| To: David Hoese <dhoese at gmail.com>
| Content-Type: text/plain; charset=UTF-8
| Cc: python-list at python.org
| […]
| Newsgroups: comp.lang.python
| Message-ID: <mailman.185.1346783257.27098.python-list at python.org>
| […]
|
| On Tue, Sep 4, 2012 at 1:58 PM, David Hoese <dhoese at gmail.com> wrote:
| > […]
There is no message with Message-ID <50464153.5090402 at gmail.com> (at least
not on the newsserver that I use), because that header field value was
overwritten by the borken gating software that python.org uses. The actual
message posted by that software is:
| […]
| X-Original-To: python-list at python.org
| Delivered-To: python-list at mail.python.org
| […]
| Date: Tue, 04 Sep 2012 13:58:43 -0400
| From: David Hoese <dhoese at gmail.com>
| User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7;
| rv:15.0) Gecko/20120824 Thunderbird/15.0
| […]
| To: python-list at python.org
| Subject: python docs search for 'print'
| […]
| Newsgroups: comp.lang.python
| Message-ID: <mailman.184.1346781550.27098.python-list at python.org>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
To further show that this is not a coincidence, and that I am not imagining
things here, the same problems started to occur when some people of the
German-speaking Python mailing list at python.org thought it would be a good
idea to merge that mailing list and the German-speaking newsgroup
de.comp.lang.python not so long ago, using the same software. As a result,
that Python newsgroup is a complete mess now, too.
>> Add to that the abomination that Google Groups has become.
>
> It's always been an abomination,
After they took over the Dejanews archive it was rather OK. You could use
it with the keyboard, lines were at least automatically wrapped at 80
columns (but unfortunately, only when sending, and there was no preview
[AFAIK there still isn't]), they removed postings reported as spam, and so
forth.
> although I understand it is much, much worse now.
Now you cannot even use it with the keyboard, and the postings are not properly
word-wrapped when typing or submitting (resulting in lines of 200 characters
and more). The spam is not removed at all, but only hidden from *Google*
*Groups* users, which causes it to be distributed on Usenet unchecked unless
the closest peers of the Google Groups servers happen to employ a suitable
spam filter, or have at least one dedicated user who runs a killbot.
> Blame Google for that.
I do, and I have UDP'd Google Groups since April for that (except follow-ups
to my postings). However, I am also blaming the people still using it
without complaining sufficiently, because if they did not use it or
complained more often and louder, Google would have to fix it. Unfortunately,
most people do not even know where they are posting to when they access
Usenet via Google Groups, so there is little hope for improvement of the
situation.
But that is another can of worms entirely.
__________
¹ Recent example: <news:k23c3l$ldn$1 at news.albasani.net>
References:
[RFC1730] Crispin, M. "INTERNET MESSAGE ACCESS PROTOCOL - VERSION 4"
(IMAP4). December 1994. <http://tools.ietf.org/html/rfc1730>
[RFC1939] Myers, J. and Rose, M. "Post Office Protocol - Version 3".
May 1996. <http://tools.ietf.org/html/rfc1939>
[RFC3977] Feather, C. "Network News Transfer Protocol (NNTP)".
October 2006. <http://tools.ietf.org/html/rfc3977>
[RFC5321] Klensin, J. "Simple Mail Transfer Protocol" (SMTP).
October 2008. <http://tools.ietf.org/html/rfc5321>
[RFC5322] Resnick, P. (ed.) "Internet Message Format".
October 2008. <http://tools.ietf.org/html/rfc5322>
[RFC5536] Murchison, K., Lindsey, C., and Kohn, D.
"Netnews Article Format". November 2009.
<http://tools.ietf.org/html/rfc5536>
--
PointedEars
Twitter: @PointedEars2
Please do not Cc: me. / Bitte keine Kopien per E-Mail.