[Patches] [ python-Patches-521478 ] mailbox / fromline matching

SourceForge.net noreply at sourceforge.net
Sat Oct 9 23:18:57 CEST 2004


Patches item #521478, was opened at 2002-02-22 09:54
Message generated for change (Comment added) made by bwarsaw
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=521478&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
>Status: Closed
Resolution: Rejected
Priority: 5
Submitted By: Camiel Dobbelaar (camield)
Assigned to: Barry A. Warsaw (bwarsaw)
Summary: mailbox / fromline matching

Initial Comment:
mailbox.py does not parse this 'From' line correctly:
From camield at sentia.nl Mon Apr 23 18:22:28 2001 +0200
                                                   ^^^^^
This is because of the trailing timezone information, 
that the regex does not account for.

Also, 'From' should match at the beginning of the line.
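
To illustrate the report above, here is a minimal sketch of a "From " line check that tolerates a numeric timezone after the year. The pattern and the name is_from_line are assumptions made for illustration, not the regex used in mailbox.py, and the address is a placeholder.

```python
import re

# Hypothetical pattern (NOT the one in mailbox.py): a ctime()-style date
# with an optional numeric timezone after the year.
FROM_LINE = re.compile(
    r"From \s*\S+\s+"                # "From " and the envelope sender
    r"\w{3}\s+\w{3}\s+\d{1,2}\s+"    # weekday, month, day of month
    r"\d{2}:\d{2}(?::\d{2})?\s+"     # hh:mm with optional :ss
    r"\d{4}"                         # year
    r"(?:\s+[+-]\d{4})?\s*$"         # optional numeric timezone, e.g. +0200
)

def is_from_line(line):
    """Return True if line looks like an mbox "From " delimiter."""
    return FROM_LINE.match(line) is not None

print(is_from_line("From user@example.org Mon Apr 23 18:22:28 2001 +0200\n"))
print(is_from_line("From user@example.org Mon Apr 23 18:22:28 2001\n"))
```

Both calls should accept the line; a "From:" header line is rejected because the pattern requires a space directly after "From".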

----------------------------------------------------------------------

>Comment By: Barry A. Warsaw (bwarsaw)
Date: 2004-10-09 17:18

Message:
Logged In: YES 
user_id=12800

I see no follow-up to Martin's comment of 2004-08-18, so I am
closing this.  If you come up with a patch that addresses his
comments, you can re-open it.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2004-08-18 07:36

Message:
Logged In: YES 
user_id=21627

The patch, as it stands, appears to be incorrect. It is
looking for *two* empty lines between messages, whereas
folders conventionally contain only a single empty line; this
is also what Zawinski says.

If that was fixed, I think the patch would be acceptable -
mailbox.py currently does not implement the rule that the
"From " line must be preceded by an empty line.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-03-21 14:31

Message:
Logged In: YES 
user_id=31435

Since the Monday in question happened over 2 years ago, the 
answer to Michael's question is apparently "no" <wink>.  
Barry, we're stretching the conventional meaning of "asap" 
here -- can you close this one way or t'other now?

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-03-16 11:43

Message:
Logged In: YES 
user_id=6656

Anything going to happen here by Monday?

----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2002-03-02 11:47

Message:
Logged In: YES 
user_id=12800

Re-opening and assigning to myself.  I'll take a look at
your patches asap.

----------------------------------------------------------------------

Comment By: Camiel Dobbelaar (camield)
Date: 2002-03-02 09:34

Message:
Logged In: YES 
user_id=466784

PortableUnixMailbox is not that useful, because it only
matches '^From '.  From-quoting is an even bigger mess than
From-headerlines, so that does not really help.

I submit a new diff that matches '\n\nFrom ' or
'<start-of-file>From ', which makes PortableUnixMailbox
useful for my purposes.  It is not as intrusive as the
comment in mailbox.py suggests.
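
The delimiter rule described above can be sketched as follows. This is an illustration only, not the submitted diff: the names DELIMITER and split_messages are hypothetical, and the sketch assumes the whole folder fits in memory as one string.

```python
import re

# Assumption: a delimiter is "From " at the very start of the text, or
# directly after an empty line (i.e. preceded by "\n\n").
DELIMITER = re.compile(r"(?:\A|(?<=\n\n))From ")

def split_messages(text):
    """Split mbox-style text into messages at the delimiters above."""
    starts = [m.start() for m in DELIMITER.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e] for s, e in zip(starts, ends)]

sample = (
    "From a@example.org Mon Apr 23 18:22:28 2001\n"
    "Subject: one\n"
    "\n"
    "body\n"
    "From here on, plain text\n"   # not preceded by an empty line
    "\n"
    "From b@example.org Tue Apr 24 09:00:00 2001\n"
    "Subject: two\n"
    "\n"
    "bye\n"
)
print(len(split_messages(sample)))
```

The body line "From here on, plain text" stays inside the first message because no empty line precedes it. A body paragraph that does begin with "From " after an empty line would still be mistaken for a delimiter, which is why From-quoting matters.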

----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2002-03-01 16:42

Message:
Logged In: YES 
user_id=12800

IMO, Jamie Zawinski (author of the original mail/news reader
in Netscape among other accomplishments), wrote the
definitive answer on From_:

http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html

As far as Python's support for this in the mailbox module,
for backwards compatibility, the UnixMailbox class has a
strict-ish interpretation of the From_ delimiter, which I
think should not change.  It also has a class called
PortableUnixMailbox which recognizes delimiters as specified
in JWZ's document.  Personally, if I was trolling over a
real world mbox file I'd only use PortableUnixMailbox (as
long as non-delimiter From_ lines were properly escaped -- I
have some code in Mailman which tries to intelligently "fix"
non-escaped mbox files).

I agree with the Rejected resolution.
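
For readers unfamiliar with the escaping mentioned above, here is a minimal sketch of the simplest common convention: prefix ">" to any body line that starts with "From ". The function name is hypothetical and this is not the Mailman code referred to; it only shows what "properly escaped" means in the simplest case.

```python
def escape_from_lines(body):
    """Prefix '>' to body lines starting with 'From ' (simple From-quoting)."""
    lines = body.splitlines(keepends=True)
    return "".join(
        ">" + line if line.startswith("From ") else line
        for line in lines
    )

print(escape_from_lines("Hi\nFrom now on, things change.\nBye\n"))
```

Lines that already start with ">From " are left alone in this sketch, so the transformation is not reversible without ambiguity; stricter variants also quote those lines.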

----------------------------------------------------------------------

Comment By: Camiel Dobbelaar (camield)
Date: 2002-03-01 06:34

Message:
Logged In: YES 
user_id=466784

I have tracked this down to Pine, the mailreader. 

In imap/src/c-client/mail.c, it has this flag:
 static int notimezones = NIL;    /* write timezones in "From " header */

(so timezones are written in the "From" lines by default)

I also found the following comment in imap/docs/FAQ in the
Pine distribution:

"""
So, good mail reading software only considers a line to be a
"From " line if it follows the actual specification for a
"From " line. This means, among other things, that the day
of week is fixed-format: "May 14", but "May  7" (note the
extra space) as opposed to "May 7".  ctime() format for the
date is the most common, although POSIX also allows a
numeric timezone after the year.
"""

While I don't consider Pine to be the ultimate mailreader,
its heritage may warrant that the 'From ' lines it creates
are considered 'standard'.
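
The fixed-width day field quoted above can be reproduced with Python's time.asctime(), which emits the ctime() layout. A small sketch follows; the helper name ctime_stamp is hypothetical.

```python
import time
from datetime import datetime

def ctime_stamp(dt):
    """Format a datetime in ctime() layout, e.g. for a "From " line."""
    return time.asctime(dt.timetuple())

# The day of month occupies two columns and is space padded.
print(ctime_stamp(datetime(2001, 5, 14, 18, 22, 28)))
print(ctime_stamp(datetime(2001, 5, 7, 18, 22, 28)))
```

The second line comes out with two spaces between "May" and "7", which is the extra space the FAQ points out.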

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-02-28 17:37

Message:
Logged In: YES 
user_id=6380

That From line is simply illegal, or at least nonstandard.

If your system uses this nonstandard format, you can extend
the mailbox parser by overriding the ._isrealfromline
method.

The pattern doesn't need ^ because match() is used, which
only matches at the start of the line.

Rejected.
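
The two points above (match() is anchored, and the check can be overridden in a subclass) can be sketched like this. StrictFromReader is a hypothetical stand-in written for this example, not the real mailbox.py class, and its pattern is an assumption; only the method name _isrealfromline is taken from the comment above.

```python
import re

class StrictFromReader:
    """Hypothetical stand-in for a mailbox reader (not the mailbox.py class)."""

    # No leading ^ needed: match() only tries the pattern at position 0
    # of the string it is given, and the reader passes one line at a time.
    _pattern = re.compile(
        r"From \S+\s+\w{3}\s+\w{3}\s+\d{1,2}\s+"
        r"\d{2}:\d{2}:\d{2}\s+\d{4}\s*$"
    )

    def _isrealfromline(self, line):
        return self._pattern.match(line) is not None


class TimezoneFromReader(StrictFromReader):
    """Subclass that also accepts a numeric timezone after the year."""

    def _isrealfromline(self, line):
        # Drop a trailing "+hhmm" / "-hhmm", then defer to the strict check.
        stripped = re.sub(r"\s+[+-]\d{4}\s*$", "", line)
        return super()._isrealfromline(stripped)


line_tz = "From user@example.org Mon Apr 23 18:22:28 2001 +0200\n"
print(StrictFromReader()._isrealfromline(line_tz))
print(TimezoneFromReader()._isrealfromline(line_tz))
```

With search() instead of match(), a quoted body line such as ">From user@example.org ..." would also be accepted, which is why the anchoring matters.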


----------------------------------------------------------------------
