[Email-SIG] fixing the current email module

Sun Oct 11 02:47:39 CEST 2009

Glenn Linderman writes:
 > On approximately 10/10/2009 8:40 AM, came the following characters from 
 > the keyboard of Stephen J. Turnbull:

 > > So why are we discussing this?  We don't even know what our mainline
 > > APIs are going to look like, why are we discussing forcibly operating
 > > on broken input?
 > 
 > Use case generation.  If the only way to access header values is to 
 > successfully, fully, decode them, then some uses may be rendered 
 > impossible, or at least difficult, even by choice of APIs.

Since invertibility is a requirement, "successfully fully decoding" a
header field is not a prerequisite to accessing it.

The question of "what should we do about broken mail" at this point
has three components:

(1) To what level do we (i.e., the email module) promise to parse
    conforming wire format into useful objects?

(2) For nonconforming input, when is it OK to raise an error and
    return to the calling client rather than handle it ourselves?

(3) What is the API for accessing and/or mutating unparsed data, and
    requesting a reparse?

I don't think we should go any further than that.

 > > "Re" is a Latin abbreviation; there is no appropriate translation. ;-)
 > >   
 > 
 > Nonetheless, I have seen both Re: and Fwd: translated to other languages 
 > (besides Latin or geek) :)

Sure.  This is an aspect of question (1): is this the responsibility
of the email module?

 > > Maybe they are, but the email module doesn't know or care about what
 > > they do.  Let's stick within what the email module is supposed to
 > > handle
 > 
 > Yep, this is just use case exploration.

But since by definition this is broken input, discussing what
applications are going to want to do with it is inappropriate, IMO.
We don't care if the app is going to prefix, suffix, or crucifix it.
We need to specify

(a) what object will hold the raw data we couldn't handle
(b) how a calling client can retrieve the raw data
(c) how the client can replace (or more generally mutate) that data
(d) how the client can request a reparse from us if it repaired the
    breakage at a low level rather than parsing the data itself
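
To make (a)-(d) concrete, here is a minimal sketch in Python.  None of
this is a proposed API: the class name RawField, its methods, and the
toy "parser" inside reparse() are all hypothetical, invented only to
show the shape of the four operations.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RawField:
    """Hypothetical holder for a header field we could not parse."""

    raw: bytes                                # (a) bytes exactly as received
    parsed: Optional[Tuple[str, str]] = None  # (name, value) after a good parse

    def get_raw(self) -> bytes:
        """(b) Let the client retrieve the raw data."""
        return self.raw

    def set_raw(self, data: bytes) -> None:
        """(c) Let the client replace the raw data; old parse is discarded."""
        self.raw = data
        self.parsed = None

    def reparse(self) -> bool:
        """(d) Retry the parse; a toy stand-in for a real field parser."""
        name, sep, value = self.raw.partition(b":")
        try:
            if not sep or not name or b" " in name:
                raise ValueError("no usable field name")
            self.parsed = (name.decode("ascii"),
                           value.strip().decode("ascii"))
        except ValueError:  # also covers UnicodeDecodeError
            self.parsed = None
        return self.parsed is not None


# The client, not the email module, decides how to repair the bytes.
field = RawField(b"Subject Re: hello\r\n")    # broken: no colon after name
if not field.reparse():
    repaired = field.get_raw().replace(b"Subject ", b"Subject: ", 1)
    field.set_raw(repaired)
    field.reparse()
```

The point of the sketch is the division of labor: the repair itself is
one line of raw Python on bytes, and the only things the module has to
supply are the holder, the accessors, and the reparse request.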

Manipulations of text or bytes are in principle not the responsibility
of the email module IMO; that will be done *by* the client *using* raw
Python, not methods provided by email.  I don't see how discussion of
*what* manipulations can be done with one hand up our nose is anything
but useless bikeshedding.

If we decide that the email module can usefully provide sufficiently
general facilities that would be convenient and hard to implement by
general client programmers (e.g., the Mailman developers' collective
wisdom about foreign equivalents for "re" and "fwd" is surely greater
than that of the average American programmer), we will do it by
calling low-level methods to get and put the data, and raw Python to
manipulate it as text or bytes.
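
As an illustration of what such a convenience facility might look like
when it is built from raw Python alone, here is a sketch that strips
reply and forward prefixes from a subject string.  The function name
and the prefix lists are assumptions made up for this example; the
lists are a small illustrative sample, not Mailman's actual knowledge
base, and a real facility would need a far more complete one.

```python
import re

# Illustrative samples only (assumed, not taken from Mailman):
# "AW" and "SV" are reply markers, "WG" a forward marker, seen in
# some non-English mail clients.
REPLY_PREFIXES = ("re", "aw", "sv")
FORWARD_PREFIXES = ("fwd", "fw", "wg")

# One or more "<prefix>:" groups at the start of the string,
# matched case-insensitively.
_PREFIX_RE = re.compile(
    r"^\s*(?:(?:%s)\s*:\s*)+" % "|".join(REPLY_PREFIXES + FORWARD_PREFIXES),
    re.IGNORECASE,
)


def strip_subject_prefixes(subject: str) -> str:
    """Return the subject with any leading reply/forward markers removed."""
    return _PREFIX_RE.sub("", subject)
```

Note that it operates on an ordinary str that the client has already
obtained; nothing here needs to know how the header was parsed or
whether the original wire format was conforming.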