[Python-3000] should rfc822 accept text io or binary io?

Stephen J. Turnbull stephen at xemacs.org
Tue Aug 7 20:49:53 CEST 2007


Guido van Rossum writes:

 > Bizarre. I'm not aware of any HTTP header that requires *binary*
 > values. I can imagine though that they may contain *encoded* text and
 > that they are leaving the encoding up to separate negotiations between
 > client and server, or another header, or specified explicitly by the
 > header, etc. It can't be pure binary because it's still subject to the
 > \r\n line terminator.

I assume that the relevant explanation is from RFC 2616, sec 2.2
<ftp://ftp.rfc-editor.org/in-notes/rfc2616.txt>:

   The TEXT rule is only used for descriptive field contents and values
   that are not intended to be interpreted by the message parser. Words
   of *TEXT MAY contain characters from character sets other than ISO-
   8859-1 [22] only when encoded according to the rules of RFC 2047
   [14].

       TEXT           = <any OCTET except CTLs, but including LWS>

   A CRLF is allowed in the definition of TEXT only as part of a header
   field continuation. It is expected that the folding LWS will be
   replaced with a single SP before interpretation of the TEXT value.
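
To make the quoted rule concrete, here is a minimal sketch of what "replace folding LWS with a single SP, then interpret" could look like if the parser is handed bytes. The helper name unfold_text is hypothetical (it is not part of rfc822 or any stdlib module), and decoding as ISO-8859-1 is an assumption that follows the default the RFC names for TEXT; RFC 2047 encoded-words are not handled here.

```python
import re

# Hypothetical helper for illustration only; not an rfc822 API.
# RFC 2616 sec. 2.2: LWS = [CRLF] 1*( SP | HT )
_LWS = re.compile(rb"(?:\r\n)?[ \t]+")

def unfold_text(raw: bytes) -> str:
    """Replace each LWS by a single SP, then decode the octets as ISO-8859-1."""
    unfolded = _LWS.sub(b" ", raw)
    # Assumption: no RFC 2047 encoded-words, so ISO-8859-1 applies.
    return unfolded.decode("iso-8859-1")

folded = b"attachment;\r\n\tfilename=caf\xe9.txt"
print(unfold_text(folded))  # attachment; filename=café.txt
```

The point of the sketch is that the unfolding step works on octets, and the choice of a text encoding only enters at the very end.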

Many parsed fields are made up of tokens, whose components are a
subset of CHAR, which is defined as any US-ASCII character, octets
0 - 127 (also sec. 2.2).  That means the ASCII coded character set,
not merely its repertoire (an EBCDIC encoding of the ASCII repertoire
won't do).  Other parsed fields contain special data, such as dates,
written with some subset of ASCII.
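
As a rough illustration of why the coded character set matters and not just the repertoire, the following sketch checks a byte string against the token rule of RFC 2616 sec. 2.2 (any CHAR except CTLs or separators). The name is_token is hypothetical, and cp037 is used only as one example of an EBCDIC codec that Python ships.

```python
# Separators from RFC 2616 sec. 2.2, as octet values.
SEPARATORS = frozenset(b'()<>@,;:\\"/[]?={} \t')

def is_token(raw: bytes) -> bool:
    """True if raw is 1*<any CHAR except CTLs or separators>."""
    # CHAR is octets 0-127; CTLs are octets 0-31 and DEL (127).
    return len(raw) > 0 and all(
        31 < octet < 127 and octet not in SEPARATORS for octet in raw
    )

print(is_token(b"Content-Type"))                # True
print(is_token(b"text/html"))                   # False: "/" is a separator
print(is_token("Content-Type".encode("cp037"))) # False: EBCDIC octets are not CHAR
```

The same characters pass or fail depending solely on which octets represent them, which is an argument for doing this level of parsing on bytes.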

