[issue35547] email.parser / email.policy does not correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers

Martijn Pieters report at bugs.python.org
Fri Dec 21 06:34:21 EST 2018


Martijn Pieters <mj at python.org> added the comment:

While RFC2047 clearly states that an encoder MUST NOT split multi-byte encodings in the middle of a character (section 5, "Each 'encoded-word' MUST represent an integral number of characters. A multi-octet character may not be split across adjacent 'encoded-word's."), it also states that to fit length restrictions, CRLF SPACE is used as a delimiter between encoded words (section 2, "If it is desirable to encode more text than will fit in an 'encoded-word' of 75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may be used."). In section 6.2 it states:

   When displaying a particular header field that contains multiple
   'encoded-word's, any 'linear-white-space' that separates a pair of
   adjacent 'encoded-word's is ignored.  (This is to allow the use of
   multiple 'encoded-word's to represent long strings of unencoded text,
   without having to separate 'encoded-word's where spaces occur in the
   unencoded text.)

(linear-white-space is the RFC822 term for foldable whitespace).

The parser is leaving the spaces between two adjacent encoded-word tokens in place, where it must remove them instead. It already removes them correctly for unstructured headers, just not in get_bare_quoted_string, get_atom and get_dot_atom.
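
A minimal sketch of how the difference could be observed, assuming the default policy and made-up header values (the 'foo'/'bar' payloads and user@example.com are illustrative only, not taken from a real message). The same pair of folded encoded-words is placed once in an unstructured header and once in the quoted display name of an address header. The second result depends on the Python version in use, so no expected output is claimed for it:

```python
from email import message_from_string
from email.policy import default

# Two adjacent encoded-words separated by a fold (newline + SPACE).
# Illustrative payload; per RFC2047 section 6.2 it should display as "foobar".
FOLDED = "=?utf-8?q?foo?=\n =?utf-8?q?bar?="


def parse(raw_header):
    """Parse a single header line into a message using the default policy."""
    return message_from_string(raw_header + "\n\n", policy=default)


# Unstructured header: whitespace between the encoded-words is dropped.
subject = parse("Subject: " + FOLDED)["Subject"]

# Structured header: the same tokens inside a quoted display name,
# which goes through get_bare_quoted_string.
to = parse('To: "' + FOLDED + '" <user@example.com>')["To"]
display_name = to.addresses[0].display_name

print(repr(str(subject)))
print(repr(display_name))  # a space between the words indicates the bug
```

If the two printed values differ, the structured-header code path is keeping the folding whitespace that the unstructured path discards.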

Then there is Postel's law (*be liberal in what you accept from others*), which the email package already applies to RFC2047 elsewhere: RFC2047 also states that "An 'encoded-word' MUST NOT appear within a 'quoted-string'.", yet email._header_value_parser's handling of quoted-string will process EW sections.
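
As a rough illustration of that leniency, the sketch below parses an address header whose quoted display name is a single encoded-word. The header value is invented for the example; the parser is expected to decode it anyway and to record any objections as defects on the header rather than rejecting it:

```python
from email import message_from_string
from email.policy import default

# An encoded-word inside a quoted-string, which RFC2047 forbids.
# The name and address are made up for illustration.
raw = 'To: "=?utf-8?q?caf=C3=A9?=" <user@example.com>\n\n'

msg = message_from_string(raw, policy=default)
to = msg["To"]

# The display name is decoded despite the quoting.
quoted_name = to.addresses[0].display_name
print(ascii(quoted_name))

# Problems the parser noticed are collected here instead of raising.
for defect in to.defects:
    print(type(defect).__name__, defect)
```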

----------
title: email.parser / email.policy does correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers -> email.parser / email.policy does not correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35547>
_______________________________________

