email header decoding fails

Thu Apr 10 04:45:41 EDT 2008

On Apr 10, 4:31 pm, "Gabriel Genellina" <gagsl-... at yahoo.com.ar>
wrote:
> On Wed, 09 Apr 2008 23:12:00 -0300, ZeeGeek <ZeeG... at gmail.com> wrote:
>
> > It seems that the decode_header function in email.Header fails when
> > the string is in the following form,
>
> > '=?gb2312?Q?=D0=C7=C8=FC?=(revised)'
>
> > That's when a non-encoded string follows the encoded string without
> > any whitespace. In this case, the decode_header function treats the whole
> > string as non-encoded. Is there a workaround for this problem?
>
> That header does not comply with RFC 2047 (MIME Part Three: Message Header
> Extensions for Non-ASCII Text):
>
> Section 5 (1)
>      An 'encoded-word' may replace a 'text' token (as defined by RFC 822)
>      in any Subject or Comments header field, any extension message
>      header field, or any MIME body part field for which the field body
>      is defined as '*text'. [...]
>      Ordinary ASCII text and 'encoded-word's may appear together in the
>      same header field.  However, an 'encoded-word' that appears in a
>      header field defined as '*text' MUST be separated from any adjacent
>      'encoded-word' or 'text' by 'linear-white-space'.
>
> Section 5 (3)
>      As a replacement for a 'word' entity within a 'phrase', for example,
>      one that precedes an address in a From, To, or Cc header.  [...]
>      An 'encoded-word' that appears within a
>      'phrase' MUST be separated from any adjacent 'word', 'text' or
>      'special' by 'linear-white-space'.

Thank you very much, Gabriel.
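
Since the header itself is malformed, one possible workaround is to repair
it before decoding: insert the whitespace that RFC 2047 requires around each
encoded-word, then hand the result to decode_header. The following is only a
rough sketch under some assumptions: the helper names separate_encoded_words
and decode_lenient are made up for illustration, the regular expression is a
simplified approximation of the encoded-word grammar, and the import uses the
lowercase module name email.header (the module was called email.Header in
older releases). Newer versions of the email package may already be more
lenient with this input, in which case the repair step is harmless.

```python
import re
from email.header import decode_header

# Simplified pattern for one RFC 2047 encoded-word:
#   =?charset?encoding?encoded-text?=
# (hypothetical approximation, not the exact grammar from the RFC)
_ENCODED_WORD = r'=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?='


def separate_encoded_words(raw):
    """Insert a space between an encoded-word and any text glued to it."""
    # Add a space after an encoded-word directly followed by non-whitespace.
    raw = re.sub(r'(' + _ENCODED_WORD + r')(?=\S)', r'\1 ', raw)
    # Add a space before an encoded-word directly preceded by non-whitespace.
    raw = re.sub(r'(?<=\S)(' + _ENCODED_WORD + r')', r' \1', raw)
    return raw


def decode_lenient(raw):
    """Repair the header, decode it, and join the pieces into one string."""
    pieces = []
    for chunk, charset in decode_header(separate_encoded_words(raw)):
        # decode_header may return bytes (with or without a charset) or str.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or 'ascii', errors='replace')
        pieces.append(chunk)
    return ''.join(pieces)


if __name__ == '__main__':
    subject = '=?gb2312?Q?=D0=C7=C8=FC?=(revised)'
    print(separate_encoded_words(subject))
    print(decode_lenient(subject))
```

Note that the inserted space can show up in the decoded result between the
Chinese text and "(revised)", so this changes the subject slightly compared
with what the sender presumably intended.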