[Python-Dev] Encoding detection in the standard library?

Stephen J. Turnbull stephen at xemacs.org
Wed Apr 23 06:44:20 CEST 2008


Bill Janssen writes:

 > Internet-compliant email actually has well-specified mechanisms for
 > including encoding information; see RFCs 2047 and 2231.  There's no
 > need to guess; you can just look.

You must be very special to get only compliant email.

About half my colleagues use RFC 2047 to encode Japanese file names in
MIME attachments (a MUST NOT behavior according to RFC 2047), and a
significant fraction of the rest end up with binary Shift JIS or EUC
or MacRoman in there.

And those are just the most widespread violations I can think of off
the top of my head.
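
For the file name case, the mechanism the standards intend is the RFC
2231 extended parameter syntax rather than an RFC 2047 encoded-word.
A minimal sketch in Python, assuming the standard library's
email.utils.encode_rfc2231; the file name is a made-up example, not
one taken from real mail:

```python
from email.utils import encode_rfc2231
from urllib.parse import unquote

# Hypothetical attachment name, chosen only for illustration.
filename = "日本語.txt"

# RFC 2231 form: charset'language'percent-encoded-bytes
encoded = encode_rfc2231(filename, charset="utf-8")

# The value goes into an extended parameter (note the trailing "*").
header = "Content-Disposition: attachment; filename*=" + encoded

# Splitting on the two apostrophes recovers the charset and the payload.
charset, language, payload = encoded.split("'", 2)
decoded = unquote(payload, encoding=charset)
```

A receiver that only understands this form will of course still
choke on the encoded-word and raw 8-bit variants described above.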

Not to mention that I find this:

    =?X-UNKNOWN?Q?Martin_v=2E_L=F6wis?= <martin at v.loewis.de>,

in the header I got from you.  (I'm not ragging on you, I get Martin's
name wrong a significant portion of the time myself. :-( )
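
To make the problem concrete: the encoded-word above is syntactically
valid, but its charset label tells the reader nothing, so "just
looking" is not enough. The following is a rough sketch, assuming
Python 3's email.header.decode_header; the decode_lenient helper and
its fallback order are hypothetical, and the Latin-1 guess happens to
be right here only because the byte 0xF6 is "ö" in that charset:

```python
from email.header import decode_header

# Hypothetical guess order; ISO-8859-1 accepts any byte sequence,
# so it must come last.
FALLBACKS = ("utf-8", "iso-8859-1")


def decode_lenient(raw):
    """Decode an RFC 2047 header value, guessing when the label is useless."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, str):
            # Headers with no encoded-words come back as plain str.
            parts.append(chunk)
            continue
        candidates = ([charset] if charset else []) + list(FALLBACKS)
        for candidate in candidates:
            try:
                parts.append(chunk.decode(candidate))
                break
            except (LookupError, UnicodeDecodeError):
                # Unknown charset name, or bytes invalid in that charset.
                continue
        else:
            # Defensive only: not reached while ISO-8859-1 is a fallback.
            parts.append(chunk.decode("ascii", "replace"))
    return "".join(parts)


word = "=?X-UNKNOWN?Q?Martin_v=2E_L=F6wis?="
raw_parts = decode_header(word)   # bytes plus the useless charset label
name = decode_lenient(word)       # falls through to the ISO-8859-1 guess
```

The same guess would silently produce garbage for Shift JIS or EUC
bytes, which is exactly where some form of detection would have to
take over.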

