smtplib MIMEText charset weirdness

Terry Reedy tjreedy at udel.edu
Tue Feb 26 14:46:14 EST 2013


On 2/25/2013 11:00 PM, Adam W. wrote:
> Can someone explain to me why I can't set the charset after the fact.

The email package was revised to v.6 for Python 3.3, so the immediate 
answer to both of your 'why' questions is that email had not been 
revised yet.

> text = MIMEText('❤¥'.encode('utf-8'), 'html')

In 3.3 this fails immediately with
AttributeError: 'bytes' object has no attribute 'encode'
because when _charset is not given, MIMEText.__init__ test-encodes the 
text to discover what the charset should be:
         if _charset is None:
             try:
                 _text.encode('us-ascii')
                 _charset = 'us-ascii'
             except UnicodeEncodeError:
                 _charset = 'utf-8'
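
A minimal sketch of that check, for illustration only. The helper name 
guess_charset is hypothetical (it is not part of the email package); it 
just repeats the logic quoted above. The second half assumes a Python 
version whose MIMEText.__init__ still contains that check, in which case 
passing bytes without a charset raises AttributeError.

```python
from email.mime.text import MIMEText


def guess_charset(text):
    """Hypothetical helper repeating the default-charset check quoted above."""
    try:
        text.encode('us-ascii')   # succeeds only for pure-ASCII str
        return 'us-ascii'
    except UnicodeEncodeError:
        return 'utf-8'            # anything non-ASCII falls back to utf-8


ascii_charset = guess_charset('hello')
utf8_charset = guess_charset('❤¥')

# bytes has no .encode(), so the test-encode itself fails when no
# charset is passed along with the bytes.
try:
    MIMEText('❤¥'.encode('utf-8'), 'html')
    bytes_error = None
except AttributeError as exc:
    bytes_error = exc
```

Passing a str instead of bytes sidesteps the problem, since the check 
can then pick the charset itself.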

> text = MIMEText('❤¥'.encode('utf-8'), 'html', 'utf-8')

If one provides bytes, one must also provide the charset, and MIMEText 
assumes you are not lying about it.

> text.as_string()
> Content-Type: text/html; charset="utf-8"
> MIME-Version: 1.0
> Content-Transfer-Encoding: base64
>
> 4p2kwqU=

> Side question:
> text = MIMEText('❤¥', 'html')
> text.set_charset('utf-8')

In 3.3 this call is redundant, because MIMEText has already chosen 
utf-8 for non-ASCII text. The set_charset method is inherited from 
Message and appears pretty useless for the subclass.

> text.as_string()
> 'MIME-Version: 1.0\nContent-Transfer-Encoding: 8bit\nContent-Type:
> text/html;charset="utf-8"\n\n❤¥'
>
> Why is it now 8-bit encoding?

That was a bug, fixed in 3.3. The output is now the same as above 
(base64). Use 3.3 for email unless you cannot because other dependencies 
are not yet available for it.
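
As a rough check of the fixed behaviour, assuming Python 3.3 or later: 
pass a str, let MIMEText choose the charset, and inspect the result. The 
variable names here are only illustrative.

```python
from email.mime.text import MIMEText

# Non-ASCII str, no explicit charset: MIMEText falls back to utf-8.
msg = MIMEText('❤¥', 'html')

charset = msg.get_content_charset()
cte = msg['Content-Transfer-Encoding']

# decode=True undoes the transfer encoding and returns the raw bytes.
decoded = msg.get_payload(decode=True).decode(charset)

print(msg.as_string())
```

No set_charset call is needed; the charset and transfer encoding are 
already set by the constructor.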

-- 
Terry Jan Reedy




