[Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client

Random832 random832 at fastmail.com
Thu Jan 7 11:59:23 EST 2016


On Thu, Jan 7, 2016, at 07:59, Steven D'Aprano wrote:
> On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:
> 
> > It makes sense, but I disagree with the suggestion. Having "Latin-1 or
> > UTF-8" as the effective default encoding is not a good idea, IMO;
> 
> I'm curious what your reasoning is. That seems to be fairly common 
> behaviour with some email clients; for example, I seem to recall that 
> Thunderbird will try encoding emails as US-ASCII, if that fails, 
> Latin-1, and only send UTF-8 if the other two don't work.

Sure, but it includes a content-type header with a charset parameter.

I think the behavior of encoding text but not including a charset
parameter is fundamentally broken. If the user supplies a charset
parameter, it should try to use the matching encoding, otherwise it
should pick an encoding (whether that is "always UTF-8" or some other
rule) and add the charset parameter.
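
To make that concrete, here is a rough sketch of the rule I have in mind.
The name encode_body and its default_charset argument are made up for
illustration; nothing like this exists in http.client today. It only uses
email.message.Message to parse the charset parameter out of the header
value, and it assumes a text/* content type.

```python
from email.message import Message


def encode_body(text, content_type="text/plain", default_charset="utf-8"):
    """Hypothetical helper: return (encoded_body, content_type_header).

    If the caller's Content-Type already names a charset, encode with it.
    Otherwise pick default_charset and append it to the header value.
    """
    # Borrow the email package's parser to read the charset parameter.
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lower-cased name, or None if absent

    if charset is None:
        # No charset supplied: choose one and say so in the header.
        charset = default_charset
        content_type = "%s; charset=%s" % (content_type, charset)

    # A mismatch between the text and a caller-supplied charset raises
    # UnicodeEncodeError here instead of silently switching encodings.
    return text.encode(charset), content_type
```

So encode_body("caf\u00e9") would send UTF-8 bytes and label them as
such, while a caller who passes "text/plain; charset=ISO-8859-1" gets
Latin-1 bytes and the header left alone. Either way the receiver never
has to guess.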

