Assignment Versus Equality

Random832 random832 at fastmail.com
Tue Jun 28 10:13:17 EDT 2016


On Tue, Jun 28, 2016, at 00:31, Rustom Mody wrote:
> GG downgrades posts containing unicode if it can, thereby increasing
> reach to recipients with unicode-broken clients

That'd be entirely reasonable, except for the excessively broad
application of "if it can".

Certainly it _can_ do it all the time. Just replace anything that
doesn't fit with question marks or hex notation or \N{NAME} or some
human-readable pseudo-representation a la unidecode. It could have done
any of those with the Hindi that you threw in to try to confound it (or
it could have chosen ISCII, which likewise lacks arrow characters, as
the encoding to downgrade to).
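
To make the "it can always do it" point concrete, here is a small sketch
using Python's built-in codec error handlers (the sample string is my own,
and I'm not claiming this is what GG does internally). Each handler yields
a lossy ASCII pseudo-representation of the arrow:

```python
# Sample text containing U+2192 RIGHTWARDS ARROW (an illustrative string).
text = "x \u2192 y"

# Question marks: the character is simply lost.
question_marks = text.encode("ascii", errors="replace")

# Hex notation: a \uXXXX escape in place of the character.
hex_escapes = text.encode("ascii", errors="backslashreplace")

# \N{NAME} notation (this handler needs Python 3.5 or later).
named = text.encode("ascii", errors="namereplace")

print(question_marks)
print(hex_escapes)
print(named)
```

All three always succeed, which is exactly why "if it can" is not a useful
test on its own.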

It should pick an encoding which it expects recipients to support and
which contains *all* of the characters in the message, as proper
characters and not as pseudo-representations, and downgrade to that if
and only if such an encoding can be found. For most messages, it can use
US-ASCII. For most of the remainder it can use some ISO-8859 or
Windows-125x encoding.
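
A minimal sketch of that rule, assuming a hypothetical candidate list
(the names `CANDIDATES` and `pick_charset`, and which legacy encodings a
recipient supports, are my assumptions). It downgrades only when a
charset covers every character, and otherwise leaves the message as UTF-8:

```python
# Hypothetical list of charsets we expect recipients to support,
# ordered from most to least widely supported.
CANDIDATES = [
    "us-ascii",
    "iso-8859-1",
    "windows-1252",
    "iso-8859-5",
    "windows-1251",
]


def pick_charset(text, candidates=CANDIDATES, fallback="utf-8"):
    """Return the first candidate that encodes *all* of text, else fallback."""
    for charset in candidates:
        try:
            # Strict encoding: any unrepresentable character raises.
            text.encode(charset)
        except UnicodeEncodeError:
            continue
        return charset
    # No candidate holds every character, so do not downgrade.
    return fallback


for sample in ["plain text", "caf\u00e9", "\u20ac5", "x \u2192 y"]:
    print(repr(sample), "->", pick_charset(sample))
```

The important part is the strict `encode` call: no error handler, so a
charset is chosen only when nothing has to be substituted.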

Or it could include both the UTF-8 text and a version in some other
character set as multipart/alternative representations.
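
For illustration, a sketch of the multipart/alternative idea with Python's
standard `email` package. The sample text and the choice of KOI8-R as the
"other" character set are assumptions of mine; the UTF-8 part goes last
because the last alternative is the preferred one:

```python
from email.message import EmailMessage

# Illustrative Russian text that both KOI8-R and UTF-8 can represent.
text = "\u043f\u0440\u0438\u0432\u0435\u0442"

msg = EmailMessage()
# First (least preferred) part: the legacy character set.
msg.set_content(text, charset="koi8-r")
# Second (preferred) part: UTF-8. Adding an alternative converts the
# message into multipart/alternative.
msg.add_alternative(text, subtype="plain", charset="utf-8")

parts = list(msg.iter_parts())
for part in parts:
    print(part.get_content_type(), part.get_content_charset())
```

A client that understands UTF-8 shows the last part; an older one can fall
back to the first.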
