ASCII and Unicode
rusi
rustompmody at gmail.com
Sun Dec 8 12:39:47 EST 2013
On Sunday, December 8, 2013 10:52:34 PM UTC+5:30, Steven D'Aprano wrote:
> On Sat, 07 Dec 2013 17:05:34 +0100, giacomo boffi wrote:
> > Steven D'Aprano writes:
> >> Ironically, your post was not Unicode. [...] Your post was sent using
> >> a legacy encoding, Windows-1252, also known as CP-1252
> > i access rusi's post using a NNTP server, and in his post i see
> > Content-Type: text/plain; charset=UTF-8
> But *which post* are you looking at?
> I have just looked at three posts from him:
> Rusi's original post, where he used the ellipsis characters:
> Subject: Re: Managing Google Groups headaches
> Date: Thu, 5 Dec 2013 23:13:54 -0800 (PST)
> Content-Type: text/plain; charset=windows-1252
> Then his reply to me:
> Subject: Re: ASCII and Unicode [was Re: Managing Google Groups headaches]
> Date: Fri, 6 Dec 2013 18:33:39 -0800 (PST)
> Content-Type: text/plain; charset=UTF-8
> And finally, his reply to you:
> Subject: Re: ASCII and Unicode
> Date: Sun, 8 Dec 2013 08:41:10 -0800 (PST)
> Content-Type: text/plain; charset=ISO-8859-1
> It seems to me that whatever client he is using to post (I believe it is
> Google Groups web interface?) varies the encoding depending on what
> characters are included in his post.
> > is it possible that what you see is an artifact of the gateway?
> I doubt it. Unfortunately the email mailing list archive doesn't display
> all the email headers, but for the record here is his original post as
> seen by the email mailing list:
> https://mail.python.org/pipermail/python-list/2013-December/661782.html
> If you view source, you'll see that Mailman (the mailing list software)
> sets the webpage encoding to US-ASCII and encodes the ellipses to &#8230;,
> which is a perfectly reasonable thing for a web page to do. So we can be
> confident that when Mailman saw Rusi's post, it was able to correctly
> decode the message and see ellipses.
> Although I think that (probably) Google Groups is being stupid by varying
> the charset (why not just use UTF-8 always?), at least it is setting the
> charset correctly.
I think GG is being sweet and affectionate and delectable enough that a
💩 in the footer will keep it stuck at UTF-8, you think? :-)
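
For what it's worth, the behaviour Steven describes can be sketched in a few lines of Python. This is only a guess at the kind of logic involved, not Google Groups' actual code: the candidate order (ISO-8859-1, then windows-1252, then UTF-8) is an assumption inferred from the three headers quoted above, and the names pick_charset and ascii_with_char_refs are made up for illustration. The second function shows the numeric character reference trick that the Mailman archive page appears to use for non-ASCII characters.

```python
# Hypothetical sketch: pick the "narrowest" charset that can encode a message.
# The candidate order is an assumption based on the headers quoted in this
# thread; it is not taken from any real Google Groups source.
CANDIDATES = ("iso-8859-1", "windows-1252", "utf-8")


def pick_charset(text):
    """Return the first candidate charset that can encode all of text."""
    for charset in CANDIDATES:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # some character does not fit, try the next charset
        return charset
    # Not reached for ordinary text, since UTF-8 covers every code point
    # apart from lone surrogates.
    raise ValueError("no candidate charset can encode this text")


def ascii_with_char_refs(text):
    """Encode text as US-ASCII, replacing anything else with &#NNNN; refs."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    for sample in ("plain text", "ellipsis\u2026", "footer \U0001F4A9"):
        print(repr(sample), "->", pick_charset(sample))
    print(ascii_with_char_refs("ellipsis\u2026"))
```

Under these assumptions, a message containing only Latin-1 characters gets ISO-8859-1, one with an ellipsis (U+2026) falls through to windows-1252, and anything outside both, such as the footer character above, ends up as UTF-8.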