To unicode or not to unicode
Thorsten Kampe
thorsten at thorstenkampe.de
Sat Feb 21 20:25:54 EST 2009
* Ross Ridge (Sat, 21 Feb 2009 19:39:42 -0500)
> Thorsten Kampe <thorsten at thorstenkampe.de> wrote:
> >That's right. As long as you use pure ASCII you can skip this nasty step
> >of informing other people which charset you are using. If you do use
> >non-ASCII then you have to do that. That's the way virtually all newsreaders
> >work. It has nothing to do with some 21+ year old RFC. Even your Google
> >Groups "newsreader" does that ('content="text/html; charset=UTF-8"').
>
> No, the original post demonstrates you don't have to include MIME headers for
> ISO 8859-1 text to be properly displayed by many newsreaders.
*sigh* As you still refuse to read the article[1], I'm going to quote it
here:
'The Single Most Important Fact About Encodings
If you completely forget everything I just explained, please remember
one extremely important fact. It does not make sense to have a string
without knowing what encoding it uses.
[...]
If you have a string [...] in an email message, you have to know what
encoding it is in or you cannot interpret it or display it to users
correctly.
Almost every [...] "she can't read my emails when I use accents" problem
comes down to one naive programmer who didn't understand the simple fact
that if you don't tell me whether a particular string is encoded using
UTF-8 or ASCII or ISO 8859-1 (Latin 1) or Windows 1252 (Western
European), you simply cannot display it correctly [...]. There are over
a hundred encodings and above code point 127, all bets are off.'
Enough said.
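To make the quoted point concrete, here is a minimal Python 3 sketch. The
sample bytes and the helper name try_decode are made up for illustration;
they are not from the article or from the original post. It decodes one and
the same byte string under several candidate encodings:

```python
# Hypothetical sample: "café €5" as it would be encoded in Windows-1252.
raw = b"caf\xe9 \x805"


def try_decode(data: bytes, encoding: str) -> str:
    """Decode data with the given encoding, or describe why that fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"<undecodable as {encoding}: {exc.reason}>"


# Same bytes, four guesses. Nothing in the bytes says which guess is right.
for name in ("ascii", "utf-8", "iso-8859-1", "cp1252"):
    print(f"{name:>10}: {try_decode(raw, name)!r}")
```

ASCII and UTF-8 reject the bytes outright. ISO 8859-1 and Windows-1252 both
accept them but disagree about byte 0x80: a control character in one, the
euro sign in the other. Without a declaration the reader can only guess.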
> The fact that your obscure newsreader didn't display it properly
> doesn't mean that original poster's newsreader is broken.
You don't even know if my "obscure newsreader" displayed it properly.
Non-ASCII text without a declared encoding is just a bunch of bytes.
It's not even text.
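For comparison, a rough sketch of what declaring the charset looks like,
using the email package from the Python 3 standard library. The subject and
body text are invented for the example, and a real newsreader may well
handle this differently:

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Sender side: the charset is declared in the Content-Type header.
msg = EmailMessage()
msg["Subject"] = "charset demo"
msg.set_content("Grüße aus Köln", charset="iso-8859-1")
wire = msg.as_bytes()  # what actually travels: headers plus encoded body

# Receiver side: read the declared charset instead of guessing.
received = message_from_bytes(wire, policy=policy.default)
declared = received.get_content_charset()
text = received.get_content()  # body decoded using the declared charset

print(received["Content-Type"])
print(declared, repr(text))
```

Because the receiver takes the charset from the header, the body decodes the
same way no matter what the local default encoding happens to be.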
T.
[1] http://www.joelonsoftware.com/articles/Unicode.html