How do I decode unicode characters in the subject using email.message_from_string()?
Thorsten Kampe
thorsten at thorstenkampe.de
Wed Feb 25 13:19:35 EST 2009
* Tim Golden (Wed, 25 Feb 2009 17:27:07 +0000)
> Thorsten Kampe wrote:
> > * Gabriel Genellina (Wed, 25 Feb 2009 14:00:16 -0200)
> >> En Wed, 25 Feb 2009 13:40:31 -0200, Thorsten Kampe
[...]
> >>> And I wonder why you would think the header contains Unicode characters
> >>> when it says "us-ascii" ("=?us-ascii?Q?"). I think there is a tendency
> >>> to label everything "Unicode" someone does not understand.
> >> And I wonder why you would think the header does *not* contain Unicode
> >> characters when it says "us-ascii"?.
> >
> > Basically because it didn't contain any Unicode characters (anything
> > outside the ASCII range).
>
> And I imagine that Gabriel's point was -- and my point certainly
> is -- that Unicode includes all the characters *inside* the
> ASCII range.

I know that this was Gabriel's point. And my point was that Gabriel's
point was pointless. If you call any text (or character) "Unicode", then
the word "Unicode" is generalized to the extent that it no longer means
anything at all and becomes a buzzword.

By the same reasoning you could call ASCII a Unicode encoding (which it
isn't) because all ASCII characters are Unicode characters (code
points). Only encodings that cover the full Unicode range can reasonably
be called Unicode encodings.
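
To make the distinction concrete, here is a minimal sketch, assuming
Python 3 (where str holds code points). The helper can_encode and the
sample strings are made up for illustration only. It shows that ASCII
text produces the same bytes under ASCII and UTF-8, while the ASCII
codec cannot represent anything outside its range:

```python
def can_encode(text, codec):
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


ascii_text = "Hello"                # every code point is below 128
non_ascii_text = "Gr\u00fc\u00dfe"  # contains two code points above 127

# For pure ASCII text the ASCII and UTF-8 byte sequences are identical.
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# ASCII covers only a small subset of the code points; UTF-8 and UTF-16
# can encode both sample strings.
results = {
    codec: (can_encode(ascii_text, codec), can_encode(non_ascii_text, codec))
    for codec in ("ascii", "utf-8", "utf-16")
}
```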
The OP just saw some "weird characters" in the email subject and thought
"I know. It looks weird. Must be Unicode". But it wasn't. It was good
ole ASCII - only Quoted Printable encoded.
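
For what it's worth, decoding such a subject could look roughly like
the following. This is only a sketch, assuming Python 3 and the
standard email package; the sample message and the helper name
decode_subject are invented for illustration and are not the OP's
actual data. email.header.decode_header() splits the header into
(value, charset) pairs, which then still have to be decoded with the
charset named in the header:

```python
from email import message_from_string
from email.header import decode_header

# Hypothetical message: the subject is plain ASCII text wrapped in an
# encoded word that uses the Quoted-Printable ("Q") encoding.
RAW_MESSAGE = "Subject: =?us-ascii?Q?Hello=20World?=\n\nbody text\n"


def decode_subject(raw_message):
    """Return the Subject header of raw_message as readable text."""
    msg = message_from_string(raw_message)
    pieces = []
    for value, charset in decode_header(msg["Subject"]):
        # Encoded words come back as bytes plus the charset from the
        # header; unencoded parts may come back as str with charset None.
        if isinstance(value, bytes):
            value = value.decode(charset or "ascii")
        pieces.append(value)
    return "".join(pieces)


subject = decode_subject(RAW_MESSAGE)
```

Every character of the decoded subject is inside the ASCII range,
which is exactly what the "us-ascii" label announces.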
Thorsten