Everything you did not want to know about Unicode in Python 3
Skip Montanaro
skip at pobox.com
Tue May 13 10:02:29 EDT 2014
On Tue, May 13, 2014 at 3:38 AM, Chris Angelico <rosuav at gmail.com> wrote:
>> Python 2's ambiguity allows me not to answer the tough philosophical
>> questions. I'm not saying it's necessarily a good thing, but it has its
>> benefits.
>
> It's not a good thing. It means that you have the convenience of
> pretending there's no problem, which means you don't notice trouble
> until something happens... and then, in all probability, your app is
> in production and you have no idea why stuff went wrong.
BITD, when I still maintained and developed Musi-Cal (an early online
concert calendar, long since gone), I faced a challenge when I first
started encountering non-ASCII band names and cities. I resisted UTF-8.
After all, if I printed a string containing an "é", it came out looking like
[inline image: e.png]
What kind of mess was that???
I tried to ignore it, or assume Latin-1 would cover all the bases (my first
non-ASCII inputs tended to come from Western Europe). If nothing else, at
least "é" was legible.
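The inline image is not preserved in the archive, so the following is only an assumption about the kind of output involved: a small Python 3 sketch of what typically happens when the UTF-8 bytes for "é" are interpreted as Latin-1. The variable names are illustrative.

```python
# Assumption: the "mess" was UTF-8 bytes being displayed as if they were Latin-1.
text = "é"                                # one code point, U+00E9

utf8_bytes = text.encode("utf-8")         # two bytes: 0xC3 0xA9
mojibake = utf8_bytes.decode("latin-1")   # each byte becomes its own character

print(utf8_bytes)                         # the raw bytes
print(len(text), len(mojibake))           # one character in, two characters out
print(ascii(mojibake))                    # escaped form, safe on any terminal
```

Latin-1 maps every byte to a character, so the wrong decode never raises an error; it silently produces two characters where there should be one.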
Needless to say, those approaches didn't work well. After perhaps six
months or a year, I broke down and started converting everything coming in
or going out to UTF-8 at the boundaries of my system (making educated
guesses at input encodings if necessary). My life got a whole lot easier
after that. The distinction between bytes and text didn't really matter
much, certainly not compared to the mess I had before where strings of
unknown data leaked into my system and its database.
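For anyone wanting to try the same approach today, here is a minimal Python 3 sketch of "convert at the boundaries". It is not the Musi-Cal code; the function names `to_text` and `to_wire` and the candidate list are hypothetical, and the "educated guess" is simply trying UTF-8 first and falling back to Latin-1.

```python
def to_text(raw, candidates=("utf-8", "latin-1")):
    """Decode incoming bytes, trying each candidate encoding in order.

    Hypothetical helper: the candidate order is the "educated guess".
    UTF-8 goes first because it rejects most non-UTF-8 input, while
    Latin-1 accepts any byte sequence and so must come last.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Only reached when no candidate accepts the input (for example,
    # when Latin-1 is not in the list): keep going, but mark the damage.
    return raw.decode("utf-8", errors="replace")


def to_wire(text):
    """Encode outgoing text as UTF-8, the single encoding used on output."""
    return text.encode("utf-8")


# Input that arrives as UTF-8 and input that arrives as Latin-1 both
# become the same text inside the system.
from_utf8 = to_text(b"Mot\xc3\xb6rhead")
from_latin1 = to_text(b"Mot\xf6rhead")
print(from_utf8 == from_latin1)
print(to_wire(from_latin1))
```

Inside the system everything is text; bytes exist only at the edges, which is why the guessing is confined to one function.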
Skip
P.S. My apologies for the mess this message probably is. Amazing as it may
seem, Gmail in Chrome does a crappy job editing anything other than plain
text. Also, I'm surprised in this day and age that common tools like Gnome
Terminal have little or no encoding support. I wound up having to pop up
urxvt to get an encodings-flexible terminal emulator...
-------------- next part --------------
A non-text attachment was scrubbed...
Name: e.png
Type: image/png
Size: 270 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-list/attachments/20140513/7343b4dd/attachment.png>