[Python-ideas] Python 3 open() text files: make encoding parameter optional for cross-platform scripts

Chris Angelico rosuav at gmail.com
Mon Jun 10 22:15:32 CEST 2013


On Tue, Jun 11, 2013 at 4:50 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
> Of course in Unicode, it's a script tag or file metadata or user preference setting that controls which font is used; in Shift-JIS, the fact that nobody uses Shift-JIS for Chinese is generally all the information you need. But, either way, if you want to write "I could tell my pen-pal was a Chinese spy because she wrote 刃 instead of 刃", you can't.


You don't even need Japanese/Chinese confusion to get that. Sherlock
Holmes, "A Study in Scarlet" - am I allowed to spoil a minor subpoint
in something that's over a hundred years old? - has the discovery of
the German word "RACHE" ( == "revenge") written in blood. But Holmes
knew it to be an imitator, because the A was written the wrong way;
we, reading the book, can't see that, because all we get is the
letters. It takes out-of-band information - in this case, dialogue
between various characters - to tell us about the difference between
the German and the Latin way of writing the letter.

http://en.wikisource.org/wiki/A_Study_in_Scarlet/Part_1/Chapter_3
http://en.wikisource.org/wiki/A_Study_in_Scarlet/Part_1/Chapter_4

Unicode represents the symbols, not how they're drawn. It's up to the
application to take it the rest of the way.

ChrisA


More information about the Python-ideas mailing list