[Patches] [ python-Patches-1162825 ] EditorWindow's title with
non-ASCII chars.
SourceForge.net
noreply at sourceforge.net
Sat Mar 19 05:05:22 CET 2005
Patches item #1162825, was opened at 2005-03-14 03:19
Message generated for change (Comment added) made by kbk
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1162825&group_id=5470
Category: IDLE
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: SUZUKI Hisao (suzuki_hisao)
>Assigned to: Martin v. Löwis (loewis)
Summary: EditorWindow's title with non-ASCII chars.
Initial Comment:
This small patch makes it possible to display a path including
non-ASCII chars as the title of the Editor Window. See the screenshots
of the original IDLE and the patched one.
----------------------------------------------------------------------
>Comment By: Kurt B. Kaiser (kbk)
Date: 2005-03-18 23:05
Message:
Logged In: YES
user_id=149084
I'm monitoring :-)
Martin knows far more than I do about these things.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2005-03-18 13:16
Message:
Logged In: YES
user_id=21627
Yes, IMO it is really sad that there is no programmatic way
to determine the encoding of Terminal.app (an ioctl would be
nice).
----------------------------------------------------------------------
Comment By: SUZUKI Hisao (suzuki_hisao)
Date: 2005-03-18 06:08
Message:
Logged In: YES
user_id=495142
On OS X, any command line you type is encoded in whatever
encoding Terminal.app specifies.
If it is other than UTF-8, you will see broken display for
non-ASCII file names when listing them in Terminal.app.
If it does not match LANG, line editing in bash will be
somewhat useless for multi-byte characters.
In Japan, seasoned Unix users tend to use EUC-JP on
OS X, and they also tend to restrict their file names to
ASCII. They use EUC-JP in their program sources,
LaTeX files, command messages, etc. This has been the
traditional way to use Unix in Japan since circa
1990. Thus the broken display for non-ASCII file names
does not bother them.
Some newfangled Unix users use UTF-8 characters on the
command line on OS X. And many other OS X users,
who naturally use national characters for their file names,
do not use the command line at all.
In theory, you can use some non-UTF8 encoding for a
non-ASCII file name in your command line. However,
in practice for now, it seems very unlikely on OS X.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2005-03-18 01:49
Message:
Logged In: YES
user_id=21627
On Windows, both argv and file names are encoded as "mbcs",
which is both the locale's encoding, and the file system
encoding. The interesting question is: how are command line
arguments encoded on OSX (which is the only system which has
a file system encoding independent of the locale)?
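
For reference, the two encodings under discussion can be inspected side
by side. This is only a minimal sketch, assuming a modern Python 3
interpreter (which postdates this thread); describe_encodings is a
hypothetical helper name, and the values it reports depend on the
platform and locale settings.

```python
import locale
import sys


def describe_encodings():
    """Return the file system encoding and the locale's preferred encoding.

    Hypothetical helper for illustration; it is not part of IDLE.
    """
    return {
        # Encoding Python uses to convert file names to and from bytes.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding preferred by the user's locale (False: do not call setlocale).
        "locale": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for kind, name in describe_encodings().items():
        print(kind, name)
```

Where the two values differ, a file name decoded with the locale's
encoding may come out wrong, which is the case being asked about here.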
----------------------------------------------------------------------
Comment By: SUZUKI Hisao (suzuki_hisao)
Date: 2005-03-17 21:47
Message:
Logged In: YES
user_id=495142
When you install Python on Windows, you will get an
"Edit with IDLE" entry in the context menu for *.py files.
The entry launches pythonw.exe with the name of the
*.py file as one of the sys.argv[] parameters.
See the registry:
HKEY_CLASSES_ROOT\Python.File\shell\Edit with IDLE\command
There you will see:
"C:\Python24\pythonw.exe" "C:\Python24\Lib\idlelib\idle.pyw"
-n -e "%1"
I thought the file name given here as "%1" would be what
open(2) accepts.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2005-03-17 16:33
Message:
Logged In: YES
user_id=21627
On what operating system is sys.argv in file system encoding
(i.e. in the encoding that open(2) expects), and not in the
locale's encoding? AFAIK, both Linux and Windows use the
locale's encoding for sys.argv (but then, they also use the
same encoding for the file system).
----------------------------------------------------------------------
Comment By: SUZUKI Hisao (suzuki_hisao)
Date: 2005-03-17 04:25
Message:
Logged In: YES
user_id=495142
In the typical usage of IDLE, sys.argv[] are given to "pythonw" command
by the window system. Thus, in almost all cases, they are in the
filesystem encoding.
I believe IDLE with the last patch will run well on OS X (as well as on
Windows etc.) if the Tcl/Tk bug of OS X is fixed someday, or the
environment variable LANG is set to use UTF-8 for now.
----------------------------------------------------------------------
Comment By: SUZUKI Hisao (suzuki_hisao)
Date: 2005-03-17 03:40
Message:
Logged In: YES
user_id=495142
I'm sorry, but the previous patches are insufficient to handle non-ASCII file
names.
The "Recent Files" submenu of the "File" menu in the menu bar does not
display such names correctly.
In addition, when updating the "Recent Files" menu, a UnicodeDecodeError
is raised in the _implicit_ conversion of the unicode filename given by
tkFileDialog to an ASCII string.
So I made a new patch. Do not use the previous patches, please.
The new patch converts every multi-byte file name into unicode early in
IOBinding; thus the file path is correctly displayed in the title bar.
And it converts every unicode name into multi-byte string explicitly when
updating the menu.
Note that IDLE writes the recent file names as a text file. Conversion into
string is necessary anyway.
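
The "decode early, encode explicitly at the boundary" approach described
above might look roughly like the sketch below. It is not the patch
itself: to_text and dump_recent_files are hypothetical names, UTF-8
merely stands in for the file system encoding, and the code targets
Python 3, where bytes and text are never mixed implicitly (Python 2
attempted an implicit ASCII conversion instead).

```python
import io

ENCODING = "utf-8"  # assumption: stands in for the file system encoding


def to_text(name, encoding=ENCODING):
    """Decode a byte-string file name to text as early as possible."""
    if isinstance(name, bytes):
        # "replace" keeps going even when some bytes cannot be decoded.
        return name.decode(encoding, "replace")
    return name  # already text; leave it untouched


def dump_recent_files(names, stream, encoding=ENCODING):
    """Write one file name per line, encoding explicitly on output."""
    for name in names:
        stream.write(to_text(name, encoding).encode(encoding) + b"\n")


if __name__ == "__main__":
    buf = io.BytesIO()
    dump_recent_files([b"a.py", "caf\u00e9.py"], buf)
    print(buf.getvalue())
```

Keeping every name as text internally means the only conversions happen
at the two edges, where the encoding is stated explicitly.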
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2005-03-17 03:00
Message:
Logged In: YES
user_id=21627
Hmm. When the string comes from sys.argv, it should be in
the user's preferred encoding, not in the file system
encoding, which would suggest that the current code is right.
When the string comes from tk_getOpenFile, I would expect to
get a Unicode string back instead of a byte string. I can
believe that Tk fails for OSX here: it relies on Tcl's glob
command, which apparently assumes that "encoding system" is
used for the file system; try
>>> [unicode(x) for x in t.tk.call(('glob','*'))]
There are more issues with OSX glob, e.g. for Latin characters,
it processes the decomposed form inconveniently, see
http://sourceforge.net/tracker/?func=detail&aid=823330&group_id=10894&atid=110894
So I think it is fine to display question marks on OSX if
necessary; in general, it now seems that the locale's
encoding should be used indeed.
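
To illustrate what "decomposed form" means for Latin characters: the
same visible name can be stored either with a precomposed letter or as a
base letter plus a combining accent. The sketch below uses Python 3's
unicodedata module; the file name is made up for illustration and the
snippet does not call Tcl's glob.

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT (decomposed form).
decomposed = "cafe\u0301.py"

# NFC recombines the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# Both render alike, but they differ in length and compare unequal.
same_text = decomposed == composed
lengths = (len(decomposed), len(composed))
```

Normalizing to one form before comparing or displaying names is one way
to avoid surprises from the decomposed form.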
----------------------------------------------------------------------
Comment By: SUZUKI Hisao (suzuki_hisao)
Date: 2005-03-16 23:39
Message:
Logged In: YES
user_id=495142
Thank you for your comment.
First, indeed some titles may fail to be decoded, but it will be sufficient to
use 'ignore' as the error handling scheme. At least it gives more readable
titles than the present "mojibake'd" ones.
Second, the title comes from either sys.argv or tkFileDialog. tkFileDialog
calls tk_getOpenFile and tk_getSaveFile of Tcl/Tk. So you are right. It
would be better to use sys.getfilesystemencoding(). Note that the patch
does not affect any unicode titles.
As for OSX, it seems that tk_getOpenFile sometimes returns a broken
string unless you set LANG so as to use UTF-8 (en_US.UTF-8,
ja_JP.UTF-8 etc.). You can see it as follows:
$ LANG=ja_JP.SJIS wish8.4
% tk_getOpenFile
For a folder name of Japanese characters, you will get a broken result; it is
neither UTF-8 nor SJIS. The same problem applies to eucJP. It is a bug
of Tcl/Tk (I found it in Aqua Tcl/Tk 8.4.9) and affects the original IDLE, too.
All in all, it would be most reasonable to use
sys.getfilesystemencoding() and the 'ignore' scheme for now.
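
As a rough illustration of why an error-handling scheme gives more
readable titles than mojibake, the sketch below decodes the same bytes
three ways. It assumes Python 3; the file name is invented, and the
ASCII and Latin-1 codecs are chosen only so that the mismatch is easy to
see, not because IDLE uses them.

```python
# UTF-8 bytes for a name containing three Japanese characters.
raw = "memo_\u65e5\u672c\u8a9e.py".encode("utf-8")

# Wrong codec without error handling: never fails, but yields mojibake.
mojibake = raw.decode("latin-1")

# 'ignore' silently drops every byte the codec cannot decode.
ignored = raw.decode("ascii", "ignore")

# 'replace' marks each undecodable byte with U+FFFD instead.
replaced = raw.decode("ascii", "replace")
```

With 'ignore' the readable ASCII part of the name survives; 'replace'
additionally shows that something was lost.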
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2005-03-15 18:04
Message:
Logged In: YES
user_id=21627
I think the patch is wrong/not general enough:
- if decoding fails for some reason, it should continue anyway
- I'm not sure where the title comes from, but it might be
that it is a file name. If so, it should use
sys.getfilesystemencoding() instead of IOBinding.encoding.
This matters only on systems where these might differ, e.g.
MacOSX.
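
Both points could be combined along the lines of the sketch below. It is
not the patch under review: decode_title is a hypothetical name, and the
code assumes Python 3, where byte strings and text are distinct types.

```python
import sys


def decode_title(raw, encoding=None):
    """Return a text title for raw without raising on undecodable bytes."""
    if isinstance(raw, str):
        return raw  # already text; nothing to decode
    # Assumption: the title is a file name, so use the file system encoding.
    encoding = encoding or sys.getfilesystemencoding()
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Continue anyway, substituting U+FFFD for the bytes that failed.
        return raw.decode(encoding, "replace")


if __name__ == "__main__":
    print(decode_title(b"caf\xc3\xa9.py", "utf-8"))
```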
----------------------------------------------------------------------