unicode filenames
David Eppstein
eppstein at ics.uci.edu
Sun Feb 2 22:32:14 EST 2003
In article <3E3DC9AF.387E307A at alcyone.com>,
Erik Max Francis <max at alcyone.com> wrote:
> > I normally use unix. What's the right way to treat filenames
> > under that OS? As Latin-1? Or UTF-8? As far as I can tell,
> > filenames are simply bytes, so I can make whatever interpretation
> > I want on the characters, and the standard viewpoint is to
> > interpret those characters as Latin-1.
>
> I believe that's the most common interpretation, but as you say, it
> doesn't much matter since filenames in UNIX are just considered streams
> of bytes. No reference to an encoding -- as far as I know -- is made in
> any UNIX-relevant standard.
Under Mac OS X, the shell displays text (e.g. from cat, or from ls
without the -q option) as UTF-8 by default, and the Finder (the GUI file
browser) uses UTF-8 for accented characters in file names. So I infer
that the correct interpretation of filenames under my OS is UTF-8.
But other Unixes may differ...
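
Since the bytes themselves carry no encoding, one way to cope with both
conventions mentioned above is to try UTF-8 first and fall back to Latin-1.
What follows is only a rough sketch in present-day Python, not an
established recipe: the helper name decode_filename is made up for
illustration, and the fallback order is an assumption that happens to suit
the two encodings discussed in this thread.

```python
def decode_filename(raw: bytes) -> str:
    """Decode a raw filename for display (hypothetical helper).

    Assumption: the name is either UTF-8 or Latin-1. UTF-8 is tried
    first because arbitrary Latin-1 text is rarely valid UTF-8.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte value to a code point, so this
        # fallback cannot fail, though it may guess wrong.
        return raw.decode("latin-1")
```

Passing a bytes path such as os.listdir(b".") makes Python return the
names as raw bytes, which can then be fed to a helper like this one. The
decoded string is only good for display; for reopening the file it seems
safer to keep the original bytes around.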
--
David Eppstein UC Irvine Dept. of Information & Computer Science
eppstein at ics.uci.edu http://www.ics.uci.edu/~eppstein/