[Python-Dev] Python-3.0, unicode, and os.environ

Toshio Kuratomi a.badger at gmail.com
Fri Dec 5 16:06:06 CET 2008


Terry Reedy wrote:
> Toshio Kuratomi wrote:
>>
>>> I would think life would be ultimately easier if either the file server
>>> or the shell server automatically translated file names from jis and
>>> utf8 and back, so that the PATH on the *nix shell server is entirely
>>> utf8.
>>
>> This is not possible because no part of the computer knows what the
>> encoding is.  To the computer, it's just a sequence of bytes.  Unlike
>> xml or the windows filesystem (winfs? ntfs?) where the encoding is
>> specified as part of the document/filesystem there's nothing to tell
>> what encoding the filenames are in.
> 
> I thought you said that the file server keeps all filenames in shift-jis,
> and the shell server all in utf-8.

Yes.  But that is part of the setup of the example, to keep things
simple.  The file server or the shell server could itself hold mixed
encodings (for instance, if it were serving home directories to users all
over the world, each user might be using a different encoding).
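
To make the "just a sequence of bytes" point concrete, here is a minimal
sketch.  The filename and the try_decode helper are made up for
illustration; only the standard codecs are assumed.  It shows that the same
name produces different bytes under shift-jis and utf-8, and that a wrong
guess at the encoding can either fail or quietly succeed with garbage:

```python
# Hypothetical filename, chosen only for illustration.
name = "日本語.txt"

# The same name as it might be stored by two differently configured systems.
sjis_bytes = name.encode("shift_jis")
utf8_bytes = name.encode("utf-8")


def try_decode(raw, encoding):
    """Return the decoded string, or None if raw is not valid in encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Nothing in the bytes themselves says which encoding produced them.
print(sjis_bytes)
print(utf8_bytes)

# Guessing wrong can fail outright...
print(try_decode(sjis_bytes, "utf-8"))

# ...or succeed silently with the wrong text, since latin-1 accepts any byte.
print(try_decode(sjis_bytes, "latin-1"))
```

The second case is the troublesome one: a successful decode is no evidence
that the guess was right.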

>  If so, then the shell server could
> know if it were told so.
> 

Where are you going to store that information?  In order for python to
run without errors, will it have to be configured on each system it's
installed on to know the encoding of each filename?  Or are we going to
try to talk each *NIX vendor into creating new filesystems that record
that information and after a five year span of time declare that python
will not run on other filesystems in corner cases?

I don't think this approach gives us a reasonable expectation of keeping
python a portable language.

-Toshio
