[Python-3000] Proposed Python 3.0 schedule (bytes/unicode again)

Antoine Pitrou solipsis at pitrou.net
Tue Oct 7 13:45:30 CEST 2008


Hi,

James Y Knight <foom <at> fuhm.net> writes:
> 
>   - Having os.getcwdb isn't much use when you can't even run python in  
> the first place when the current directory has "bad" bytes in it.

I don't agree it's a similar problem. Python should be installed in a well-known
place with a sensible path. Of course, bonus points if Python can be launched
from anywhere, but I don't think it's a severe problem. In other words, I'd flag
this as "low priority".

If you want a more important issue, there's the issue of importing modules from
a unicode (non-ASCII) path. Amaury has worked on this in the tracker.

> Currently Python outputs:
> Could not find platform independent libraries <prefix>
> Could not find platform dependent libraries <exec_prefix>
> Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
> Fatal Python error: Py_Initialize: can't initialize sys standard streams
> ImportError: No module named encodings.utf_8

Ok, so the error message is quite cryptic and would perhaps deserve improving.
Still, "low priority" IMHO.

>   - And then, getopt and optparse modules should work on bytestring  
> vectors, so that you can use sys.argvb without writing your own  
> argument parser. They don't currently.

Then we will gradually start moving all modules even remotely related to I/O
and filesystem stuff to a dual bytes/unicode API? That's precisely the kind of
confusion we want to end with Py3k (the confusion between bytes and unicode as
similar data types which could be used almost interchangeably without giving any
consideration to semantics).
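
To make the point concrete, here is a minimal sketch of how Py3k keeps the two
types apart. The helper name try_concat is made up for this illustration; the
only behaviour relied on is that bytes and str do not mix implicitly.

```python
def try_concat(left, right):
    """Return left + right, or the name of the exception that was raised."""
    try:
        return left + right
    except TypeError as exc:
        return type(exc).__name__


# Same type on both sides: concatenation works.
print(try_concat(b"spam", b"eggs"))
print(try_concat("spam", "eggs"))

# Mixed bytes and str: refused with a TypeError, no implicit conversion.
print(try_concat(b"spam", "eggs"))

# Comparing across the two types never reports equality either.
print(b"spam" == "spam")
```

A module with a dual API would have to pick one type per call and stick to it,
which is the kind of bookkeeping I'd rather not spread across the stdlib.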

>   - Isn't it a potential security issue that " 'WHATEVER' in  
> os.environ" can return False if WHATEVER had some "bad" bytes in it,  
> but spawning a subprocess actually will include WHATEVER in the  
> subprocess's environment?

I do agree with that. Errors should certainly not pass silently, especially when
they can have strong security implications.
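
As a rough sketch of what "not passing silently" could look like, assume we
start from the raw environment as a bytes-to-bytes mapping. The function name
decode_environ_strict, its encoding parameter, and the sample data are all
hypothetical and only serve the illustration; this is not a description of
how os.environ is implemented.

```python
def decode_environ_strict(raw_env, encoding="utf-8"):
    """Decode a bytes->bytes environment mapping, failing loudly.

    Hypothetical helper: raises ValueError naming the offending variable
    instead of silently leaving it out of the decoded mapping.
    """
    decoded = {}
    for name, value in raw_env.items():
        try:
            decoded[name.decode(encoding)] = value.decode(encoding)
        except UnicodeDecodeError as exc:
            raise ValueError(
                "undecodable environment entry %r" % (name,)
            ) from exc
    return decoded


# Sample data (made up): one clean entry, one with a byte invalid in UTF-8.
raw = {b"HOME": b"/home/user", b"WHATEVER": b"bad\xffbytes"}
try:
    decode_environ_strict(raw)
except ValueError as exc:
    print("refused:", exc)
```

The point is only that the caller learns about WHATEVER, rather than seeing a
mapping that quietly disagrees with what a subprocess would inherit.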

>   - I suppose sys.path should handle bytestrings on the path, and  
> should be populated using the bytes-version of os.environ so that  
> PYTHONPATH gets read in properly.

Well, except on Windows where unicode paths are the Right Thing to do. But then
we have a glaring incompatibility between major platforms.

Regards

Antoine.



