[Python-Dev] Windows: Remove support of bytes filenames in the os module?

Victor Stinner victor.stinner at gmail.com
Tue Feb 9 05:13:58 EST 2016


Hi,

2016-02-08 18:02 GMT+01:00 Brett Cannon <brett at python.org>:
> If Unicode string don't work in Python 2 then what is Python 2/3 to do as a
> cross-platform solution if we completely remove bytes support in Python 3?
> Wouldn't that mean there is no common type between Python 2 & 3 that one can
> use which will work with the os module except native strings (which are
> difficult to get right)?

IMHO we have to draw a line somewhere between Python 2 and Python 3.
For some specific use cases, there is no good solution which works on
both Python versions.

For filenames, there is no simple design on Python 2. bytes is the
natural choice on UNIX, whereas Unicode is preferred on Windows. But
it's difficult to handle two types in the same code base. As a
consequence, most users use bytes on Python 2, which is a bad choice
for Windows...

On Python 3, it's much simpler: always use Unicode. Again, PEP 383
(the surrogateescape error handler) helps on UNIX.
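
To make the PEP 383 point concrete, here is a minimal sketch of the
round trip. The filename is made up for illustration, and I use the
UTF-8 codec explicitly so the result doesn't depend on the platform's
filesystem encoding; on UNIX, os.fsdecode() and os.fsencode() apply the
same error handler with the filesystem encoding.

```python
# Sketch of the PEP 383 round trip. The codec is named explicitly so the
# behaviour is the same on every platform.
raw = b"report-\xff.txt"  # hypothetical filename; 0xFF is never valid UTF-8

# Strict decoding rejects the stray byte.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate
# U+DCNN, so the result is an ordinary str.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = name.encode("utf-8", "surrogateescape")
```

The str can be passed around like any other filename, and the original
bytes are recovered when it's encoded again. That's why undecodable
names are representable with str on Python 3.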

I wrote a PoC for Mercurial to always use Unicode, but the idea was
rejected since Mercurial must support undecodable filenames on UNIX.
It's possible on Python 3 (str+PEP 383), not on Python 2. I tried to
port Mercurial to Python 3 and use Unicode for filenames in the same
change. It's probably better to do that in two steps: first port to
Python 3, then use Unicode. I guess that the final change is to drop
Python 2? I don't know if it's feasible for Mercurial.

Victor

