[Python-Dev] File system path encoding on Windows

Steve Dower steve.dower at python.org
Tue Aug 30 20:27:12 EDT 2016


On 30Aug2016 1702, Victor Stinner wrote:
> I made another quick&dirty test on Django 1.10 (I ran Django test
> suite on my modified Python raising exception on bytes path): I didn't
> notice any exception related to bytes path.
>
> Django seems to only use Unicode for paths.
>
> I can try to run more tests if you know some other major Python
> applications (modules?) working on Windows/Python 3.

The major ones aren't really the concern. I'd be interested to see where 
numpy and pandas are at, but I suspect they've already encountered and 
fixed many of these issues due to the size of the user base. (Though 
skim-reading numpy I see lots of code that would be affected - for 
better or worse - if the default encoding for open() changed...)

I'm more concerned about the long tail of more focused libraries. Feel 
free to grab a random selection of Django extensions and try them out, 
but I don't really think it's worth the effort. I'm certainly not 
demanding you do it.

> Note: About Twisted, I forgot to mention that I'm not really surprised
> that Twisted uses bytes. Twisted was created something like 10 years
> ago, when bytes was the de facto choice. Using Unicode in Python 2 was
> painful when you imagine a module as large as Twisted. Twisted has to
> support Python 2 and Python 3, so it's not surprising that it still
> uses bytes in some places, instead of Unicode.

Yeah, I don't think they're doing anything wrong and wouldn't want to 
call them out on it. Especially since they already correctly handle it 
by asking Python what encoding should be used for the bytes.
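
To illustrate what "asking Python what encoding should be used" can look 
like, here is a minimal sketch. It is not Twisted's actual code; the helper 
names path_to_bytes and path_to_text are hypothetical. It relies only on 
sys.getfilesystemencoding(), os.fsencode() and os.fsdecode() from the 
standard library.

```python
import os
import sys


def path_to_bytes(path):
    """Return path as bytes, encoded the way Python encodes file names.

    Hypothetical helper: os.fsencode() applies the file system encoding
    and its matching error handler, and returns bytes input unchanged.
    """
    return os.fsencode(path)


def path_to_text(raw):
    """Return a bytes (or str) path as str, using the same encoding."""
    return os.fsdecode(raw)


# The encoding Python reports for file names on this platform.
print(sys.getfilesystemencoding())

# An ASCII name round-trips regardless of the reported encoding.
print(path_to_text(path_to_bytes("example.txt")))
```

Going through os.fsencode()/os.fsdecode() rather than calling 
str.encode() directly avoids having to pick the error handler by hand, 
since that handler differs between platforms.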

> Moreover, as in many
> Python applications/modules, Linux is a first-class citizen, whereas
> Windows is supported more on a "best effort" basis.

That last point is exactly why I think this is important. Any arguments 
against making Windows behave more like Linux (i.e. bytes paths are 
reliable) need to be clear as to why this doesn't matter or is less 
important than other concerns.
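
As a rough sketch of the kind of code this affects: passing a bytes path 
to os.listdir() returns the names as bytes. The helper name 
list_dir_as_bytes is hypothetical, and the example assumes a platform 
where bytes paths are accepted without warnings (Linux today).

```python
import os
import tempfile


def list_dir_as_bytes(directory):
    """List a directory through a bytes path; names come back as bytes."""
    return sorted(os.listdir(os.fsencode(directory)))


with tempfile.TemporaryDirectory() as tmp:
    # Create one file with an ASCII name so the result is predictable.
    with open(os.path.join(tmp, "example.txt"), "w") as f:
        f.write("hello")
    names = list_dir_as_bytes(tmp)

print(names)  # [b'example.txt']
```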

Cheers,
Steve


