[Tutor] sys.getfilesystemencoding()

Oscar Benjamin oscar.j.benjamin at gmail.com
Tue Dec 18 15:18:30 CET 2012


On 18 December 2012 13:13, Albert-Jan Roskam <fomcl at yahoo.com> wrote:
> I am trying to write a file with a 'foreign' unicode name (I am aware that this is a highly western-o-centric way of putting it). In Linux, I can encode it to utf-8 and the file name is displayed correctly. In windows xp, the characters can, apparently, not be represented in this encoding called 'mbcs'. How can I write file names that are always encoded correctly on any platform? Or is this a shortcoming of Windows?

Eryksun has already explained the fix, but I'll answer this question.
It is not a shortcoming of Windows itself but rather a shortcoming of
the older byte-string ("ANSI") Win32 API that CPython is using here.
That API works in the system code page (the 'mbcs' encoding), which
cannot represent characters outside that code page. A long time ago
Windows gained a newer "Unicode API" that supports what at that time
was considered to be Unicode. In Python, for backward compatibility,
the older API is used when you pass a byte string as a filename. I
believe this is the same in both recent 2.x versions and 3.x versions
of CPython.

The problem here is precisely the fact that you are encoding the
filename yourself, rather than passing the unicode string directly to
open(). Encoding it by hand isn't necessary on Linux either, unless
you want to encode the filename with something other than the
encoding reported by sys.getfilesystemencoding().
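
To illustrate the point, here is a minimal sketch, assuming Python 3
(where str is the unicode type; on 2.x you would use a u'...' literal
instead). The helper names write_named_file() and demo() and the
sample filename are made up for this example and are not taken from
your code. The filename is passed to open() as text and is never
encoded by hand:

```python
import os
import tempfile


def write_named_file(directory, name, text):
    """Create directory/name, passing the filename to open() as text."""
    # 'name' stays a str the whole way; we never call name.encode(...).
    path = os.path.join(directory, name)
    # encoding= here applies to the file *contents*, not to the filename.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return path


def demo():
    """Write a file with a non-ASCII name and check that it is listed."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical sample name (Cyrillic for "test").
        path = write_named_file(tmp, "\u0442\u0435\u0441\u0442.txt", "hello")
        # Passing a str to os.listdir() gives str names back.
        return os.path.basename(path) in os.listdir(tmp)
```

Calling demo() should report whether the file shows up under the same
unicode name it was created with. Python chooses how to hand the name
to the operating system, so the same code should work on Linux and on
Windows.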


Oscar

