[Tutor] sys.getfilesystemencoding()

Peter Otten __peter__ at web.de
Tue Dec 18 14:40:51 CET 2012


Albert-Jan Roskam wrote:

> I am trying to write a file with a 'foreign' unicode name (I am aware that
> this is a highly western-o-centric way of putting it). In Linux, I can
> encode it to utf-8 and the file name is displayed correctly. In windows
> xp, the characters can, apparently, not be represented in this encoding
> called 'mbcs'. How can I write file names that are always encoded
> correctly on any platform? Or is this a shortcoming of Windows?
> 
> # Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit
> # (Intel)] on win32
> import sys
> 
> 
> def _encodeFileName(fn):
>     """Helper function to encode unicode file names into system file names.
>     http://effbot.org/pyref/sys.getfilesystemencoding.htm"""
>     isWindows = sys.platform.startswith("win")
>     isUnicode = isinstance(fn, unicode)
>     if isUnicode:  # and not isWindows
>         encoding = sys.getfilesystemencoding()  # 'mbcs' on Windows, 'utf-8' on Linux
>         encoding = "utf-8" if not encoding else encoding
>         return fn.encode(encoding)
>     return fn
> 
> fn = u'\u0c0f\u0c2e\u0c02\u0c21\u0c40' + '.txt'   # Telugu language
> with open(_encodeFileName(fn), "wb") as w:
>     w.write("yaay!\n")   # the characters of the FILE NAME can not be
>                          # represented in the encoding (squares/tofu)
>     print "written: ", w.name
> 
> Thank you very much in advance!

I don't understand your question. Do you get an error if you pass the unicode 
filename directly?

fn = u'\u0c0f\u0c2e\u0c02\u0c21\u0c40.txt'
with open(fn, "wb") as f:
    ...

If so, I don't think there's a way around that. 

On the other hand, if your file manager displays the name as squares it may 
be that the characters aren't available in the font used to show the name.
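
One way to tell the two cases apart might be to ask Python for the directory 
listing instead of looking at the file manager. The following is only a rough 
sketch, assuming Python 3 (where every str is unicode) and a file system that 
accepts the name; write_and_list is a made-up helper name, not something from 
the standard library. If the name comes back unchanged, the file system stored 
it correctly and only the display font is lacking.

```python
import io
import os
import shutil
import tempfile


def write_and_list(name):
    """Create a file called `name` in a fresh temporary directory and
    return the directory listing (hypothetical helper for illustration)."""
    directory = tempfile.mkdtemp()
    try:
        # Pass the unicode name directly; no manual .encode() call.
        with io.open(os.path.join(directory, name), "w", encoding="utf-8") as f:
            f.write(u"yaay!\n")
        return os.listdir(directory)
    finally:
        # Remove the temporary directory and the file inside it.
        shutil.rmtree(directory)


fn = u'\u0c0f\u0c2e\u0c02\u0c21\u0c40.txt'  # Telugu file name from the question
names = write_and_list(fn)
print(names == [fn])  # True means the name round-tripped unchanged
```
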


