[Pythonmac-SIG] Unicode Filenames on the Mac

Piet van Oostrum piet at cs.uu.nl
Fri Jul 15 19:30:36 CEST 2005


>>>>> Bob Ippolito <bob at redivi.com> (BI) wrote:

>BI> >>> import sys
>BI> >>> sys.getfilesystemencoding()
>BI> 'utf-8'

It is UTF-8, but you must be careful: the filenames are stored in decomposed
form (Unicode normalization form D, or something very close to it), meaning
that an accented letter is split into the base letter followed by a combining
accent. The filename API does accept the precomposed accented letters, but it
decomposes them, and the decomposed form is what the listdir calls return.

>>> fn = u'\u00E1'
>>> f = open(fn,'w')
>>> f.close()

We now have a file named 'á'.

>>> import os
>>> os.listdir(u'.')
[u'a\u0301']

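The accent (U+0301, the combining acute accent) follows the 'a' as a
separate code point, so the name that comes back is not equal to the name
that was passed to open().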
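
One way to cope with this when comparing names is to normalize both sides
before the comparison. The following is only a minimal sketch: the helper
name same_filename is made up for illustration, and it assumes that standard
NFD from the unicodedata module is close enough to what the filesystem does
for the names in question.

```python
import unicodedata

def same_filename(a, b):
    # Hypothetical helper: treat two names as equal if they match
    # after both are normalized to decomposed form (NFD).
    return unicodedata.normalize('NFD', a) == unicodedata.normalize('NFD', b)

composed = u'\u00E1'      # the name that was passed to open()
decomposed = u'a\u0301'   # the name that os.listdir() returned

print(composed == decomposed)               # False: different code points
print(same_filename(composed, decomposed))  # True: equal once normalized

# Going the other way, NFC recomposes the listdir() result.
print(unicodedata.normalize('NFC', decomposed) == composed)  # True
```

Normalizing to NFD on both sides, rather than only recomposing the listdir
result, avoids depending on which form the caller happened to supply.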
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org

