unicode filenames
Carlos Ribeiro
cribeiro at mail.inet.com.br
Thu Feb 6 08:44:52 EST 2003
On Thursday 06 February 2003 11:16 am, Beni Cherniavsky wrote:
> Since unix can't afford to change all APIs and programs like windows did
> (the mess that resulted explains why <wink>), unix must stay with the
> byte-oriented filenames at the low level. This ensures that all programs
> that store file names in files, etc., continue to work. UTF-8 is the only
> encoding that can represent all of unicode that satisfies all these needs,
> so everybody should migrate to UTF-8 filenames (CJK users might have
> reservations to this; I'd be happy to learn their opinion).
Sorry. It would be a big mess. Here in Brazil, I can safely assume that it is
nearly impossible to find a computer *without* filenames with latin-1
accented characters. Not to mention the problems that we have when mounting
FAT partitions under Linux - many Unix users still need to use dual boot
machines in order to use a few Windows apps.
In my opinion, this is the type of problem that has to be solved at its root,
by slowly migrating the filesystem itself to accept only UTF-8 filenames. All
conversions during the migration phase have to be done by the operating
system itself; when moving files from one FS to the other, it would do the
necessary conversions. It's not going to be easy, though.
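
To give an idea of what such a conversion involves, here is a minimal sketch in Python 3. It assumes the legacy names are Latin-1, which is a guess about the data and not something the filesystem records; the function names `latin1_name_to_utf8` and `migrate_directory` are made up for illustration. It works on byte-string names, so nothing is decoded behind your back.

```python
import os


def latin1_name_to_utf8(name: bytes) -> bytes:
    """Re-encode a filename assumed to be Latin-1 as UTF-8.

    Names that already decode as UTF-8 (including plain ASCII) are
    returned unchanged. This is only a heuristic: a few Latin-1 byte
    sequences also happen to be valid UTF-8 and would be left alone.
    """
    try:
        name.decode("utf-8")
        return name
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this never fails.
        return name.decode("latin-1").encode("utf-8")


def migrate_directory(path: bytes, dry_run: bool = True):
    """Return (old, new) byte-name pairs for one directory.

    With dry_run=False the entries are actually renamed. Passing a
    bytes path makes os.listdir() return bytes names as well.
    """
    renames = []
    for old in os.listdir(path):
        new = latin1_name_to_utf8(old)
        if new != old:
            renames.append((old, new))
            if not dry_run:
                os.rename(os.path.join(path, old), os.path.join(path, new))
    return renames
```

Calling `migrate_directory(b".")` only lists what would change. A real migration would also have to deal with name collisions and with file names stored inside other files, which this sketch ignores.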
Carlos Ribeiro
cribeiro at mail.inet.com.br