[Python-Dev] Filename as byte string in python 2.6 or 3.0?

Victor Stinner victor.stinner at haypocalc.com
Sun Sep 28 17:14:48 CEST 2008


On Saturday 27 September 2008 19:41:50, Martin v. Löwis wrote:
> > I think that the problem is important because it's a regression from 2.5
> > to 2.6/3.0. Python 2.5 uses bytes filename, so it was possible to
> > open/unlink "invalid" unicode strings (since it's not unicode but bytes).
>
> I'd like to stress that the problem is *not* a regression from 2.5 to 2.6.

Sorry, 2.6 has no problem. This issue is a regression from Python2 to Python3.

> Even though you may run into file names that can't be decoded, 
> that happening really indicates some bigger problem in the management 
> of the system where this happens, and the proper solution (IMO) should 
> be to change the system

In the *real world*, people are using different file systems, different 
operating systems, and some broken programs and/or operating systems create 
invalid filenames. It could be a configuration problem (wrong charset 
definition in /etc/fstab) or a charset autodetection failure, but who 
cares? Sometimes you don't care that your music directory contains some 
strange filenames, you just want to hear the music. Or maybe you would like 
to *fix* the encoding problem, which is not possible using Python3 trunk.

People having this problem are, for example, people who write or use a backup 
program. This week someone asked me (on IRC) how to manage filenames in pure 
unicode with Python 2.5 and Linux... which was impossible because one of his 
filenames was invalid (maybe a file from a Windows system). So he switched to 
raw (bytes) filenames.
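
To illustrate the raw (bytes) approach, here is a minimal sketch. It assumes a
Linux file system that accepts arbitrary byte sequences as names, and a
Python 3 release whose os functions accept bytes paths (later releases do;
this is not the Python3 trunk discussed above). The helper name
list_raw_names and the file name are made up for the example.

```python
import os
import tempfile


def list_raw_names(directory):
    """Return the entries of `directory` as bytes, without decoding them."""
    # Passing a bytes path makes os.listdir() return bytes names, so a name
    # that is not valid in the locale encoding never has to be decoded.
    return sorted(os.listdir(os.fsencode(directory)))


def demo():
    """Create, list and unlink a file whose name is not valid UTF-8."""
    with tempfile.TemporaryDirectory() as tmp:
        raw_dir = os.fsencode(tmp)
        bad_name = b"caf\xe9.mp3"  # hypothetical Latin-1 name, invalid as UTF-8
        path = os.path.join(raw_dir, bad_name)
        with open(path, "wb") as f:
            f.write(b"data")
        names = list_raw_names(tmp)
        os.unlink(path)  # the "invalid" name can still be removed
        return names, os.listdir(raw_dir)
```

Working on bytes end to end is what lets a backup or renaming tool handle
such a file at all; decoding is only needed when the name has to be shown to
a user.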

In a perfect world, everybody uses Linux with utf-8 filenames and only 
programs in Python using space indentation :-D

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/

