[Python-Dev] Removal of Win32 ANSI API

Antoine Pitrou solipsis at pitrou.net
Fri Nov 12 14:40:29 CET 2010


On Fri, 12 Nov 2010 13:13:08 +0100
Victor Stinner <victor.stinner at haypocalc.com> wrote:
> On Thursday 11 November 2010 21:02:43 Antoine Pitrou wrote:
> > On Thu, 11 Nov 2010 20:44:52 +0100
> > 
> > "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > > > How do you support cross-platform code using bytes filenames?
> > > > IIRC, it has already been argued that it was an important feature. Many
> > > > filesystem-related utilities might prefer to handle filenames in bytes
> > > > form.
> > > 
> > > It would be a policy decision. However, I think it is hearsay that
> > > filesystem-related utilities might prefer byte file names.
> > 
> > One possible situation is when you receive filenames in bytes form from
> > an external API or tool (or even the contents of a file). If you don't
> > know the encoding, keeping the bytes form is obviously recommended.
> 
> I disagree with you: the filename stored in the binary content/network stream
> may be encoded with a different code page than the current Windows code page.
> The application has to decode the filename itself; the application has more
> information about the right encoding than Windows.

I'm not talking about Windows, obviously. POSIX filenames are natively
bytes, so if you get a bytes filename from an external source, it makes
sense to reuse the bytes form.
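
To illustrate the point about POSIX, here is a minimal sketch, assuming
Python 3.2 or later (where os.fsencode and os.fsdecode exist). The helper
names list_names_as_bytes and roundtrip are hypothetical and only serve the
example: passing a bytes path to os.listdir yields bytes names, which can
then be reused without guessing an encoding.

```python
import os


def list_names_as_bytes(directory):
    """Return the entries of `directory` as bytes.

    os.listdir returns bytes names when it is given a bytes path.
    """
    return sorted(os.listdir(os.fsencode(directory)))


def roundtrip(raw_name):
    """Decode a bytes filename to str and encode it back.

    On POSIX, os.fsdecode uses the surrogateescape error handler, so
    undecodable bytes are expected to survive the round trip.
    """
    return os.fsencode(os.fsdecode(raw_name))
```

The bytes names returned by list_names_as_bytes can be passed straight back
to functions such as open() or os.stat() on POSIX.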

I think it would be a mistake to allow bytes filenames under POSIX but
not under Windows. It makes porting harder.

>  - tar stores filenames... in the locale encoding (except for PAX format which 
> uses utf-8)

So bytes filenames are useful at least for tar. I'm sure there are many
other cases (actually, most kinds of configuration files containing
paths would apply).
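
The tar case can be sketched with the standard tarfile module. This is only
an illustration under stated assumptions: the archive is built in memory in
GNU format, the member name and the latin-1 encoding are made up for the
example, and the helper names are hypothetical. It shows that a reader which
guesses the wrong encoding can still recover the stored bytes, because
tarfile decodes names with the surrogateescape error handler by default.

```python
import io
import tarfile


def build_tar(name, payload, encoding):
    """Build an in-memory GNU-format tar; the member name is stored in `encoding`."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT,
                      encoding=encoding) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def member_names(raw, encoding):
    """List member names, decoding the stored header bytes with `encoding`."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r",
                      encoding=encoding) as tar:
        return tar.getnames()


def original_name_bytes(name, encoding):
    """Recover the bytes stored in the archive from a surrogate-escaped name."""
    return name.encode(encoding, "surrogateescape")
```

For instance, an archive written with encoding="latin-1" and read back with
encoding="utf-8" gives a name containing a lone surrogate, and
original_name_bytes(name, "utf-8") returns the bytes as stored in the header.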

Regards

Antoine.



