Python 3 encoding question: Read a filename from stdin, subsequently open that filename

Albert Hopkins marduk at letterboxes.org
Tue Nov 30 08:46:53 EST 2010


On Tue, 2010-11-30 at 11:52 +0100, Peter Otten wrote:
> Dan Stromberg wrote:
> 
> > I've got a couple of programs that read filenames from stdin, and then
> > open those files and do things with them.  These programs sort of do
> > the *ix xargs thing, without requiring xargs.
> > 
> > In Python 2, these work well.  Irrespective of how filenames are
> > encoded, things are opened OK, because it's all just a stream of
> > single byte characters.
> 
> I think you're wrong. The filenames' encoding as they are read from stdin
> must be the same as the encoding used by the file system. If the file system
> expects UTF-8 and you feed it ISO-8859-1 you'll run into errors.
> 

I think this is wrong.  In Unix there is no concept of filename
encoding.  A filename can be any arbitrary sequence of bytes (except
'/' and '\0'), and the filesystem itself neither knows nor cares about
encoding.

> You always have to know either
> 
> (a) both the file system's and stdin's actual encoding, or 
> (b) that both encodings are the same.
> 
> 
If this is true, then I think Python 3 is doing the wrong thing.  Any
language should be able to deal with the filenames that the host OS
allows.
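
For what it's worth, my understanding is that Python 3.1 and later try
to cover this case with the "surrogateescape" error handler from PEP
383: bytes that cannot be decoded are mapped to lone surrogate code
points and mapped back on encoding.  A minimal sketch, where the sample
filename is made up for illustration:

```python
# Hypothetical filename: ISO-8859-1 bytes, which are not valid UTF-8.
raw = b"caf\xe9.txt"

# With surrogateescape, the undecodable byte 0xE9 becomes the lone
# surrogate U+DCE9 instead of raising UnicodeDecodeError.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = name.encode("utf-8", "surrogateescape")
assert restored == raw
```

The resulting str is not printable as-is, but it should round-trip back
to the same bytes when handed to the OS.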

Anyway, back to the OP's question: can you open stdin so that it yields
arbitrary bytes instead of strings, and then open each file using those
bytes as the filename? I don't have enough experience with Python 3 to
say for sure.
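
Something along these lines might work.  It is only a sketch: it
assumes one filename per line (so names containing a newline are not
handled), and the function names iter_filenames and report_sizes are
made up for illustration.  It relies on sys.stdin.buffer being the
underlying binary stream and on open() accepting a bytes path.

```python
import sys


def iter_filenames(stream):
    """Yield each newline-terminated filename from a binary stream as bytes."""
    for line in stream:
        name = line.rstrip(b"\n")
        if name:  # skip empty lines
            yield name


def report_sizes(stream, out):
    """Open every file named on `stream` and write its size to `out`."""
    for name in iter_filenames(stream):
        # A bytes path is passed to the OS unchanged, so no decoding
        # (and no guess about the filename's encoding) is involved.
        with open(name, "rb") as f:
            data = f.read()
        out.write(name + b": " + str(len(data)).encode("ascii") + b" bytes\n")


# Typical use, reading and writing raw bytes on the standard streams:
#     report_sizes(sys.stdin.buffer, sys.stdout.buffer)
```

Output goes through sys.stdout.buffer as well, so a filename that is
not valid in the terminal's encoding is written out as the same bytes
that came in.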

-a