LC_ALL and os.listdir()
Kenneth Pronovici
pronovic at skyjammer.com
Wed Feb 23 18:37:16 EST 2005
On Wed, Feb 23, 2005 at 10:07:19PM +0100, "Martin v. Löwis" wrote:
> So we have three options:
> 1. skip this string, only return the ones that can be
>    converted to Unicode. Give the user the impression
>    the file does not exist.
> 2. return the string as a byte string
> 3. refuse to listdir altogether, raising an exception
>    (i.e. return nothing)
>
> Python has chosen alternative 2, allowing the application
> to implement 1 or 3 on top of that if it wants to (or
> come up with other strategies, such as user feedback).
Understood. This appears to be the most flexible solution among the
three.
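
As a rough sketch of how an application might layer option 1 or option 3
on top of option 2: the helper names below (partition_names and
names_or_raise) are hypothetical and not part of any library, and the
UTF-8 default is only an assumption. The input is any list of names that
may mix character strings and byte strings, as described above.

```python
def partition_names(names, encoding="utf-8"):
    """Split names into (decoded, undecodable).

    Character strings pass through unchanged; byte strings are decoded
    with the given encoding and kept as raw bytes when that fails.
    """
    decoded, undecodable = [], []
    for name in names:
        if isinstance(name, bytes):
            try:
                decoded.append(name.decode(encoding))
            except UnicodeDecodeError:
                undecodable.append(name)
        else:
            decoded.append(name)
    return decoded, undecodable


def names_or_raise(names, encoding="utf-8"):
    """Option 3: refuse the whole listing if any name cannot be decoded."""
    decoded, undecodable = partition_names(names, encoding)
    if undecodable:
        raise ValueError("undecodable file names: %r" % (undecodable,))
    return decoded
```

Option 1 then amounts to using only the first list returned by
partition_names, and the second list is what could be reported to the
user before ignoring those files.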
> >3) The proper "general" way to deal with this situation?
>
> You can choose option 1 or 3; you could tell the user
> about it, and then ignore the file, or you could try to
> guess the encoding (UTF-8 would be a reasonable guess).
Ok.
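
A minimal sketch of the "guess the encoding" idea, assuming UTF-8 is
tried first and the filesystem encoding second. The function name
guess_decode and the order of candidates are my own assumptions; a name
that matches no candidate is handed back as (None, None) so the caller
can tell the user about it.

```python
import sys


def guess_decode(raw_name, candidates=None):
    """Try each candidate encoding in turn on a byte-string file name.

    Returns (text, encoding) for the first encoding that decodes the
    name cleanly, or (None, None) if none of them does.
    """
    if candidates is None:
        candidates = [
            "utf-8",
            sys.getfilesystemencoding() or sys.getdefaultencoding(),
        ]
    for encoding in candidates:
        try:
            return raw_name.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    return None, None
```

A successful decode only means the bytes are valid in that encoding, not
that the guess is what the file's creator intended, so the result is
probably best treated as a display name rather than as the path to open.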
> >My goal is to build generalized code that consistently works with all
> >kinds of filenames.
>
> Then it is best to drop the notion that file names are
> character strings (because some file names aren't). You
> do so by converting your path variable into a byte
> string. To do that, you could try
[snip]
> So your code would read
>
> try:
>     path = path.encode(sys.getfilesystemencoding() or
>                        sys.getdefaultencoding())
> except UnicodeError:
>     print >>sys.stderr, "Invalid path name", repr(path)
>     sys.exit(1)
This makes sense to me. I'll work on implementing it that way.
Thanks for the in-depth explanation!
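
For reference, here is a sketch of the byte-string approach as a small
helper. It is written for Python 3, where os.listdir() returns byte
strings when given a byte-string path. The names to_bytes_path and
list_raw_names are hypothetical, and the encoding fallback simply
mirrors the snippet quoted above.

```python
import os
import sys


def to_bytes_path(path):
    """Encode a path to bytes; raises UnicodeError if it cannot be encoded."""
    if isinstance(path, bytes):
        return path
    encoding = sys.getfilesystemencoding() or sys.getdefaultencoding()
    return path.encode(encoding)


def list_raw_names(path):
    """List a directory as byte strings, so no entry needs decoding."""
    return sorted(os.listdir(to_bytes_path(path)))
```

Each returned name can be joined back onto the byte-string path with
os.path.join() and opened directly, so decoding is only needed when a
name has to be shown to the user.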
KEN
--
Kenneth J. Pronovici <pronovic at ieee.org>
Personal Homepage: http://www.skyjammer.com/~pronovic/
"They that can give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
- Benjamin Franklin, Historical Review of Pennsylvania, 1759