Newbie question about text encoding

Dan Sommers dan at tombstonezero.net
Sat Mar 7 13:34:27 EST 2015


On Sun, 08 Mar 2015 05:13:09 +1100, Chris Angelico wrote:

> On Sun, Mar 8, 2015 at 5:02 AM, Dan Sommers <dan at tombstonezero.net> wrote:
>> On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:
>>
>>> On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
>>
>>>> Correct. Linux pathnames are octet strings regardless of the locale.
>>>>
>>>> That's why Linux developers should refer to filenames using bytes.
>>>> Unfortunately, Python itself violates that principle by having
>>>> os.listdir() return str objects (to mention one example).
>>>
>>> Only because you gave it a str with the path name. If you want to
>>> refer to file names using bytes, then be consistent and refer to ALL
>>> file names using bytes. As I demonstrated, that works just fine.
>>
>> Python 3.4.2 (default, Oct  8 2014, 10:45:20)
>> [GCC 4.9.1] on linux
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import os
>> >>> type(os.listdir(os.curdir)[0])
>> <class 'str'>
> 
> Help on module os:
> 
> DESCRIPTION
>     This exports:
>       - os.curdir is a string representing the current directory ('.' or ':')
>       - os.pardir is a string representing the parent directory ('..' or '::')
> 
> Explicitly documented as strings. If you want to work with strings,
> work with strings. If you want to work with bytes, don't use
> os.curdir, use bytes instead. Personally, I'm happy using strings, but
> if you want to go down the path of using bytes, you simply have to be
> consistent, and that probably means being platform-dependent anyway,
> so just use b"." for the current directory.

I think we're all agreeing:  not all filesystems are the same, and
Python doesn't smooth out all of the bumps, even for something that
seems as simple as displaying the names of files in a directory.  And
that's *after* we've agreed that filesystems contain files in
hierarchical directories.
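
For the record, here is a minimal sketch of the str-in/str-out,
bytes-in/bytes-out behavior discussed above.  It assumes Python 3 and
a throwaway temporary directory; the helper listdir_types and the file
name example.txt are made up for illustration only.

```python
import os
import tempfile


def listdir_types(path):
    """Return the set of type names that os.listdir() yields for path."""
    return {type(name).__name__ for name in os.listdir(path)}


with tempfile.TemporaryDirectory() as tmp:
    # Create a single (hypothetical) file so the listing is not empty.
    with open(os.path.join(tmp, "example.txt"), "w"):
        pass

    # A str path yields str names.
    str_result = listdir_types(tmp)

    # A bytes path yields bytes names; os.fsencode() converts the
    # str path to bytes using the filesystem encoding.
    bytes_result = listdir_types(os.fsencode(tmp))

print(str_result)
print(bytes_result)
```

So os.listdir(os.curdir) returns str only because os.curdir is itself
a str; passing b"." instead gives back bytes.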

Dan


