Newbie question about text encoding

Mark Lawrence breamoreboy at yahoo.co.uk
Sat Mar 7 14:00:47 EST 2015


On 07/03/2015 18:34, Dan Sommers wrote:
> On Sun, 08 Mar 2015 05:13:09 +1100, Chris Angelico wrote:
>
>> On Sun, Mar 8, 2015 at 5:02 AM, Dan Sommers <dan at tombstonezero.net> wrote:
>>> On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:
>>>
>>>> On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
>>>
>>>>> Correct. Linux pathnames are octet strings regardless of the locale.
>>>>>
>>>>> That's why Linux developers should refer to filenames using bytes.
>>>>> Unfortunately, Python itself violates that principle by having
>>>>> os.listdir() return str objects (to mention one example).
>>>>
>>>> Only because you gave it a str with the path name. If you want to
>>>> refer to file names using bytes, then be consistent and refer to ALL
>>>> file names using bytes. As I demonstrated, that works just fine.
>>>
>>> Python 3.4.2 (default, Oct  8 2014, 10:45:20)
>>> [GCC 4.9.1] on linux
>>> Type "help", "copyright", "credits" or "license" for more information.
>>> >>> import os
>>> >>> type(os.listdir(os.curdir)[0])
>>> <class 'str'>
>>
>> Help on module os:
>>
>> DESCRIPTION
>>      This exports:
>>        - os.curdir is a string representing the current directory ('.' or ':')
>>        - os.pardir is a string representing the parent directory ('..' or '::')
>>
>> Explicitly documented as strings. If you want to work with strings,
>> work with strings. If you want to work with bytes, don't use
>> os.curdir, use bytes instead. Personally, I'm happy using strings, but
>> if you want to go down the path of using bytes, you simply have to be
>> consistent, and that probably means being platform-dependent anyway,
>> so just use b"." for the current directory.
>
> I think we're all agreeing:  not all file systems are the same, and
> Python doesn't smooth out all of the bumps, even for something that
> seems as simple as displaying the names of files in a directory.  And
> that's *after* we've agreed that filesystems contain files in
> hierarchical directories.
>
> Dan
>
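
For anyone following along, here is a minimal sketch of the behaviour
discussed above: os.listdir() returns names of the same type as the
path it is given.  The temporary directory and the file name
"example.txt" are made up purely for illustration.

```python
import os
import tempfile

def listdir_both(directory):
    """List a directory twice: once with a str path, once with bytes."""
    as_str = os.listdir(directory)                  # str in   -> list of str
    as_bytes = os.listdir(os.fsencode(directory))   # bytes in -> list of bytes
    return as_str, as_bytes

# Hypothetical scratch directory holding a single, made-up file.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "example.txt"), "w").close()
    names_str, names_bytes = listdir_both(tmp)

print(names_str)    # ['example.txt']
print(names_bytes)  # [b'example.txt']
```

os.fsencode() and os.fsdecode() convert between the two forms using
the filesystem encoding, so os.fsdecode(names_bytes[0]) should give
back names_str[0].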

Isn't pathlib
https://docs.python.org/3/library/pathlib.html#module-pathlib
effectively a more recent attempt at smoothing out, or even removing,
(some of) the bumps?  Has anybody here got any experience of it?  I've
never used it.
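
Going only by the pathlib documentation, a rough sketch of the same
directory listing might look like the following.  As far as I can tell,
Path objects are built from str rather than bytes, and calling bytes()
on a path gives the raw filesystem form, so this is an illustration
under those assumptions, not a claim about how pathlib handles every
awkward file name.  The directory and "example.txt" are made up.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical scratch directory holding a single, made-up file.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "example.txt").touch()

    # iterdir() yields Path objects; .name is the final component as a str.
    entries = sorted(p.name for p in base.iterdir())

    # bytes(path) returns the path encoded the way os.fsencode() would.
    raw_names = sorted(os.path.basename(bytes(p)) for p in base.iterdir())

print(entries)    # ['example.txt']
print(raw_names)  # [b'example.txt']
```

So pathlib appears to stay on the str side of the fence and leave the
bytes view as an explicit conversion.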

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



