[Python-ideas] Working with Path objects: p-strings?

Michel Desmoulin desmoulinmichel at gmail.com
Sat Mar 26 10:03:16 EDT 2016



On 26/03/2016 13:43, Alexander Walters wrote:
> I have been watching this thread, and read the docs and the PEP, and...
> 
> What is pathlib even for?
> 
> Yes, it's there to be an object oriented abstraction of a file system. 
> Why?  Why do I want this?

Because it makes a very common task (FS manipulation) much more natural.

This is the same reason people prefer requests over urllib, @decorator
over func = decorator(func), or unpacking over manual item extraction.

Say you have a directory of music files in which you converted some mp3
files to other formats. Now you want to remove those conversions. Your
task is to find all files that have the same name as one of the mp3
files but a different extension, remove them, then list the absolute
paths of the mp3 files in a text file.

This example bakes in a lot of the tasks you do when Python is the
scripting language at the core of your job: when it's your glue
language, when you are a tester, or for sysadmin tasks. But it also
helps everybody who once in a while writes a one-off script.

The traditional approach:

import os
import glob
import sys

root = os.path.abspath(sys.argv[1])

playlist = os.path.join(root, 'playlist.m3u8')

with open(playlist, 'w') as f:

    for path in glob.glob(os.path.join(root, '*.mp3')):

        name, ext = os.path.splitext(os.path.basename(path))

        for to_remove in glob.glob(os.path.join(root, name + '.*')):
            if not to_remove.endswith('.mp3'):
                os.remove(to_remove)

        f.write(os.path.join(root, path) + "\n")

Now with pathlib you don't have to wonder whether the feature you are
looking for lives in "os", "os.path", "glob" or "open". You don't have
to deal with opening the file yourself for such a small script. You
don't have to switch between functions and methods all the time, or
choose between nested function calls and intermediary variables.

The pathlib version is way easier to figure out without knowing the
stdlib by heart; it is one line shorter and has one less level of
indentation. And you can discover it all from IPython with ".<tab>":

import sys
import pathlib

root = pathlib.Path(sys.argv[1])

files = []
for path in root.glob('*.mp3'):

    # The pattern must be relative to root, so build it from the bare stem.
    name = path.stem + '.*'
    files.append(str(path.absolute()))

    for to_remove in root.glob(name):
        if to_remove.suffix != ".mp3":
            to_remove.unlink()

(root / 'playlist.m3u8').write_text('\n'.join(files))

And this is true even though pathlib is a half-baked lib, since the
competition (which existed before pathlib, and which pathlib failed to
take inspiration from) can do it even shorter, easier and cleaner:

import sys
import path

root = path.Path(sys.argv[1]).realpath()

files = []
for p in root.glob('*.mp3'):

    name = p.namebase + '.*'
    files.append(p)

    for to_remove in root.glob(name):
        if to_remove.ext != ".mp3":
            to_remove.remove()

(root / 'playlist.m3u8').write_lines(files)

Because path.py:

- inherits from str
- has all the methods from os, not just a few cherry-picked ones
- has logical names for attributes and methods
- has more utilities than pathlib or os
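
To make the first point concrete, here is a minimal sketch of what
inheriting from str buys you. StrPath is a made-up class for
illustration only; it is not how path.py is actually implemented:

```python
import os.path


class StrPath(str):
    """Hypothetical str subclass, NOT the real path.py class."""

    def __truediv__(self, other):
        # Support the "/" join syntax and keep returning StrPath objects.
        return StrPath(os.path.join(self, other))

    @property
    def ext(self):
        # os.path only sees a string, so it works without any conversion.
        return os.path.splitext(self)[1]


song = StrPath("music") / "track.mp3"

# Any API written for plain strings accepts the object unchanged.
print(isinstance(song, str))
print(os.path.basename(song))
print(song.ext)
```

Nothing has to be converted back and forth: the object goes straight
into os.path, open() or any third-party function that expects a str.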

So yes, if you do a lot of scripting, this is a must. It's also way
easier for beginners to grasp.




> Am I alone in wondering why this exists?  Is
> it even worth improving?
> 
> On 3/26/2016 06:59, Andrew Barnert via Python-ideas wrote:
>> On Mar 25, 2016, at 13:20, Koos Zevenhoven <k7hoven at gmail.com> wrote:
>>> So, let's start a new thread about how to deal with pathlib.Path
>>> objects, involving how/whether to convert between str and Path
>> As a different point:
>>
>> If we _don't_ either subclass str or find some way to make things
>> magically work, is there any other way to start getting more uptake on
>> pathlib? This may only be anecdotal, but from what I can tell, nobody
>> is using it, because everyone who tries is put off by the need to
>> convert back and forth everywhere.
>>
>> Last year, everyone agreed that it would be good if at least the
>> stdlib accepted paths everywhere, which might prompt third party libs
>> to start doing the same. But nobody's started writing patches to do
>> that. And I'm not sure we'd want it to happen in an uncoordinated way
>> anyway, since there are at least four different ways to do it, and if
>> we pick among them arbitrarily for different parts of the stdlib,
>> we'll have a huge mess, and possibly behavioral inconsistencies.
>>
>> The four ways I can think of are (in every function that currently
>> takes a "path: str" argument, and should now take a "path: str | Path"):
>>
>>   * path = str(path)
>>   * path = Path(path)
>>   * if isinstance(path, Path): ... else: ...
>>   * try: f = path.open('w') except AttributeError: f = open(path, 'w')
>>
>> It's also worth noting that all but the first require every module to
>> depend on Path. Including C modules. And modules in the bootstrap. But
>> the first version makes a bad guide for third-party code, because a
>> lot of third-party code is dual-version/single-source libs, and you
>> definitely don't want to unconditionally call str on a path argument
>> that may be Unicode in 2.7 (or to unconditionally call a six-style
>> Unicode function on a path argument that may be str in 2.7 on Linux).
>>
>> So, assuming we all want a future in which pathlib is actually used by
>> people, and assuming str subclassing is out and there's no other magic
>> bullet, how do we get there from here?
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
> 
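
For reference, the four options Andrew lists above could look roughly
like this. The function names are made up for illustration, and each
one is only a sketch of the approach, not a proposal for how the stdlib
should implement it:

```python
import pathlib
import tempfile


def via_str(path):
    # 1. Normalise to str at the boundary.
    return str(path)


def via_path(path):
    # 2. Normalise to Path at the boundary.
    return pathlib.Path(path)


def via_isinstance(path):
    # 3. Branch explicitly on the type.
    if isinstance(path, pathlib.Path):
        return "path object"
    return "plain string"


def via_duck_typing(path, mode="w"):
    # 4. Try the Path method first, fall back to the builtin open().
    try:
        return path.open(mode)
    except AttributeError:
        return open(path, mode)


# Each helper accepts both a Path and a plain string.
with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "demo.txt"
    for candidate in (target, str(target)):
        assert via_str(candidate) == str(target)
        assert via_path(candidate) == target
        with via_duck_typing(candidate) as f:
            f.write("ok")
```

Only the first helper avoids importing pathlib, which is the dependency
problem Andrew describes for C modules and the bootstrap.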

