[Python-ideas] PEP 428 - object-oriented filesystem paths

Nick Coghlan ncoghlan at gmail.com
Mon Oct 8 12:31:06 CEST 2012


I've said before that I like the general shape of the pathlib API and
that's still the case. It's the only OO API I've seen that's
semantically clean enough for me to support introducing it as "the"
standard path abstraction in the standard library.

However, there are still a few rough edges I would like to see smoothed out :)

On Sat, Oct 6, 2012 at 5:48 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Sat, 6 Oct 2012 11:27:58 +0100
> Paul Moore <p.f.moore at gmail.com> wrote:
>> I agree that's what I thought relative() would be when I first read the name.
>
> You are right, relative() could be removed and replaced with the
> current relative_to() method. I wasn't sure about how these names would
> feel to a native English speaker.

The minor problem is that "relative" on its own is slightly unclear
about whether the invariant involved is "a ==
b.subpath(a.relative(b))" or "b == a.subpath(a.relative(b))"

By including the extra word, the intended meaning becomes crystal
clear: "a == b.subpath(a.relative_to(b))"
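
For illustration, the invariant can be checked against the pathlib
module as it later shipped in the standard library. This is only a
sketch: it assumes the method discussed here as "subpath" corresponds
to joinpath() in the shipped module, and the example paths are made up.

```python
from pathlib import PurePosixPath

a = PurePosixPath("/usr/lib/python")
b = PurePosixPath("/usr")

# "a relative to b": the part of a that lies below b
rel = a.relative_to(b)

# a == b.subpath(a.relative_to(b)), with joinpath() standing in for subpath()
roundtrip = b.joinpath(rel)
```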

However, "a relative to b" is the more natural interpretation, so +1
for using "relative" for the semantics of the method-based equivalent
to the current os.path.relpath(). I agree there's no need for a
shorthand for "a.relative(a.root)".

As the invariants above suggest, I'm also currently -1 on *any* of the
proposed shorthands for "p.subpath(subpath)", *as well as* the use of
"join" as the method name (due to the major difference in semantics
relative to str.join).

All of the shorthands are magical and/or cryptic and save very little
typing over the explicitly named method. As already noted in the PEP,
you can also shorten it manually by saving the bound method to a local
variable.
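
A minimal sketch of that manual shortening, assuming the shipped
pathlib spelling joinpath() in place of the "subpath" name used in
this thread (the local name "sub" and the paths are arbitrary):

```python
from pathlib import PurePosixPath

p = PurePosixPath("/srv/app")

# Bind the method once, then call it through the short local name
sub = p.joinpath

config = sub("etc", "settings.ini")
logs = sub("var", "log")
```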

It's important to remember that you can't readily search for syntactic
characters or common method names to find out what they mean, and
these days that kind of thing should be taken into account when
designing an API. "p.subpath('foo', 'bar')" looks like executable
pseudocode for creating a new path based on an existing one to me,
unlike "p / 'foo' / 'bar'", "p['foo', 'bar']", or "p.join('foo', 'bar')".

The method semantics are obvious by comparison, since they would be
the same as those for ordinary construction: "p.subpath(*args) ==
type(p)(p, *args)"
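
As a rough check of that equivalence, here is a sketch against the
pathlib module as it later shipped, again assuming joinpath() plays
the role of "subpath" (the variable names are illustrative only):

```python
from pathlib import PurePosixPath

p = PurePosixPath("/home/user")
args = ("foo", "bar")

# Named method on an existing path
via_method = p.joinpath(*args)

# Ordinary construction, passing the existing path as the first segment
via_constructor = type(p)(p, *args)
```

Both spellings should produce the same path object, which is what
makes the method's behaviour easy to predict.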

I'm not 100% sold on "subpath" as an alternative (since ".." entries
may mean that the result isn't really a subpath of the original
directory at all), but I do like the way it reads in the absence of
parent directory references, and I definitely like it better than
"join" or "[]" or "/" or "+". This interpretation is also favoured by
the fact that the calculation of relative path references is strict by
default (i.e. it won't insert ".." to make the reference work when the
target isn't a subpath).
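
The strict behaviour can be seen in the shipped pathlib, where
relative_to() raises ValueError by default rather than inserting "..".
A small sketch with made-up paths:

```python
from pathlib import PurePosixPath

target = PurePosixPath("/etc/passwd")
base = PurePosixPath("/usr")

try:
    rel = target.relative_to(base)
except ValueError:
    # /etc/passwd is not below /usr, and no ".." is inserted to compensate
    rel = None
```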

> You can't really add '..' components and expect the result to be
> correct, for example if '/usr/lib' is a symlink to '/lib', then
> '/usr/lib/..' is '/', not '/usr'.
>
> That's why the resolve() method, which resolves symlinks along the path,
> is the only one allowed to muck with '..' components.

This seems too strict for the general case. Configuration files in
bundled applications, for example, often contain paths relative to the
file (e.g. open up a Visual Studio project file). There are no
symlinks involved there. Perhaps a "require_subpath" flag that
defaults to True would be appropriate? Passing "require_subpath=False"
would then provide explicit permission to add ".." entries as
appropriate, and it would be up to the developer to document the "no
symlinks!" restriction on their layout.
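
One way such a flag might behave, sketched as a standalone function.
The name relative_reference() and its signature are hypothetical and
not part of pathlib or the PEP; the lenient branch simply borrows the
purely lexical posixpath.relpath(), so it carries exactly the "no
symlinks!" caveat described above.

```python
import posixpath
from pathlib import PurePosixPath


def relative_reference(target, base, require_subpath=True):
    """Hypothetical helper; not part of pathlib or the PEP."""
    target = PurePosixPath(target)
    base = PurePosixPath(base)
    if require_subpath:
        # Strict: raises ValueError unless target lies below base
        return target.relative_to(base)
    # Lenient: purely lexical, may insert ".." and ignores symlinks
    return PurePosixPath(posixpath.relpath(str(target), str(base)))


# A project file referring to a sibling directory needs the lenient mode
lenient = relative_reference(
    "/proj/shared/lib.h", "/proj/app", require_subpath=False
)
```

With the default require_subpath=True the same call would raise
ValueError, so adding ".." entries is always an explicit opt-in.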

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


