[Python-Dev] When should pathlib stop being provisional?

Wed Apr 6 05:02:09 EDT 2016

On 6 April 2016 at 06:00, Guido van Rossum <guido at python.org> wrote:
> On Tue, Apr 5, 2016 at 9:29 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
>> [...] we can't do:
>>
>>     app_root = Path(...)
>>     config = app_root/'settings.cfg'
>>     with open(config) as blah:
>>         # whatever
>>
>> It feels like instead of addressing this basic disconnect, the answer has
>> instead been:  add that to pathlib!  Which works great -- until a user or a
>> library gets this path object and tries to use something from os on it.
>
> I agree that asking for config.open() isn't the right answer here
> (even if it happens to work). But in this example, once 3.5.2 is out,
> the solution would be to use open(config.path), and that will also
> work when passing it to a library. Is it still unacceptable then?

My sense is that this will remain unacceptable to those people who
have a problem here.

The issue is not so much the ugliness of the code (in spite of the
fact that this is what people focus on) but rather the disconnect
between the mental model people have and the reality of the code they
have to write.

The basic idea behind pathlib.Path objects is that they represent a
*path*. And when you call open, you should pass it a path. So (the
argument goes) why should you have to convert the path you have (a
Path object) to pass it to a function (like open) that requires a path
argument?

Making stdlib functions work with Path objects would fix a lot of the
conceptual difficulties here. And it would also mean that (thanks to
duck typing) a lot of 3rd party code would work without change,
further alleviating the issue. But ultimately, there will still be
code that needs changing to be aware of Path objects. The change is
simple enough (patharg = str(patharg), or the getattr('path')
approach) but it's a change in mental model (this time by library
authors) and the benefit of the change is not sufficiently obvious.

Inheriting from str is the commonly-proposed solution, because in
practical terms it works. But it does so by mixing layers of
abstraction in a way that is difficult to explain to someone who
thinks of a "path" as an abstract object rather than as a (text?
byte?) string. Ultimately, all that's happening is that the burden of
keeping the abstractions separate is placed on the design, rather than
being explicit in the code. But while I have no evidence that this is
a problem, it does leave me with a nagging feeling that it "seems
similar to the bytes/text issue".

My feelings:
- I'd *like* to push for the cleaner separation of abstractions that a
"pure" Path object provides.
- It does need library writers (and in particular the stdlib) to "buy
into" the model and make changes to support Path objects
- I don't have a huge problem with using str(p) or p.path as a
workaround during the transition, but that's from the POV of throwaway
scripting. I'm not sure I'd be so happy using the workaround in code
that would need to be supported for a long time.
- I'd rather compromise on principles than abandon the idea of a
stdlib Path object
- In practical terms, inheriting from str is probably fine. At least
evidence from 3rd party path libraries indicates so.

Paul