[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Koos Zevenhoven k7hoven at gmail.com
Wed Apr 20 04:20:10 EDT 2016


On Wed, Apr 20, 2016 at 6:11 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Koos Zevenhoven writes:
>  > On Tue, Apr 19, 2016 at 2:55 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>  > >
>  > > AFAICS bytes return from __fspath__ is just YAGNI.  Show me something
>  > > that actually wants it.
>  >
>  > It might be,
>
> May I take that as meaning you just jumped to the conclusion that
> extending polymorphism is useful on no actual evidence of usefulness?

No you may not! YAGNI almost never means "you are *never* going to
need it". And if you implement a feature, better implement it well. If
a variation of the feature is rarely used, that is perfectly fine. I
think leaving bytes out would complicate things. If os.fspath does its
job well, everyone should be happy.

I kept bringing up bytes paths, because that is already a feature in
Python 3. Then (already some time ago in these discussions) I briefly
visited the thought of 'can we deprecate bytes paths', and it then
quickly became clear to me that is not going to happen any time soon.

In other words: As long as bytes paths are supported, they should be
supported consistently. I don't want DirEntry to behave differently
when the underlying type is bytes, which is one of the things I've
been talking about all the time. That would just be broken. And as you
also understand, one point is to allow passing DirEntry to open. Or
any of the os.path functions.

An some more: I don't want open(direntry_obj) to ever raise because it
is the bytes flavor of direntry, because, when they are created,
DirEntry objects always point to existing objects on the file system.
I also don't want implicit conversions between str and bytes paths,
because there are cases where they will produce strange results and
exceptions. [Yes, way back in the p-string thread, I did first suggest
a similiar thing that implied implicit conversion, but I soon
abandoned that part.]

Not that I will ever use these features---just to do this right.

>  > but as long as bytes paths are supported polymorphicly all over the
>  > stdlib, we won't get rid of supporting bytes paths. So are you
>  > proposing to deprecate bytes paths?
>
> You claim "almost always want str", Ethan claims "bias against bytes."
> Sorry, guys, you can't have it both ways.  Either bytes paths are
> discouraged (not "deprecated", not yet), or they aren't.
>
> I say, let's not encourage them.

It's all essentially the same thing:

"almost always want str":
Yes, I still claim this. This is the reason for str (and rejecting
bytes) being the default for third-party code. If we wanted to, we
could even leave bytes support out of the documentation, so no-one
will know about it unless they already deal with bytes paths. However,
I dont think we should do that---we should just strongly discourage
using the bytes version unless there is a reason to, and you know what
you are doing.

"bias against bytes":
I agree with this too. This is in line with making str (and rejecting
bytes) the default for third-party code.

"let's not encourage them":
And I even agree with this, as you may have noticed.

I just don't believe in deliberately making implementations awkward
for the bytes-based paths. Bytes paths already exist, not because of
Python 2 (as you know), but because not all operating systems
guarantee that paths make sense in any encoding, and people may need
to work at that level.

There is no need to make working with bytes-based paths awkward, and
we can support them with little additional work compared to supporting
str-based rich path objects. The additional work is mostly this
discussion.

> Ie, keep the status quo for bytes,
> and make things better for the preferred str.  Yes, that means
> discouraging bytes relative to str in this context.  That's a Python 3
> principle, one strong enough to justify the huge compatibility break
> involved in making str be Unicode.  That compatibility break has been
> extremely successful in my personal experience as a sometime Python
> teacher and Mailman developer, though the Mercurial developers have a
> different POV.

Yes. Luckily, people are already using str-based paths. We don't need
any more discrete transitions. If linux will start to enforce an
encoding, as Guido and Random832 may be suggesting on python-ideas,
these already obscure bytes paths will slowly fade away.

-Koos


More information about the Python-Dev mailing list