[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Stephen J. Turnbull stephen at xemacs.org
Sun Apr 17 04:03:54 EDT 2016


Nick Coghlan writes:

 > str and bytes aren't going to implement __fspath__ (since they're
 > only *sometimes* path objects), so asking people to call the
 > protocol method directly for any purpose would be a pain.

It *should* be a pain.  People who need bytes should call fsencode,
people who need str should call fsdecode, and Ethan's antipathy checks
for bytes and str, then calls __fspath__ if needed.  Who's left?  Just
the bartender and the janitor, last call was hours ago.  OK, maybe
there are enough clients to make it worthwhile to provide the utility,
but it should be clearly marked as "double opt-in, for experts only
(consenting adults must show proof of insurance)".
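
A minimal sketch of the division of labour described above, for illustration only. The names raw_fspath and DemoPath are hypothetical and are not part of any proposal in this thread; the only real library calls are os.fsencode and os.fsdecode. The helper checks for str and bytes first and consults __fspath__ only when needed, and callers that need one specific type go through fsencode or fsdecode.

```python
import os


def raw_fspath(obj):
    """Hypothetical helper: pass str/bytes through, else ask __fspath__."""
    if isinstance(obj, (str, bytes)):
        return obj
    try:
        # Look the method up on the type, as is usual for protocol methods.
        method = type(obj).__fspath__
    except AttributeError:
        raise TypeError(
            "expected str, bytes, or an object with __fspath__, got "
            + type(obj).__name__
        ) from None
    return method(obj)


class DemoPath:
    """Hypothetical path-like class used only for this example."""

    def __init__(self, text):
        self._text = text

    def __fspath__(self):
        return self._text


as_text = raw_fspath(DemoPath("spam/eggs"))  # polymorphic, experts only
as_bytes = os.fsencode(as_text)              # caller needs bytes
as_text_again = os.fsdecode(as_bytes)        # caller needs str
```

Note that the polymorphic helper makes no promise about its return type; only the fsencode and fsdecode calls do.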

The functionality of raising on wrong types can be incorporated in
fsencode and fsdecode, but I think there's still some discussion
needed about the conditions for raising, and what flags are needed.

Of course with this reinterpretation, names like "fs_ensure_str" and
"fs_ensure_bytes" might be more appropriate (much as y'all hate
putting types in function names, in this case I think that's best).
But there's backward compatibility to consider, and the existing names
aren't *that* bad, I guess.

 > You may have missed my email where I agreed os.fspath() itself
 > needs to ensure the output is a str object and throw an exception
 > otherwise.

Presumably it should do the same for bytes when those are desired,
though.  I don't find the "cast to bytes using memoryview" approach
plausible, especially not where I live: if str, very likely some of
the characters will be outside of the latin1 repertoire, and thus the
internal representation will likely be full of NULs, and certainly not
be what the user wants.
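
To illustrate the concern about NULs, here is a rough sketch. It assumes that UTF-16-LE is a fair stand-in for a wide in-memory representation; the actual layout of str objects is an implementation detail and may differ. The path used is made up for the example.

```python
# A made-up path mixing ASCII with characters outside the latin1 repertoire.
path = "/home/\u65e5\u672c\u8a9e.txt"

# Stand-in for a raw wide representation (an assumption, see above).
wide = path.encode("utf-16-le")

# What an explicit, deliberate encoding step would produce instead.
utf8 = path.encode("utf-8")

# Every ASCII character contributes a NUL high byte in the wide form.
nul_count = wide.count(b"\x00")
has_nul_in_utf8 = b"\x00" in utf8
```

Bytes with embedded NULs cannot be used as a file name on common platforms, which is why a raw reinterpretation of the text is unlikely to be what the user wants.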

 > The remaining API design debate relates to whether the polymorphic
 > version should be "os.fspath(obj, allow_bytes=True)" or
 > "os._raw_fspath(obj)" (with Ethan favouring the former, and me the
 > latter).

 > > Et tu, Nick?  "Guarantee"?!  You can't guarantee any such thing
 > > with an implicitly invoked polymorphic API like this one --
 > > unless you consider a crashed program to be in the binary
 > > domain. ;-)
 > 
 > I do, as one of the core changes in design philosophy between
 > Python 2 and 3 is attempting to remove the implicit level shifting
 > between the binary and text domains,

Hey, Reverend, I've been singing those hymns since the early '90s.

 > and instead throw exceptions in those cases.

Then I don't understand the current design of fsdecode and fsencode.
Shouldn't they raise on str and bytes respectively, rather than
passing them through?  In general, I would expect that something
that's explicitly intended to be polymorphic would be documented as
such, and the *caller* would be responsible for type-checking and
raising if it got the wrong thing.
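
The pass-through behaviour in question can be seen directly, and a stricter variant is easy to sketch. The functions strict_fsdecode and strict_fsencode below are hypothetical names invented for this example, not existing or proposed APIs; only os.fsdecode and os.fsencode are real.

```python
import os

# Current behaviour: an argument already of the target type is returned as-is.
same_text = os.fsdecode("spam.txt")
same_bytes = os.fsencode(b"spam.txt")


def strict_fsdecode(name):
    """Hypothetical variant: decode bytes only, raise on anything else."""
    if not isinstance(name, bytes):
        raise TypeError("expected bytes, got " + type(name).__name__)
    return os.fsdecode(name)


def strict_fsencode(name):
    """Hypothetical variant: encode str only, raise on anything else."""
    if not isinstance(name, str):
        raise TypeError("expected str, got " + type(name).__name__)
    return os.fsencode(name)
```

With the strict variants, passing text to the decoder or bytes to the encoder fails immediately instead of silently crossing between the two domains.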

Steve

