[Python-ideas] Dunder method to make object str-like

Nick Coghlan ncoghlan at gmail.com
Sat Apr 9 01:00:07 EDT 2016


On 8 April 2016 at 04:48, Terry Reedy <tjreedy at udel.edu> wrote:
> To me, the default proposal to expand the domain of open and other path
> functions is to call str on the path arg, either always or as needed. We
> should then ask "why isn't str() good enough"?  Most bad args for open will
> immediately result in a file-not-found exception.

Not when you're *creating* files and directories.

Using "str(path)" as the protocol means these all become valid operations:

  open(1.0, "w")
  open(object, "w")
  open(object(), "w")
  open(str, "w")
  open(input, "w")

Everything implements __str__ or __repr__, so *everything* becomes
acceptable as an argument to filesystem-mutating operations, instead
of those operations bailing out immediately and complaining that
they've been asked to do something that doesn't make any sense.
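
To make that concrete without actually creating any files, here's a
small sketch that only calls str() on the same arguments. The variable
names are mine, purely for illustration:

```python
# None of this touches the filesystem: it only shows the text that a
# "just call str(path)" protocol would hand to open() as a filename.
candidates = [1.0, object, str, input]
as_paths = [str(obj) for obj in candidates]

# An arbitrary instance converts as well; its text includes a memory address.
instance_path = str(object())

for obj, text in zip(candidates, as_paths):
    print(f"{obj!r} would become the filename {text!r}")
print(f"object() would become the filename {instance_path!r}")
```

Every one of those conversions succeeds, which is exactly the problem:
there's no point at which the operation can say "this isn't a path".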

I strongly encourage folks interested in the fspath protocol design
debate to read the __index__ PEP:
https://www.python.org/dev/peps/pep-0357/

Start from the title: "Allowing Any Object to be Used for Slicing"

The protocol wasn't designed in the abstract: it had a concrete goal
of allowing objects other than builtins to be usable in the "x:y:z"
slicing syntax.

Those objects weren't hypothetical either. The rationale spells them out:

"In NumPy, for example, there are 8 different integer scalars
corresponding to unsigned and signed integers of 8, 16, 32, and 64
bits.  These type-objects could reasonably be used as integers in many
places where Python expects true integers but cannot inherit from the
Python integer type because of incompatible memory layouts.  There
should be some way to be able to tell Python that an object can behave
like an integer."

The PEP also spells out what's wrong with the "just use int(obj)" alternative:

"It is not possible to use the nb_int (and __int__ special method) for
this purpose because that method is used to *coerce* objects to
integers.  It would be inappropriate to allow every object that can be
coerced to an integer to be used as an integer everywhere Python
expects a true integer.  For example, if __int__ were used to convert
an object to an integer in slicing, then float objects would be
allowed in slicing and x[3.2:5.8] would not raise an error as it
should."
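
The difference the PEP describes can be seen directly with int() and
operator.index(). This is just a minimal sketch; the try_call helper is
my own and not something from the PEP:

```python
import operator


def try_call(func, value):
    """Return func(value), or the TypeError it raised."""
    try:
        return func(value)
    except TypeError as exc:
        return exc


# int() coerces: the float is silently truncated.
coerced = int(3.2)

# operator.index() refuses: float doesn't implement __index__.
strict = try_call(operator.index, 3.2)

# Slicing relies on __index__, so float bounds are rejected as well.
try:
    list(range(10))[3.2:5.8]
    slice_rejected = False
except TypeError:
    slice_rejected = True

print(coerced, type(strict).__name__, slice_rejected)
```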

Extending the use of the protocol to other contexts (such as sequence
repetition and optimised lookups on range objects) was then taken up
on a case by case basis, but the protocol semantics themselves were
defined by that original use case of "allow NumPy integers to be used
when slicing sequences".
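
As a rough illustration of how one protocol ends up serving several
contexts, here's a sketch using a made-up SmallInt class (a stand-in
for something like a NumPy integer scalar, not a real type from any
library):

```python
import operator


class SmallInt:
    """Hypothetical integer-like type that does not inherit from int."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # Must return a true int; this is what marks the object as
        # "really an integer" rather than merely coercible to one.
        return self._value


two, five = SmallInt(2), SmallInt(5)
letters = list("abcdefgh")

sliced = letters[two:five]        # the original motivating use case
repeated = [0] * two              # sequence repetition
looked_up = range(10, 20)[two]    # indexing into a range object
as_int = operator.index(two)      # the explicit conversion function

print(sliced, repeated, looked_up, as_int)
```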

The equivalent motivating use case here is "allow pathlib objects to
be used with the open() builtin, os module functions, and os.path
module functions".

The open() builtin handles str paths, integers (file descriptors), and
bytes-like objects (pre-encoded paths).
The os and os.path functions handle some combination of those three,
depending on the specific function.
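
For anyone who hasn't used all three forms, here's a small sketch that
reads the same temporary file through each of them. The function name
and file name are made up for the example:

```python
import os
import tempfile


def read_three_ways(text="hello"):
    """Write text to a temporary file, then read it back via a str path,
    a bytes path, and a file descriptor."""
    with tempfile.TemporaryDirectory() as tmpdir:
        str_path = os.path.join(tmpdir, "example.txt")
        with open(str_path, "w") as f:
            f.write(text)

        # 1. Text path
        with open(str_path) as f:
            via_str = f.read()

        # 2. Pre-encoded (bytes) path
        with open(os.fsencode(str_path)) as f:
            via_bytes = f.read()

        # 3. Integer file descriptor; open() closes it when the file closes
        fd = os.open(str_path, os.O_RDONLY)
        with open(fd) as f:
            via_fd = f.read()

    return via_str, via_bytes, via_fd


if __name__ == "__main__":
    print(read_three_ways())
```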

Working directly with file descriptors is relatively rare, so we can
leave that as a special case.
Similarly, working directly with bytes-like objects introduces
cross-platform portability problems, and also changes the output type
of many operations, so we'll keep that as a special case, too.
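
The "changes the output type" point is easy to overlook, so here's a
quick sketch using os.path (the path names are arbitrary examples):

```python
import os.path

# Text in, text out.
text_name = os.path.basename("logs/app.txt")

# Bytes in, bytes out: the result type follows the argument type.
bytes_name = os.path.basename(b"logs/app.txt")

# Mixing the two forms in a single call is rejected outright.
try:
    os.path.join("logs", b"app.txt")
    mixed_rejected = False
except TypeError:
    mixed_rejected = True

print(repr(text_name), repr(bytes_name), mixed_rejected)
```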

That leaves the text representation, and the question of defining
equivalents to "operator.index" and its underlying __index__ protocol.

My suggestion of os.fspath as the conversion function is based on:

- "path" being too generic (we have sys.path, os.path, and the PATH
envvar as potential sources of confusion)
- "fspath" being similar to "os.fsencode" and "os.fsdecode", which are
the operations for converting a filesystem path in text form to and
from its bytes-like object form
- os and os.path being two of the main consumers of the proposed protocol
- os being a standard library module that underpins most filesystem
operations anyway, so folks shouldn't be averse to importing it in code
that wants to consume the new protocol
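
To show the shape of what I mean, here's a rough pure-Python sketch of
such a conversion function. Everything in it is an assumption for
illustration: the function name fspath_sketch, the dunder name
__fspath__, and the DemoPath class are placeholders, not settled design:

```python
def fspath_sketch(path):
    """Return the text form of a path, or raise TypeError.

    Illustrative only: str passes through unchanged, objects that opt in
    via a (hypothetical) __fspath__ method are converted, and everything
    else is rejected instead of being coerced with str().
    """
    if isinstance(path, str):
        return path
    # Look the method up on the type, as is done for other dunder methods.
    converter = getattr(type(path), "__fspath__", None)
    if converter is None:
        raise TypeError(
            f"expected str or path-like object, not {type(path).__name__}"
        )
    result = converter(path)
    if not isinstance(result, str):
        raise TypeError("__fspath__ must return str in this sketch")
    return result


class DemoPath:
    """Hypothetical stand-in for a pathlib-style object."""

    def __init__(self, *parts):
        self._parts = parts

    def __fspath__(self):
        return "/".join(self._parts)


print(fspath_sketch("a/b"))
print(fspath_sketch(DemoPath("a", "b")))
```

With this, the five open() examples from the start of this message all
fail with TypeError, the same way operator.index(3.2) does.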

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

