[Python-ideas] Type hinting for path-related functions

Koos Zevenhoven k7hoven at gmail.com
Fri May 13 16:30:18 EDT 2016


It turns out it has been almost a month since this, and the PEP draft
is already looking good. It seems we might now be ready to discuss it.
Should we add the generic type FSPath[str]?

Again, there is a naming issue, and the question of including plain
str and bytes.

We'll need to address this, unless we want the type checker to not
know whether os.path.* etc. return str or bytes and to carry around
Union[str, bytes]. In theory, it would be possible to infer whether it
is str or bytes, as described in my original message, quoted below.
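
To make the difference concrete, here is a minimal sketch of the two
annotation styles. The names dirname_union and dirname_typed are
hypothetical stand-ins, not stdlib functions; both simply delegate to
os.path.dirname, so only the annotations differ.

```python
import os.path
from typing import TypeVar, Union

# Constrained type variable: stands for str or for bytes, never a mix.
pathstring = TypeVar('pathstring', str, bytes)


def dirname_union(p: Union[str, bytes]) -> Union[str, bytes]:
    # Hypothetical wrapper: the annotation only promises "str or bytes".
    return os.path.dirname(p)


def dirname_typed(p: pathstring) -> pathstring:
    # Hypothetical wrapper: the return type follows the argument type.
    return os.path.dirname(p)


a = dirname_union('/usr/lib/python')   # annotated result: Union[str, bytes]
b = dirname_typed('/usr/lib/python')   # annotated result: str
c = dirname_typed(b'/usr/lib/python')  # annotated result: bytes
```

With the Union version, every caller would have to narrow the result
again before using str-only or bytes-only operations on it.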

-- Koos

On Tue, Apr 19, 2016 at 3:40 AM, Koos Zevenhoven <k7hoven at gmail.com> wrote:
> I actually proposed this already in one of the pathlib threads on
> python-dev, but I decided to repost here, because this is easily seen
> as a separate issue. I'll start with some introduction, then move on
> to the actual type hinting part.
>
> Our seemingly never-ending discussions about pathlib support in the
> stdlib, in various threads, first here on python-ideas, then even more
> extensively on python-dev, have perhaps almost converged. The required
> changes involve a protocol method, probably named __fspath__, which
> any path-like type could implement to return a more, let's say,
> "classical" path object such as a str. However, the protocol is
> polymorphic and may also return bytes, which has a lot to do with the
> fact that the stdlib itself is polymorphic and currently accepts str as
> well as bytes paths almost everywhere, including the newly-introduced
> os.scandir + DirEntry combination. The upcoming improvements will
> further allow passing pathlib path objects as well as DirEntry objects
> to any stdlib function that takes paths.
>
> It came up, for instance here [1], that the function associated with
> the protocol, potentially named os.fspath, will end up needing type
> hints. This function takes pathlike objects and turns them into str or
> bytes. There are various different scenarios [2] that can be
> considered for code dealing with paths, but let's consider the case of
> os.path.* and other traditional Python path-related functions.
>
> Some examples:
>
> os.path.join
>
> Currently, it takes str or bytes paths and returns a joined path of
> the same type (mixing different types raises an exception).
>
> In the future, it will also accept pathlib objects (underlying type
> always str) and DirEntry (underlying type str or bytes) or third-party
> path objects (underlying type str or bytes). The function will then
> return a pathname of the underlying type.
>
> os.path.dirname
>
> Currently, it takes a str or bytes and returns the dirname of the same type.
> In the future, it will also accept Path and DirEntry and return the
> underlying type.
>
> Let's consider the type hint of os.path.dirname at present and in the future:
>
> Currently, one could write
>
> def dirname(p: Union[str, bytes]) -> Union[str, bytes]:
>     ...
>
> While this is valid, it could be more precise:
>
> pathstring = typing.TypeVar('pathstring', str, bytes)
>
> def dirname(p: pathstring) -> pathstring:
>     ...
>
> This now contains the information that the return type is the same as
> the argument type. The name 'pathstring' may be considered slightly
> misleading because "byte strings" are not actually strings in Python
> 3, but at least it does not advertise the use of bytes as paths, which
> is very rarely desirable.
>
> But what about the future? There are two kinds of rich path objects,
> those with an underlying type of str and those with an underlying type
> of bytes. These should implement the __fspath__() protocol and return
> their underlying type. However, we do care about what (underlying)
> type is provided by the protocol, so we might want to introduce
> something like typing.FSPath[underlying_type]:
>
> FSPath[str]    # str-based pathlike, including str
> FSPath[bytes]  # bytes-based pathlike, including bytes
>
> And now, using the above defined TypeVar pathstring, the future
> version of dirname would be type annotated as follows:
>
> def dirname(p: FSPath[pathstring]) -> pathstring:
>     ...
>
> It's getting late. I hope this made sense :).
>
> -Koos
>
> [1] https://mail.python.org/pipermail/python-dev/2016-April/144246.html
> [2] https://mail.python.org/pipermail/python-dev/2016-April/144239.html
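
One way the FSPath[pathstring] annotation from the quoted message could
be sketched, assuming a structural protocol type such as
typing.Protocol (Python 3.8 and later). The names HasFSPath and
BytesPath are hypothetical, and this definition of FSPath is only an
assumption about how the proposed type might be built, not an actual
typing API. The local dirname is likewise an illustrative wrapper
around os.path.dirname.

```python
import os.path
from typing import Protocol, TypeVar, Union

pathstring = TypeVar('pathstring', str, bytes)
_T_co = TypeVar('_T_co', str, bytes, covariant=True)


class HasFSPath(Protocol[_T_co]):
    # Hypothetical protocol: any object whose __fspath__() returns
    # its underlying type, str or bytes.
    def __fspath__(self) -> _T_co: ...


# Hypothetical alias standing in for the proposed typing.FSPath:
# either the plain underlying type or a rich object wrapping it.
FSPath = Union[pathstring, HasFSPath[pathstring]]


def dirname(p: FSPath[pathstring]) -> pathstring:
    # Unwrap rich path objects to their underlying type first.
    raw = p if isinstance(p, (str, bytes)) else p.__fspath__()
    return os.path.dirname(raw)


class BytesPath:
    # Toy third-party path type with an underlying type of bytes.
    def __init__(self, raw: bytes) -> None:
        self._raw = raw

    def __fspath__(self) -> bytes:
        return self._raw


str_dir = dirname('/tmp/spam/eggs.txt')                # str in, str out
bytes_dir = dirname(BytesPath(b'/tmp/spam/eggs.txt'))  # bytes-based in, bytes out
```

Writing FSPath as a Union with the plain type is one answer to the
question of whether str and bytes themselves should be included; a
protocol-only definition would exclude them, since neither has an
__fspath__ method.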

