[Python-Dev] pathlib - current status of discussions

Brett Cannon brett at python.org
Wed Apr 13 19:09:57 EDT 2016


On Wed, 13 Apr 2016 at 15:20 Victor Stinner <victor.stinner at gmail.com>
wrote:

> Oh, since others voted, I will also vote and explain my vote.
>
> I like choice 1, str only, because it's very well defined. In Python
> 3, Unicode is simply the native type for text. It's accepted by almost
> all functions. In other emails, I also explained that Unicode is fine
> for storing undecodable filenames on UNIX; it has worked as expected
> for many years (since Python 3.3).
>
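To make the undecodable-filename point concrete, here is a minimal sketch
of the surrogateescape round trip. It assumes a UTF-8 filesystem encoding
and spells the codec out explicitly; on UNIX, os.fsdecode() and
os.fsencode() apply the same error handler with the filesystem encoding.
The sample file name is made up.

```python
# Hypothetical file name containing a byte (0xE9) that is not valid UTF-8.
raw_name = b"caf\xe9.txt"

# Decoding with surrogateescape maps the bad byte to a lone surrogate
# (U+DCE9) instead of raising UnicodeDecodeError.
text_name = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same error handler restores the original bytes exactly.
round_tripped = text_name.encode("utf-8", "surrogateescape")
```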
> --
>
> If you cannot survive without bytes, I suggest adding two functions:
> one for str only, another which can return str or bytes.
>
> Maybe you in fact want two protocols: __fspath__ (str only) and
> __fspathb__ (bytes only)? os.fspathb() would first try __fspathb__, or
> fall back to os.fsencode(__fspath__). os.fspath() would first try
> __fspath__, or fall back to os.fsdecode(__fspathb__). IMHO it's not
> worth having such complexity when Unicode handles all use cases.
>
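For concreteness, the two-protocol idea quoted above could look roughly
like the sketch below. The functions are plain module-level stand-ins for
the proposed os.fspath() and os.fspathb(), which are not real os functions
here, and letting plain str (or bytes) pass through unchanged is my
assumption, not part of the proposal.

```python
import os


def fspath(path):
    """Stand-in for the proposed os.fspath(): always returns str."""
    if isinstance(path, str):
        return path  # assumption: plain str passes through unchanged
    cls = type(path)
    if hasattr(cls, "__fspath__"):
        return cls.__fspath__(path)
    if hasattr(cls, "__fspathb__"):
        # Fall back to decoding the bytes-only protocol.
        return os.fsdecode(cls.__fspathb__(path))
    raise TypeError("expected str or a path object, got " + cls.__name__)


def fspathb(path):
    """Stand-in for the proposed os.fspathb(): always returns bytes."""
    if isinstance(path, bytes):
        return path  # assumption: plain bytes passes through unchanged
    cls = type(path)
    if hasattr(cls, "__fspathb__"):
        return cls.__fspathb__(path)
    if hasattr(cls, "__fspath__"):
        # Fall back to encoding the str-only protocol.
        return os.fsencode(cls.__fspath__(path))
    raise TypeError("expected bytes or a path object, got " + cls.__name__)


class StrPath:
    """Hypothetical path type implementing only the str protocol."""

    def __init__(self, value):
        self.value = value

    def __fspath__(self):
        return self.value


class BytesPath:
    """Hypothetical path type implementing only the bytes protocol."""

    def __init__(self, value):
        self.value = value

    def __fspathb__(self):
        return self.value
```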

Implementing two magic methods for this seems like overkill. The most I
would be willing to do for automatic encoding/decoding is to call
os.fsencode()/os.fsdecode() on the argument or on what __fspath__() returned.
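
Roughly, that single-protocol alternative might look like the sketch
below. The helper names fspath_as_str() and fspath_as_bytes() are
hypothetical and only illustrate the idea; none of this is a settled API.

```python
import os


def _raw_path(path):
    """Return the str or bytes behind `path`, consulting __fspath__ if needed."""
    if isinstance(path, (str, bytes)):
        return path
    method = getattr(type(path), "__fspath__", None)
    if method is None:
        raise TypeError(
            "expected str, bytes or a path object, got " + type(path).__name__
        )
    return method(path)


def fspath_as_str(path):
    """Hypothetical helper: always return str, decoding bytes if necessary."""
    return os.fsdecode(_raw_path(path))


def fspath_as_bytes(path):
    """Hypothetical helper: always return bytes, encoding str if necessary."""
    return os.fsencode(_raw_path(path))


class DemoPath:
    """Made-up path type whose __fspath__ returns whatever it was given."""

    def __init__(self, value):
        self.value = value

    def __fspath__(self):
        return self.value
```

One magic method stays enough because the coercion happens in the caller,
whichever type __fspath__() happens to return.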


>
> Or do you know functions implemented in Python accepting str *and* bytes?
>

Nothing that does so on purpose comes to mind off the top of my head.


>
> --
>
> The C implementation of the os module has an important
> path_converter() function:
>
>  * path_converter accepts (Unicode) strings and their
>  * subclasses, and bytes and their subclasses.  What
>  * it does with the argument depends on the platform:
>  *
>  *   * On Windows, if we get a (Unicode) string we
>  *     extract the wchar_t * and return it; if we get
>  *     bytes we extract the char * and return that.
>  *
>  *   * On all other platforms, strings are encoded
>  *     to bytes using PyUnicode_FSConverter, then we
>  *     extract the char * from the bytes object and
>  *     return that.
>
> This function will implement something like os.fspath().
>
> With os.fspath() only accepting str, we will return the Unicode
> string directly on Windows. On UNIX, Unicode will be encoded, as is
> already done for Unicode strings.
>
> This specific function would benefit from flavor 4 (os.fspath() can
> return str and bytes), but it's more an exception than the rule. It
> would be more of a micro-optimization than a good reason to drive the
> API design.
>
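As a rough Python rendering of the behaviour that C comment describes
(the real path_converter() is C code inside the os module; the function
name and the `platform` parameter below are made up for illustration):

```python
import os
import sys


def path_converter_sketch(path, platform=sys.platform):
    """Illustrative only: mimic what the quoted C comment says happens."""
    if not isinstance(path, (str, bytes)):
        raise TypeError("expected str or bytes, got " + type(path).__name__)
    if platform == "win32":
        # Windows: str is handed over as wchar_t *, bytes as char *,
        # so neither needs converting at this level.
        return path
    if isinstance(path, str):
        # Other platforms: str is encoded to bytes first
        # (PyUnicode_FSConverter in C, os.fsencode() here).
        return os.fsencode(path)
    return path
```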

Yep, it's interesting to know but Chris and I won't let it drive the
decision (I assume).

-Brett


>
> Victor
>
> On Wednesday, 13 April 2016, Brett Cannon <brett at python.org> wrote:
> >
> > https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1
> > has the four potential approaches implemented (although it doesn't follow
> > the "separate functions" approach some are proposing and instead goes with
> > the allow_bytes approach I originally proposed).
>