[Python-Dev] pathlib - current status of discussions

Victor Stinner victor.stinner at gmail.com
Wed Apr 13 18:19:42 EDT 2016


Oh, since others voted, I will also vote and explain my vote.

I like choice 1, str only, because it's very well defined. In Python
3, Unicode is simply the native type for text. It's accepted by almost
all functions. In other emails, I also explained that Unicode is fine
for storing undecodable filenames on UNIX; it has worked as expected
for many years (since Python 3.3).
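
To illustrate the point about undecodable filenames, here is a minimal sketch of the round trip through str. It assumes a UTF-8 filesystem encoding and spells out the "surrogateescape" error handler explicitly so that it behaves the same everywhere; on UNIX, os.fsdecode() and os.fsencode() apply this handler for you. The filename is made up for the example.

```python
# A filename containing a byte (0xE9) that is not valid UTF-8.
raw_name = b"caf\xe9.txt"

# Decoding with surrogateescape maps the bad byte to a lone surrogate
# (U+DCE9) instead of raising UnicodeDecodeError.
text_name = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly,
# so nothing is lost by carrying the name around as str.
round_tripped = text_name.encode("utf-8", "surrogateescape")
```

The str value is not printable as-is (a strict encoder rejects the lone surrogate), but it can be passed back to the os functions unchanged.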

--

If you cannot survive without bytes, I suggest adding two functions:
one for str only, another which can return str or bytes.

Maybe you in fact want two protocols: __fspath__ (str only) and
__fspathb__ (bytes only)? os.fspathb() would first try __fspathb__, or
fall back to os.fsencode(__fspath__). os.fspath() would first try
__fspath__, or fall back to os.fsdecode(__fspathb__). IMHO it's not
worth having such complexity while Unicode handles all use cases.
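
The two-protocol idea could be sketched in pure Python roughly as follows. This is only an illustration of the lookup order described above: the module-level fspath() and fspathb() stand in for the proposed os functions, __fspathb__ is a hypothetical method that exists only in this discussion, the pass-through of plain str and bytes is my own assumption, and StrPath and BytesPath are made-up example classes.

```python
import os


def fspath(path):
    """Hypothetical str-only accessor: always returns str."""
    if isinstance(path, str):  # assumption: plain str passes through
        return path
    cls = type(path)
    if hasattr(cls, "__fspath__"):
        return cls.__fspath__(path)
    if hasattr(cls, "__fspathb__"):
        # Fall back to decoding the bytes form.
        return os.fsdecode(cls.__fspathb__(path))
    raise TypeError("expected str or a path object, got " + cls.__name__)


def fspathb(path):
    """Hypothetical bytes-only accessor: always returns bytes."""
    if isinstance(path, bytes):  # assumption: plain bytes pass through
        return path
    cls = type(path)
    if hasattr(cls, "__fspathb__"):
        return cls.__fspathb__(path)
    if hasattr(cls, "__fspath__"):
        # Fall back to encoding the str form.
        return os.fsencode(cls.__fspath__(path))
    raise TypeError("expected bytes or a path object, got " + cls.__name__)


class StrPath:
    """Example object implementing only the str protocol."""

    def __fspath__(self):
        return "data/report.txt"


class BytesPath:
    """Example object implementing only the bytes protocol."""

    def __fspathb__(self):
        return b"data/report.txt"
```

Every path-like class and every consumer would have to know about both methods, which is the complexity I am arguing against.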

Or do you know functions implemented in Python accepting str *and* bytes?

--

The C implementation of the os module has an important
path_converter() function:

 * path_converter accepts (Unicode) strings and their
 * subclasses, and bytes and their subclasses.  What
 * it does with the argument depends on the platform:
 *
 *   * On Windows, if we get a (Unicode) string we
 *     extract the wchar_t * and return it; if we get
 *     bytes we extract the char * and return that.
 *
 *   * On all other platforms, strings are encoded
 *     to bytes using PyUnicode_FSConverter, then we
 *     extract the char * from the bytes object and
 *     return that.

This function will implement something like os.fspath().

With os.fspath() only accepting str, we will directly return the
Unicode string on Windows. On UNIX, Unicode will be encoded, as is
already done for Unicode strings.
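
As a rough Python model of the platform split described in the quoted comment (not the real C code, which works on wchar_t * and char * buffers), the behavior looks something like this. The name convert_path() is made up for the example.

```python
import os
import sys


def convert_path(path):
    """Toy model of what the quoted path_converter comment describes."""
    if not isinstance(path, (str, bytes)):
        raise TypeError("path must be str or bytes, not "
                        + type(path).__name__)
    if sys.platform == "win32":
        # Windows: str is handed to the wide-character API as-is,
        # bytes are handed over as-is too.
        return path
    if isinstance(path, str):
        # Other platforms: str is encoded with the filesystem encoding.
        return os.fsencode(path)
    return path
```

With a str-only os.fspath() in front of it, only the str branches would ever be reached for path objects.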

This specific function would benefit from flavor 4 (os.fspath() can
return str and bytes), but it's more an exception than the rule. It
would be more of a micro-optimization than a good reason to drive the
API design.

Victor

On Wednesday, April 13, 2016, Brett Cannon <brett at python.org> wrote:
>
> https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 has the four potential approaches implemented (although it doesn't follow the "separate functions" approach some are proposing and instead goes with the allow_bytes approach I originally proposed).

