[Python-Dev] pathlib - current status of discussions
Brett Cannon
brett at python.org
Wed Apr 13 19:09:57 EDT 2016
On Wed, 13 Apr 2016 at 15:20 Victor Stinner <victor.stinner at gmail.com>
wrote:
> Oh, since others voted, I will also vote and explain my vote.
>
> I like choice 1, str only, because it's very well defined. In Python
> 3, Unicode is simply the native type for text. It's accepted by almost
> all functions. In other emails, I also explained that Unicode is fine
> for storing undecodable filenames on UNIX; it has worked as expected
> for many years (since Python 3.3).
>
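For reference, the round trip for undecodable filenames relies on the surrogateescape error handler (PEP 383). A minimal sketch, assuming a UTF-8 filesystem encoding; the file name is made up for illustration:

```python
# Hypothetical file name containing a byte (0xff) that is not valid UTF-8.
raw_name = b"caf\xff.txt"

# surrogateescape maps each undecodable byte to a lone surrogate
# (U+DC80..U+DCFF), so the resulting str loses no information.
text_name = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text_name.encode("utf-8", "surrogateescape")
```

os.fsdecode() and os.fsencode() apply the same idea using the filesystem encoding (with surrogateescape on POSIX).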
> --
>
> If you cannot survive without bytes, I suggest adding two functions:
> one for str only, another which can return str or bytes.
>
> Maybe you in fact want two protocols: __fspath__ (str only) and
> __fspathb__ (bytes only)? os.fspathb() would first try __fspathb__, or
> fall back to os.fsencode(__fspath__). os.fspath() would first try
> __fspath__, or fall back to os.fsdecode(__fspathb__). IMHO it's not
> worth having such complexity while Unicode handles all use cases.
>
Implementing two magic methods for this seems like overkill. The most I
would be willing to do in terms of automatic encoding/decoding is call
os.fsencode()/os.fsdecode() on the argument or on whatever __fspath__()
returned.
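To make the comparison concrete, here is a rough sketch of both ideas. Nothing here is settled API: the function names fspath_str, fspath_bytes and fspath_single, the two demo classes, and the choice to pass str/bytes arguments straight through are all assumptions made for illustration. Only os.fsencode() and os.fsdecode() are real.

```python
import os


def fspath_str(path):
    """Two-protocol idea, str flavor: try __fspath__, then decode __fspathb__."""
    if isinstance(path, str):
        return path  # assumption: plain str passes through unchanged
    tp = type(path)
    if hasattr(tp, "__fspath__"):
        return tp.__fspath__(path)
    if hasattr(tp, "__fspathb__"):
        return os.fsdecode(tp.__fspathb__(path))
    raise TypeError(f"expected str or path object, not {tp.__name__}")


def fspath_bytes(path):
    """Two-protocol idea, bytes flavor: try __fspathb__, then encode __fspath__."""
    if isinstance(path, bytes):
        return path  # assumption: plain bytes passes through unchanged
    tp = type(path)
    if hasattr(tp, "__fspathb__"):
        return tp.__fspathb__(path)
    if hasattr(tp, "__fspath__"):
        return os.fsencode(tp.__fspath__(path))
    raise TypeError(f"expected bytes or path object, not {tp.__name__}")


def fspath_single(path):
    """Single-protocol alternative: one __fspath__, decode whatever comes back."""
    if not isinstance(path, (str, bytes)):
        tp = type(path)
        if not hasattr(tp, "__fspath__"):
            raise TypeError(f"expected str, bytes or path object, not {tp.__name__}")
        path = tp.__fspath__(path)
    # os.fsdecode() returns str unchanged and decodes bytes.
    return os.fsdecode(path)


class StrPath:
    """Hypothetical path object implementing only the str protocol."""

    def __fspath__(self):
        return "/tmp/demo"


class BytesPath:
    """Hypothetical path object implementing only the bytes protocol."""

    def __fspathb__(self):
        return b"/tmp/demo"
```

The single-protocol version needs one magic method and one helper, which is the simplification I am arguing for.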
>
> Or do you know functions implemented in Python accepting str *and* bytes?
>
On purpose, nothing off the top of my head.
>
> --
>
> The C implementation of the os module has an important
> path_converter() function:
>
> * path_converter accepts (Unicode) strings and their
> * subclasses, and bytes and their subclasses. What
> * it does with the argument depends on the platform:
> *
> * * On Windows, if we get a (Unicode) string we
> * extract the wchar_t * and return it; if we get
> * bytes we extract the char * and return that.
> *
> * * On all other platforms, strings are encoded
> * to bytes using PyUnicode_FSConverter, then we
> * extract the char * from the bytes object and
> * return that.
>
> This function will implement something like os.fspath().
>
> With os.fspath() only accepting str, we will return the Unicode
> string directly on Windows. On UNIX, Unicode will be encoded, as is
> already done for Unicode strings.
>
> This specific function would benefit from flavor 4 (os.fspath() can
> return str and bytes), but it's more an exception than the rule. It
> would be more a micro-optimization than a good reason to drive the API
> design.
>
Yep, it's interesting to know but Chris and I won't let it drive the
decision (I assume).
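For anyone who does not read C, the behaviour described in that comment can be approximated in Python. This is only a sketch of the documented behaviour, not the real implementation: convert_path and its windows parameter are hypothetical names, and the str/bytes return values stand in for the wchar_t * and char * buffers.

```python
import os
import sys


def convert_path(path, windows=(sys.platform == "win32")):
    """Approximate what path_converter hands to the OS-level call."""
    if isinstance(path, str):
        # Windows keeps the Unicode string (wchar_t *); elsewhere it is
        # encoded to bytes, as PyUnicode_FSConverter does in C.
        return path if windows else os.fsencode(path)
    if isinstance(path, bytes):
        # bytes are used as-is (char *) on every platform.
        return path
    raise TypeError(f"expected str or bytes, not {type(path).__name__}")
```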
-Brett
>
> Victor
>
> On Wednesday, 13 April 2016, Brett Cannon <brett at python.org> wrote:
> >
> > https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1
> > has the four potential approaches implemented (although it doesn't follow
> > the "separate functions" approach some are proposing and instead goes with
> > the allow_bytes approach I originally proposed).
>