[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Chris Angelico rosuav at gmail.com
Tue Apr 12 12:20:17 EDT 2016


On Wed, Apr 13, 2016 at 2:15 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 04/11/2016 04:43 PM, Victor Stinner wrote:
>>
>> On Apr 11, 2016 11:11 PM, "Ethan Furman" wrote:
>
>
>>> So my concern in such a case is what happens if we pass this SE
>>> string somewhere else: a UTF-8 file, or over a socket, or into a
>>> database? Does this have issues that we wouldn't face if we just used
>>> bytes?
>>
>>
>> "SE strings" are returned by os.listdir(str), os.walk(str),
>> os.getenv(str), sys.argv[int], ... since Python 3.3. Nothing new under
>> the sun.
>
>
> So when we pass a bytes object in, Python (on posix) converts that to a
> string using surrogateescape, gets back strings from the os, and encodes
> them back to bytes, again using surrogateescape?
>
>
>> Trying to encode a surrogate to ascii, latin1 or utf8 raises an
>> encoding error.
>
>
> latin1?  I thought latin1 had a code point for 0-255, so how could using it
> raise an encoding error?

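Latin-1 / ISO-8859-1 defines a character for every byte, so any byte
string will *decode*. It only maps 256 characters (U+0000 through
U+00FF) back to bytes, though, so *encoding* fails for anything outside
that range, including the lone surrogates (U+DC80 through U+DCFF) that
surrogateescape produces.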

ChrisA

