[Python-Dev] PEP 461 Final?

Sat Jan 18 14:48:55 CET 2014

On 18 Jan 2014 11:52, "Ethan Furman" <ethan at stoneleaf.us> wrote:
>
> On 01/17/2014 05:27 PM, Steven D'Aprano wrote:
>>
>> On Fri, Jan 17, 2014 at 08:49:21AM -0800, Ethan Furman wrote:
>>>
>>>
>>> Overriding Principles
>>> =====================
>>>
>>> In order to avoid the problems of auto-conversion and Unicode
>>> exceptions that could plague Py2 code, all object checking will
>>> be done by duck-typing, not by values contained in a Unicode
>>>  representation [3]_.
>>
>>
>> I don't understand this paragraph. What does "values contained in a
>> Unicode representation" mean?
>
>
> Yeah, that is clunky.  I'm trying to convey the idea that we don't want
> errors based on content, i.e. which characters happen to be in a str.
>
>
>
>> [...]
>>>
>>> %s is restricted in what it will accept::
>>>
>>>    - input type supports Py_buffer?
>>>      use it to collect the necessary bytes
>>
>>
>> Can you give some examples of what types support Py_buffer? Presumably
>> bytes. Anything else?
>
>
> Anybody?  Otherwise I'll go spelunking in the code.

bytes, bytearray, memoryview, ctypes arrays, array.array, numpy.ndarray

It may actually be clearer to express this in terms of memoryview for the
benefit of those who aren't familiar with the C API, as that is the
closest equivalent Python level API. (While there is an open issue regarding
the C only nature of the buffer export API, nobody has volunteered to put
together a PEP and implementation for a Python level follow up to the C
level PEP 3118. The problem is that the original use cases involve C
extensions anyway, so the relevant experts don't have any personal need for
a Python level buffer exporter interface. Instead, it's in the "should be
done for completeness, and would make some of our testing easier, but
doesn't have anyone clamouring for it" bucket.)
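
As a rough illustration of the memoryview framing (a sketch, not PEP text):
an object "supports Py_buffer" in the sense used here if memoryview() will
accept it. The helper name supports_buffer is hypothetical and only exists
for this example.

```python
import array


def supports_buffer(obj):
    """Return True if memoryview() accepts obj, i.e. obj exports a buffer.

    Hypothetical helper for illustration only.
    """
    try:
        # The with-block releases the view again as soon as we are done.
        with memoryview(obj):
            return True
    except TypeError:
        return False


candidates = [
    b"abc",
    bytearray(b"abc"),
    memoryview(b"abc"),
    array.array("B", [1, 2, 3]),
    "abc",   # str does not export a buffer
    42,      # neither does int
]

for candidate in candidates:
    print(type(candidate).__name__, supports_buffer(candidate))
```

ctypes arrays and numpy.ndarray should behave like the first four entries,
but they are left out to keep the example small and free of third-party
imports.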

>
>
>
>>>    - input type is something else?
>>>      use its __bytes__ method; if there isn't one, raise a TypeError
>>
>>
>> I think you should explicitly state that this is a new special method,
>> and state which built-in types will grow a __bytes__ method (if any).
>
>
> It's not new.  I know bytes, str, and numbers /do not/ have __bytes__.

Right, it is already used by bytes to convert arbitrary objects to a binary
representation. The difference is that Py_buffer/memoryview provide access
to the raw data without necessarily copying anything.

str and numbers don't implement it as there's no obvious default
interpretation (the b'\x00' * n interpretation of integers is part of the
bytes constructor and is now a decision we mostly regret; it should have
been a keyword argument or a separate class method).
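
For anyone who hasn't run into the integer behaviour, a short example of
what the bytes constructor does today (plain built-ins only, nothing here is
new API):

```python
# An integer argument means "this many zero bytes", not "this value".
zero_filled = bytes(3)

# To get a single byte with the value 3, pass an iterable of ints
# or use int.to_bytes().
from_iterable = bytes([3])
from_int_method = (3).to_bytes(1, "big")

print(zero_filled, from_iterable, from_int_method)

# Neither str nor int defines __bytes__; the zero-fill behaviour is
# special-cased inside the bytes constructor itself.
print(hasattr(str, "__bytes__"), hasattr(int, "__bytes__"))
```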

>
>
>
>>> Unsupported codes
>>> -----------------
>>>
>>> %r (which calls __repr__), and %a (which calls ascii() on __repr__) are
>>> not supported.
>>
>>
>> +1 on not supporting b'%r' (i.e. I agree with the PEP).
>>
>> Why not support b'%a'? That seems to be a strange thing to prohibit.
>
>
> I'll admit to being somewhat on the fence about %a.
>
> It seems there are two possibilities with %a:
>
>   1) have it be ascii(repr(obj))
>
>   2) have it be str(obj).encode('ascii', 'strict')

This gets very close to crossing the line into implicit encoding of text
again. Binary interpolation is being added back for the specific use case
of working with ASCII compatible segments in binary formats, and it's at
best arguable that supporting %a will help with that use case.

However, without it, there may be a greater temptation to inappropriately
define __bytes__ just to support binary interpolation, rather than because
a type truly has an appropriate translation directly to bytes.

By allowing %a, we avoid that temptation. This is also potentially useful
specifically in the case of binary logging formats and as a quick way to
request backslash escaping of non-ASCII characters in text.

Call it +0.5 for allowing %a. I don't expect it to be used heavily, but I
think it will head off a fair bit of potential misuse of __bytes__.
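
To show how the two candidate meanings of %a differ in practice, here is a
small sketch. The function names are hypothetical, and option 1 is read here
as "take ascii(obj), which is already repr() with non-ASCII characters
backslash-escaped, and encode the result as ASCII"; that reading is an
assumption about what was meant by ascii(repr(obj)).

```python
def percent_a_option_1(obj):
    """Option 1 (as interpreted above): escaped repr, always ASCII-safe."""
    return ascii(obj).encode("ascii")


def percent_a_option_2(obj):
    """Option 2: str(obj) strictly encoded; fails on non-ASCII content."""
    return str(obj).encode("ascii", "strict")


text = "caf\u00e9"  # contains one non-ASCII character

print(percent_a_option_1(text))

try:
    print(percent_a_option_2(text))
except UnicodeEncodeError as exc:
    # This is exactly the "error based on content" the PEP wants to avoid.
    print("option 2 failed:", exc.reason)
```

Option 1 never fails based on which characters happen to be in the text,
which is what makes it usable for the logging and escaping cases mentioned
above.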

Cheers,
Nick.