[Python-Dev] PEP 461 Final?

Ethan Furman ethan at stoneleaf.us
Sat Jan 18 02:51:05 CET 2014


On 01/17/2014 05:27 PM, Steven D'Aprano wrote:
> On Fri, Jan 17, 2014 at 08:49:21AM -0800, Ethan Furman wrote:
>>
>> Overriding Principles
>> =====================
>>
>> In order to avoid the problems of auto-conversion and Unicode
>> exceptions that could plague Py2 code, all object checking will
>> be done by duck-typing, not by values contained in a Unicode
>>  representation [3]_.
>
> I don't understand this paragraph. What does "values contained in a
> Unicode representation" mean?

Yeah, that is clunky.  I'm trying to convey the idea that we don't want errors based on content, i.e. which characters
happen to be in a str.


> [...]
>> %s is restricted in what it will accept::
>>
>>    - input type supports Py_buffer?
>>      use it to collect the necessary bytes
>
> Can you give some examples of what types support Py_buffer? Presumably
> bytes. Anything else?

Anybody?  Otherwise I'll go spelunking in the code.
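
In the meantime, a rough way to probe this from Python, assuming that "memoryview() accepts it" is a fair stand-in for "supports Py_buffer". The helper name supports_buffer is made up for illustration, and the list of candidates is nowhere near exhaustive:

```python
import array


def supports_buffer(obj):
    """Return True if obj exports the buffer interface (hypothetical helper)."""
    try:
        memoryview(obj)
    except TypeError:
        return False
    return True


# A few built-in and stdlib types to probe; this is only a sample.
candidates = [b"abc", bytearray(b"abc"), array.array("B", [1, 2, 3]), "abc", 42]
results = {type(c).__name__: supports_buffer(c) for c in candidates}
print(results)
```

So bytes, bytearray, and array.array qualify, while str and int do not.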


>>    - input type is something else?
>>      use its __bytes__ method; if there isn't one, raise a TypeError
>
> I think you should explicitly state that this is a new special method,
> and state which built-in types will grow a __bytes__ method (if any).

It's not new.  I know bytes, str, and numbers /do not/ have __bytes__.
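
A minimal sketch of the existing protocol, using a hypothetical Packet class: bytes() already looks for __bytes__ on the object's type, and a quick hasattr() check shows that str and the numeric types do not define it.

```python
class Packet:
    """Hypothetical type that knows how to render itself as bytes."""

    def __init__(self, payload):
        self.payload = payload

    def __bytes__(self):
        return b"PKT:" + self.payload


# bytes() calls Packet.__bytes__ to get the result.
raw = bytes(Packet(b"hello"))

# Built-in types that do not define __bytes__.
no_dunder = [t.__name__ for t in (str, int, float) if not hasattr(t, "__bytes__")]
print(raw, no_dunder)
```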


>> Numeric Format Codes
>> --------------------
>>
>> To properly handle int and float subclasses, int(), index(), and float()
>> will be called on the objects intended for (d, i, u), (b, o, x, X), and
>> (e, E, f, F, g, G).
>
>
> -1 on this idea.
>
> This is a rather large violation of the principle of least surprise, and
> radically different from the behaviour of Python 3 str. In Python 3,
> '%d' interpolation calls the __str__ method, so if you subclass, you can
> get the behaviour you want:

Did you read the bug reports I linked to?  This behavior (which is a bug) has already been fixed for Python3.4.

As a quick thought experiment, why does "%d" % True return "1"?
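
To make the point concrete, here is a small sketch assuming Python 3.4 or later (where the fix is in place). Loud is a hypothetical int subclass with a custom __str__:

```python
class Loud(int):
    """Hypothetical int subclass with a custom __str__."""

    def __str__(self):
        return "LOUD"


as_s = "%s" % Loud(3)   # %s goes through __str__
as_d = "%d" % Loud(3)   # %d formats the integer value, ignoring __str__
as_bool = "%d" % True   # bool is an int subclass, so this is numeric too
print(as_s, as_d, as_bool)
```

A numeric format code asks for a number, so the integer value is what gets formatted, not whatever __str__ happens to return.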


>> Unsupported codes
>> -----------------
>>
>> %r (which calls __repr__), and %a (which calls ascii() on __repr__) are not
>> supported.
>
> +1 on not supporting b'%r' (i.e. I agree with the PEP).
>
> Why not support b'%a'? That seems to be a strange thing to prohibit.

I'll admit to being somewhat on the fence about %a.

It seems there are two possibilities with %a:

   1) have it be ascii(repr(obj))

   2) have it be str(obj).encode('ascii', 'strict')

(1) seems only useful for debugging, but even then not very -- if you switch from %s to %a you'll no longer see the 
bytes output (although you would get the name of the object, which could be handy);

(2) is (slightly) blurring the lines between text and encoded-ascii;  I would rather see "%s" % text.encode('ascii',
'strict')
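
A rough sketch of how the two possibilities differ on non-ASCII text. This is only an illustration of the two readings, not proposed semantics. Note that ascii() already applies repr(), so it is used directly for (1):

```python
text = "caf\u00e9"

# Possibility 1: repr-style output with non-ASCII characters escaped.
option1 = ascii(text).encode("ascii")

# Possibility 2: strict ASCII encoding of str(obj); fails on non-ASCII content.
try:
    option2 = str(text).encode("ascii", "strict")
except UnicodeEncodeError:
    option2 = None

# With pure-ASCII text, possibility 2 simply yields the encoded bytes.
plain = str("abc").encode("ascii", "strict")
print(option1, option2, plain)
```

With (1) you always get something back, quotes and escapes included; with (2) the result depends on which characters are in the text.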

So we have two possibilities, and both can be useful; I don't know which is more useful or even more logical.

So I guess I'm still open to arguments.  :)


> Everything else, well done and thank you.

You're welcome!  Thank you to everyone who participated.

--
~Ethan~

