[Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 2
Ethan Furman
ethan at stoneleaf.us
Sun Feb 23 22:25:45 CET 2014
On 02/23/2014 03:30 AM, Victor Stinner wrote:
>
> First, this is a warning in reST syntax:
>
> System Message: WARNING/2 (pep-0461.txt, line 53)
Yup, fixed that.
>> This area of programming is characterized by a mixture of binary data and
>> ASCII compatible segments of text (aka ASCII-encoded text). Bringing back a
>> restricted %-interpolation for ``bytes`` and ``bytearray`` will aid both in
>> writing new wire format code, and in porting Python 2 wire format code.
>
> You may give some examples here: HTTP (Latin1 headers, binary body),
> SMTP, FTP, etc.
>
>> All the numeric formatting codes (such as ``%x``, ``%o``, ``%e``, ``%f``,
>> ``%g``, etc.) will be supported, and will work as they do for str, including
>> the padding, justification and other related modifiers.
>
> IMO you should give the exhaustive list here and we should only
> support one formatter for integers: %d. Python 2 supports "%d", "%u"
> and "%i" with "%u" marked as obsolete. Python 3.5 should not
> reintroduce obsolete formatters. If you want to use the same code base
> for Python 2.6, 2.7 and 3.5: modify your code to only use %d. The same
> rule applies to the 2to3 tool: modify your source code to be compatible
> with Python 3.
A link is provided to the exhaustive list. Including it verbatim here would detract from the overall readability.
I agree that having only one decimal format code would be nice, or even two if the second one did something different,
and that three seems completely over the top -- unfortunately, Python 3.4 still supports all three (%d, %i, and %u) for str.
Not supporting two of them would just lead to frustration. There is also no reason to exclude %o or %x and make the
programmer reach for oct() and hex(). We're trying to simplify %-interpolation, not garner exclamations of "What were
they thinking?!?" ;)
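To make the point about the integer codes concrete, here is a rough sketch. It assumes an interpreter where the bytes interpolation proposed in this PEP is available (Python 3.5 or later), and the sample values are made up for illustration:

```python
# Sketch only: requires bytes %-interpolation as proposed in PEP 461.
n = 255

# %d, %i, and %u all render the same decimal digits.
decimal_forms = [b'%d' % n, b'%i' % n, b'%u' % n]

# %o and %x avoid a detour through oct()/hex() plus an explicit encode.
octal = b'%o' % n
hexa = b'%x' % n
via_builtins = hex(n)[2:].encode('ascii')

# Padding and justification modifiers behave as they do for str.
padded = b'%05d|%-5d|' % (42, 42)
```

Dropping %i and %u would not change what can be expressed, only how much existing format strings have to be edited when porting.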
>> ``%s`` is restricted in what it will accept::
>>
>> - input type supports ``Py_buffer`` [6]_?
>> use it to collect the necessary bytes
>>
>> - input type is something else?
>> use its ``__bytes__`` method [7]_ ; if there isn't one, raise a
>> ``TypeError``
>
> Hum, you may mention that bytes(n: int) creates a bytes string of n
> null bytes, but b'%s' % 123 will raise an error because
> int.__bytes__() is not defined. Just to be more explicit.
I added a line stating that %s does not accept numbers, but I'm not sure how bytes(n: int) is relevant?
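One possible reading of the bytes(n: int) remark, sketched under the assumption that %s behaves as the PEP text above describes (Python 3.5 or later). The `Frame` class is hypothetical and exists only for this example:

```python
# Sketch only: requires bytes %-interpolation as proposed in PEP 461.

class Frame:
    """Hypothetical type that knows its own wire representation."""

    def __init__(self, payload):
        self.payload = payload

    def __bytes__(self):
        return b'<' + self.payload + b'>'


# Objects supporting the buffer protocol are used directly.
from_buffer = b'%s' % bytearray(b'abc')
from_view = b'%s' % memoryview(b'xyz')

# Anything else falls back to its __bytes__ method.
from_dunder = b'%s' % Frame(b'hi')

# The bytes() constructor treats an int as a length...
zeros = bytes(3)

# ...but int defines no __bytes__, so %s rejects it with a TypeError.
try:
    b'%s' % 123
except TypeError:
    rejected = True
else:
    rejected = False
```

So a reader who expects `b'%s' % n` to mirror `bytes(n)` would be surprised either way; raising TypeError at least makes the mismatch loud.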
>> ``%a`` will call :func:``ascii()`` on the interpolated value's :func:``repr()``.
>> This is intended as a debugging aid, rather than something that should be used
>> in production. Non-ascii values will be encoded to either ``\xnn`` or ``\unnnn``
>> representation.
>
> (You forgot "/Uhhhhhhhh" representation (it's an antislash, but I don't
> see the key on my Mac keyboard?).)
Hard to forget what you don't know. ;) Will ascii() ever emit an antislash representation?
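For reference, a small sketch of the three escape forms that the built-in ascii() produces for non-ASCII characters in a str. The sample characters are arbitrary:

```python
# ascii() picks the shortest escape form that fits the code point.
latin1 = ascii('\xe9')          # U+00E9, fits in two hex digits
bmp = ascii('\u20ac')           # U+20AC, needs four hex digits
astral = ascii('\U0001f600')    # U+1F600, needs eight hex digits
```

The eight-digit form only appears for code points above U+FFFF.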
> What is the use case of this *new* formatter? How do you use it?
It is an aid to debugging: when you need to see what a value is at that moment, toss it into %a. It is not intended for
production code; it is included in the hope of heading off inappropriate use of __bytes__ methods on classes.
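A minimal sketch of that debugging use, again assuming an interpreter with the proposed bytes interpolation (Python 3.5 or later). The values are placeholders:

```python
# Sketch only: requires bytes %-interpolation as proposed in PEP 461.

# %a accepts any object and yields its ASCII-only repr as bytes.
as_int = b'%a' % 123
as_text = b'%a' % 'caf\xe9'
as_bytes = b'%a' % b'raw'
```

Because the result is a repr, str and bytes arguments come back with their quotes (and the `b` prefix), which is useful in a log line and rarely what a wire format wants.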
> print(b'%a' % 123) may emit a BytesWarning and may lead to bugs.
Why would it emit a BytesWarning?
> I would like to help you to implement the PEP. IMO we should share as
> much code as possible with PyUnicodeObject. Something using the
> stringlib and maybe a new PyBytesWriter API which would have an API
> close to PyUnicodeWriter API. We should also try to share code between
> PyBytes_Format() and PyBytes_FromFormat().
Thanks. I'll holler when I get that far. :)
--
~Ethan~