[Python-Dev] PEP 460 reboot

Mon Jan 13 20:51:30 CET 2014

On 1/13/2014 1:40 PM, Brett Cannon wrote:

> > So bytes formatting really needn't (and shouldn't, IMO) mirror str
> > formatting.

This was my presumption in writing byteformat().

> I think one of the things about Guido's proposal that bugs me is that it
> breaks the mental model of the .format() method from str in terms of how
> the mini-language works. For str.format() you have the conversion and
> the format spec (e.g. "{!r}" and "{:d}", respectively). You apply the
> conversion by calling the appropriate built-in, e.g. 'r' calls repr().
> The format spec semantically gets passed with the object to format()
> which calls the object's __format__() method: ``format(number, 'd')``.
>
> Now Guido's suggestion has two parts that affect the mini-language for
> .format(). One is that for bytes.format() the default conversion is
> bytes() instead of str(), which is fine (probably want to add 'b' as a
> conversion value as well to be consistent). But the other bit is that
> the format spec goes from semantically meaning ``format(thing,
> format_spec)`` to ``format(thing, format_spec).encode('ascii',
> 'strict')`` for at least numbers. That implicitness bugs me as I have
> always thought of format specs just leading to a call to format(). I
> think I can live with it, though, as long as it is **consistently**
> applied across the board for bytes.format(): every use of a format spec
> leads to calling ``format(thing, format_spec).encode('ascii',
> 'strict')`` no matter what type 'thing' would be, and it is clearly
> documented that this is done to ease porting and handle the common case.

This is how my byteformat function works, except that when no
format_spec is given, bytes and bytearray objects are left unchanged
rather than being decoded and encoded again.
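
The per-field rule described above can be sketched roughly as follows.
This is only an illustration, not the actual byteformat code: the helper
name format_field and its signature are assumptions made for this
example.

```python
def format_field(value, format_spec=""):
    """Hypothetical helper: render one replacement field as bytes."""
    # With no format spec, bytes and bytearray pass through untouched,
    # so arbitrary binary data is never decoded and re-encoded.
    if not format_spec and isinstance(value, (bytes, bytearray)):
        return bytes(value)
    # Otherwise apply the ordinary format() protocol, then require
    # that the resulting text is pure ASCII.
    return format(value, format_spec).encode("ascii", "strict")


header = b"Content-Type: " + format_field("image/jpeg", "s")
length = b"Content-Length: " + format_field(1024, "d")
payload = format_field(b"\xff\x00")  # left unchanged
```

Under this sketch, non-ASCII text such as 'é' raises UnicodeEncodeError
instead of being encoded silently.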

> This even gives people in-place ASCII encoding for strings by always
> using '{:s}' with text which they can do when they port their code to
> run under both Python 2 and 3. So you should be able to do
> ``b'Content-Type: {:s}'.format('image/jpeg')`` and have it give ASCII.
> If you want more explicit encoding to latin-1 then you need to do it
> explicitly and not rely on the mini-language to do tricks for you.
>
> IOW I want to treat the format mini-language as a language and thus not
> have any special-casing or massive shifts in meaning between
> str.format() and bytes.format() so my mental model doesn't have to
> contort based on whether it's str or bytes. My preference is to not have
> any, but if Guido is going to say PBP here then I want absolute consistency
> across the board in how bytes.format() tweaks things.
>
> As for %s for the % operator calling ascii(), I think that will be a
> porting nightmare of finding out why your bytes suddenly stopped being
> formatted properly and then having to crawl through all of your code for
> that one use of %s which is getting bytes in. By raising a TypeError you
> will very easily detect where your screw-up occurred thanks to the
> traceback; doing otherwise feels too much like implicit type conversion
> and ask any JavaScript developer how that can be a bad thing.

I personally would not add 'bytes % whatever'.

-- 
Terry Jan Reedy