[Python-ideas] format specifier for "not bytes"

Barry Warsaw barry at python.org
Mon Aug 27 16:34:38 CEST 2012


On Aug 25, 2012, at 09:16 AM, Nick Coghlan wrote:

>A couple of people at PyCon Au mentioned running into this kind of issue
>with Python 3. It relates to the fact that:
>1. String formatting is *coercive* by default
>2. Absolutely everything, including bytes objects, can be coerced to a
>string, due to the repr() fallback
>
>So it's relatively easy to miss a decode or encode operation, and end up
>interpolating an unwanted "b" prefix and some quotes.
>
>For existing versions, I think the easiest answer is to craft a regex that
>matches bytes object reprs and advise people to check that it *doesn't*
>match their formatted strings in their unit tests.
>
>For 3.4+ a non-coercive string interpolation format code may be desirable.

Or maybe just one that calls __str__ without a __repr__ fallback?
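
To make the gotcha concrete, here is a rough sketch of the coercion Nick
describes, together with the kind of regex check he suggests for unit tests.
The pattern and the name BYTES_REPR are illustrative assumptions only, and the
regex is a heuristic that may produce false positives on ordinary text.

```python
import re

# Hypothetical heuristic: a word-initial "b" followed by a quoted run,
# as in b'abc' or b"abc". Not an exhaustive match for every bytes repr.
BYTES_REPR = re.compile(r"""\bb(['"]).*?\1""")

# The missing .decode() goes unnoticed because format() falls back to
# str(), which for bytes is the same as repr().
formatted = 'Hello, {}'.format(b'world')
print(formatted)  # Hello, b'world'

# In a unit test, check that no bytes repr leaked into the result.
leaked = BYTES_REPR.search(formatted) is not None
fixed = 'Hello, {}'.format(b'world'.decode('ascii'))
clean = BYTES_REPR.search(fixed) is None
```

Here leaked is true for the buggy string and clean is true once the decode
is added.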

FWIW, the representation of bytes with the leading b'' does cause problems
when trying to write doctests that work in both Python 2 and 3.

http://www.wefearchange.org/2012/01/python-3-porting-fun-redux.html

It might be a bit nicer to be able to write:

>>> print('{:S}'.format(somebytes))

Of course, in the bytes case, its __str__() would have to be rewritten so that
it no longer calls its __repr__() explicitly.  It's probably not worth it just
to avoid writing a small helper function, but it would eliminate a surprising
gotcha.
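
For comparison, the small helper function might look something like the
sketch below. The names to_text and SFormatter are hypothetical, and the
choice to decode bytes as ASCII is an assumption made for illustration;
nothing here settles what an 'S' code would actually do. The formatter
subclass only emulates '{:S}' through string.Formatter; it does not change
the built-in str.format().

```python
import string


def to_text(obj, encoding='ascii'):
    """Hypothetical helper: decode bytes, otherwise fall back to str()."""
    if isinstance(obj, (bytes, bytearray)):
        return obj.decode(encoding)
    return str(obj)


class SFormatter(string.Formatter):
    """Hypothetical formatter that understands an 'S' format code."""

    def format_field(self, value, format_spec):
        # Assumption: 'S' means "give me text, without the b'' repr".
        if format_spec == 'S':
            return to_text(value)
        return super().format_field(value, format_spec)


somebytes = b'abc'
print(to_text(somebytes))                    # abc
print(SFormatter().format('{:S}', somebytes))  # abc
```

A stricter variant could raise TypeError for bytes instead of decoding, which
would be closer to the non-coercive code Nick mentions.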

The other option is of course just to make doctests smarter[*].
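
As one possible reading of "smarter", a doctest.OutputChecker subclass could
ignore the b prefix when comparing expected and actual output. This is only a
sketch: the class name BytesPrefixChecker and its regex are assumptions, and
the regex would also strip a b prefix that is legitimately part of the output.

```python
import doctest
import re


class BytesPrefixChecker(doctest.OutputChecker):
    """Hypothetical checker that treats b'abc' and 'abc' as equal."""

    # A word-initial "b" immediately followed by a quote character.
    _prefix = re.compile(r"""\bb(?=['"])""")

    def check_output(self, want, got, optionflags):
        want = self._prefix.sub('', want)
        got = self._prefix.sub('', got)
        return super().check_output(want, got, optionflags)


checker = BytesPrefixChecker()
same = checker.check_output("'abc'\n", "b'abc'\n", 0)
different = checker.check_output("'abc'\n", "b'abd'\n", 0)
```

Such a checker can be passed to doctest.DocTestRunner via its checker
argument.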

Cheers,
-Barry

[*] Doctest haters need not respond snarkily. :)