new string-formatting preferred? (was "What is this syntax ?")

Tue Jun 21 18:19:26 EDT 2011

On 6/21/2011 7:33 AM, Tim Chase wrote:
> On 06/20/2011 09:17 PM, Terry Reedy wrote:
>> On 6/20/2011 8:46 PM, Tim Chase wrote:
>>> On 06/20/2011 05:19 PM, Ben Finney wrote:
>>>> “This method of string formatting is the new standard in
>>>> Python 3.0, and should be preferred to the % formatting
>>>> described in String Formatting Operations in new code.”
>>>>
>>>> <URL:http://docs.python.org/library/stdtypes.html#str.format>
>>>
>>> Is there a good link to a thread-archive on when/why/how .format(...)
>>> became "preferred to the % formatting"?
>>
>> That is a controversial statement.
>
> I'm not sure whether your "controversial" refers to
>
> - the documentation at that link,
> - Ben's quote of the documentation at that link,
> - my quotation of Ben's quote of the documentation,
> - or my request for a "thread-archive on the when/why/how"
>
> I _suspect_ you mean the first one :)

I meant the preceding statement (derived from the linked source, but 
that is not important) that .format is preferred to %. Guido prefers it. 
I prefer it. At least a couple of developers vocally do not prefer it 
and might prefer that the statement was not there. Guido recognizes that 
deprecation of % formatting would at least require a conversion function 
that does not now exist.

I see that the linked doc says 'in new code'. That makes the statement 
less (but only less) controversial.

>
>>> I haven't seen any great wins of the new formatting over
>>> the classic style.
>>
>> It does not abuse the '%' operator,
>
> Weighed against the inertia of existing code/documentation/tutorials, I
> consider this a toss-up. If .format() had been the preferred way since
> day#1, I'd grouse about adding/overloading '%', but going the other
> direction, there's such a large corpus of stuff using '%', the addition
> of .format() feels a bit schizophrenic.
>
>> it does not make a special case of tuples (a source of bugs),
>
> Having been stung occasionally by this, I can see the benefit here over
> writing the less-blatant
>
> "whatever %s" % (tupleish,)
>
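
To make the tuple special case concrete, here is a minimal sketch. The
`point` value is made up for illustration; any object that happens to be
a tuple behaves the same way.

```python
point = (3, 4)  # hypothetical value that happens to be a tuple

# With %, a tuple on the right-hand side is unpacked as the argument list,
# so one %s paired with a 2-tuple raises TypeError.
raised = False
try:
    "point is %s" % point
except TypeError:
    raised = True

# The classic workaround: wrap the value in a 1-tuple.
wrapped = "point is %s" % (point,)

# .format takes the value as an ordinary argument; no special case.
new_style = "point is {0}".format(point)

print(raised, wrapped, new_style)
```
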
>> and it is more flexible, especially
>> indicating objects to be printed. Here is a simple example from my code
>> that would be a bit more difficult with %.
>>
>> multi_warn = '''\
>> Warning: testing multiple {0}s against an iterator will only test
>> the first {0} unless the iterator is reiterable; most are not.'''.format
>> ...
>> print(multi_warn('function'))
>> ...
>> print(multi_warn('iterator'))
>
> Does the gotcha of a non-restarting iterator

Huh? What iterator?

> trump pulling each field you want and passing it explicitly?

Huh? I explicitly pass the strings to be printed.

> In pre-.format(), I'd just use dictionary formatting:
>
> "we have %(food)s & eggs and %(food)s, bacon & eggs" % {
> "food": "spam", # or my_iterator.next()?
> }

A better parallel to my example would be

menu = "We have %(meat)s & eggs or %(meat)s and potatoes."
print(menu % {'meat':'spam'})
print(menu % {'meat':'ham'})

The exact duplicate of that with .format is

menu = "We have {meat} & eggs or {meat} and potatoes.".format
print(menu(meat = 'spam'))
print(menu(meat = 'ham'))

One knock against '.format' is that it is 6 chars more than '%'. But for 
repeat usage, it is only needed once. And look: '%(meat)s' is 2 more 
chars than '{meat}' and, to me, {} is easier to type than (). Then 
" % {'meat':'spam'}" is 3 more chars than "(meat = 'spam')" and definitely 
harder to type. While I prefer '}' to ')', I prefer '))' to the mixed 
'})'. The % way is at least 'a bit more difficult' even compared to the 
longer and harder .format with named fields.

menu = "We have {0} & eggs or {0} and potatoes.".format
print(menu('spam'))
print(menu('ham'))

is a little easier yet, though perhaps less clear, especially if there 
were multiple substitutions.
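
As a quick check that the two styles really are parallel, the following
sketch runs both over the same inputs. The name `old_menu` is introduced
here only for the comparison.

```python
# Keep the bound .format method as a reusable callable.
menu = "We have {meat} & eggs or {meat} and potatoes.".format

# The %-style template stays a plain string and needs a dict each time.
old_menu = "We have %(meat)s & eggs or %(meat)s and potatoes."

results = []
for meat in ("spam", "ham"):
    new_style = menu(meat=meat)
    old_style = old_menu % {"meat": meat}
    # Both styles should produce the same text.
    results.append((new_style, old_style))
    print(new_style)
```
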

> The other new feature I saw was the use of __format__() which may have
> good use-cases, but I don't yet have a good example of when I'd want
> per-stringification formatting compared to just doing my desired
> formatting in __str__() instead.

__str__ always returns the same string for an instance in a given state.
Similarly, __float__ and __int__ will return the same float or int 
version of an unchanged instance. __format__(spec) can directly adjust 
the result according to spec without the restriction of going through an 
intermediary str, int, or float.

Suppose one had a Money class with currency and decimal amount fields. 
.__str__ can add a currency symbol (before or after as appropriate) but 
has to use a standard format for the amount field. .__float__ can be 
post-processed according to a %...f spec, but cannot include a currency 
symbol. Money.__format__(self,spec) can format the amount as it wishes, 
including its rounding rules, *and* add a currency symbol.
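
A rough sketch of such a class follows. Everything in it is an assumption
made for illustration: the constructor arguments, the choice of
ROUND_HALF_UP, and the tiny spec language in which the spec is just the
number of decimal places.

```python
from decimal import Decimal, ROUND_HALF_UP


class Money:
    """Hypothetical Money class: a currency symbol plus a Decimal amount."""

    def __init__(self, amount, symbol="$", symbol_first=True):
        self.amount = Decimal(amount)
        self.symbol = symbol
        self.symbol_first = symbol_first

    def __str__(self):
        # One standard rendering per instance state: two decimal places.
        return self.__format__("2")

    def __format__(self, spec):
        # Made-up mini-language: spec is the number of decimal places.
        # An empty spec falls back to the standard str() rendering.
        if not spec:
            return str(self)
        places = int(spec)
        quantum = Decimal(10) ** -places
        # The class applies its own rounding rule to the exact amount.
        rounded = self.amount.quantize(quantum, rounding=ROUND_HALF_UP)
        if self.symbol_first:
            return "{0}{1}".format(self.symbol, rounded)
        return "{0} {1}".format(rounded, self.symbol)


price = Money("1234.565")
print(str(price))
print("{0:0} or {0:3}".format(price))
print(format(Money("9.5", "EUR", symbol_first=False), "0"))
```

The point is only that the amount never passes through a float or a fixed
str on its way to the output; a real class would want a richer spec.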

Or suppose one has a multi-precision float. %80.40f will require an mpf 
instance to approximate itself as a float, possibly with error. 
mpf.__format__ should be able to do better.
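
The same effect can be seen with the standard library's decimal.Decimal,
used here only as a stand-in for a multi-precision float, since it defines
both __float__ and __format__.

```python
from decimal import Decimal

third = Decimal(1) / Decimal(3)  # 28 significant digits in the default context

# % formatting with an 'f' code converts the value to a float first,
# so digits beyond float precision are not the Decimal's digits.
via_percent = "%.25f" % third

# .format hands the spec to Decimal.__format__, which works on the
# Decimal's own digits.
via_format = "{0:.25f}".format(third)

print(via_percent)
print(via_format)
```
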

(Sadly, this new ability to more accurately represent objects is not yet 
used for ints and is broken for fractions.Fraction. I will probably post 
issues on the tracker.)

-- 
Terry Jan Reedy