[Python-Dev] transitioning from % to {} formatting

Brett Cannon brett at python.org
Thu Oct 1 02:29:04 CEST 2009


On Wed, Sep 30, 2009 at 16:03, Vinay Sajip <vinay_sajip at yahoo.co.uk> wrote:
> Steven Bethard <steven.bethard <at> gmail.com> writes:
>
>> There's a lot of code already out there (in the standard library and
>> other places) that uses %-style formatting, when in Python 3.0 we
>> should be encouraging {}-style formatting. We should really provide
>> some sort of transition plan. Consider an example from the logging
>> docs:
>>
>> logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
>>
>> We'd like to support both this style as well as the following style:
>>
>> logging.Formatter("{asctime} - {name} - {levelname} - {message}")
>>
>
> In logging at least, there are two different places where the formatting issue
> crops up.
>
> The first is creating the "message" part of the logging event, which is
> made up of a format string and arguments.
>
> The second is the one Steven's mentioned: formatting the message along with
> other event data such as time of occurrence, level, logger name etc. into the
> final text which is output.
>
> Support for both % and {} forms in logging would need to be considered in
> these two places. I sort of liked Martin's proposal about using different
> keyword arguments, but apart from the ugliness of "dicttemplate" and the fact
> that "fmt" is already used in Formatter.__init__ as a keyword argument, it's
> possible that two different keyword arguments "fmt" and "format" both referring
> to format strings might be confusing to some users.
>
> Benjamin's suggestion of providing a flag to Formatter seems slightly better,
> as it doesn't change what existing positional or keyword parameters do, and
> just adds an additional, optional parameter which can start off with a default
> of False and transition to a default of True.
>
> However, AFAICT these approaches only cover the second area where formatting
> options are chosen - not the creation of the message from the parameters passed
> to the logging call itself.
>
> Of course one can pass arbitrary objects as messages which contain their own
> formatting logic. This has been possible since the very first release but I'm
> not sure that it's widely used, as it's usually easier to pass strings. So
> instead of passing a string and arguments such as
>
> logger.debug("The %s is %d", "answer", 42)
>
> one can currently pass, for a fictitious class PercentMessage,
>
> logger.debug(PercentMessage("The %s is %d", "answer", 42))
>
> and when the time comes to obtain the formatted message, LogRecord.getMessage
> calls str() on the PercentMessage instance, whose __str__ will use %-formatting
> to get the actual message.
>
> Of course, one can also do for example
>
> logger.debug(BraceMessage("The {} is {}", "answer", 42))
>
> where the __str__() method on the BraceMessage will do {} formatting.
>
> Of course, I'm not suggesting we actually use the names PercentMessage and
> BraceMessage, I've just used them there for clarity.
>
> Also, although Raymond has pointed out that it seems likely that no one ever
> needs *both* types of format string, what about the case where application A
> depends on libraries B and C, and they don't all share the same preferences
> regarding which format style to use? ISTM no-one's brought this up yet, but it
> seems to me like a real issue. It would certainly appear to preclude any
> approach that configured a logging-wide or logger-wide flag to determine how to
> interpret the format string.
>
> Another potential issue is where logging events are pickled and sent over
> sockets to be finally formatted and output on different machines. What if a
> sending machine has a recent version of Python, which supports {} formatting,
> but a receiving machine doesn't? It seems that at the very least, it would
> require a change to SocketHandler and DatagramHandler to format the "message"
> part into the LogRecord before pickling and sending. While making this change
> is simple, it represents a potential backwards-incompatible problem for users
> who have defined their own handlers for doing something similar.
>
> Apart from thinking through the above issues, the actual formatting only
> happens in two locations - LogRecord.getMessage and Formatter.format - so
> making the code do either %- or {} formatting would be simple, as long as it
> knows which of % and {} to pick.
>
> Does it seem too onerous to expect people to pass an additional "use_format"
> keyword argument with every logging call to indicate how to interpret the
> message format string? Or does the PercentMessage/BraceMessage type approach
> have any mileage? What do y'all think?

I personally prefer the keyword argument approach to act as a flag,
but that's me.
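
For the Formatter side of things, a flag could look roughly like the
sketch below. It assumes the ``style`` keyword that later Python 3
releases of logging.Formatter accept; that name is not something
proposed in this thread, and the record is built by hand purely for
illustration.

```python
import logging

# A hand-built record, so the example does not depend on any logger setup.
# The pathname and line number are placeholders.
record = logging.LogRecord(
    name="demo",
    level=logging.INFO,
    pathname="demo.py",
    lineno=1,
    msg="The %s is %d",
    args=("answer", 42),
    exc_info=None,
)

# Existing behaviour: %-style layout string, no flag needed.
percent = logging.Formatter("%(name)s - %(levelname)s - %(message)s")

# Opt-in behaviour: the extra keyword selects {}-style for the layout string.
brace = logging.Formatter("{name} - {levelname} - {message}", style="{")

percent_text = percent.format(record)
brace_text = brace.format(record)
```

Note that the flag only affects the layout string; the message itself
("The %s is %d") is still merged with its arguments using %.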

As for the PercentMessage/BraceMessage, I would make sure that the
wrapper just takes the format string and has the arguments applied
later, to cut down on the amount of parentheses butting up against each
other: ``logger.debug(BraceMessage("The {} is {}"), "answer", 42)``.
It's still an acceptable solution that provides a clear transition:
provide the two classes, deprecate PercentMessage or bare string
usage, require BraceMessage, then remove the requirement. This wrapper
approach also provides a way for libraries that have not shifted over
to still work with PEP 3101 strings: the user wraps the string to be
interpolated themselves and then passes it in to the library. It's
just unfortunate that any transition would have this cost of wrapping
all strings for a while. I suspect most people will simply import the
wrapping class and give it some short name like people do with
gettext.
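
A minimal sketch of the wrapper idea, using the placeholder name
BraceMessage from Vinay's message (not a real logging class). It
follows the form where the arguments live inside the wrapper, since
that relies only on logging calling str() on the message object; the
form with the arguments passed separately would also need
LogRecord.getMessage to cooperate. The logger name and the demo()
helper are made up for the example.

```python
import io
import logging


class BraceMessage:
    """Hypothetical wrapper holding a {}-style format string and its arguments."""

    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        # Formatting is deferred until logging asks for the message text.
        return self.fmt.format(*self.args, **self.kwargs)


def demo():
    """Log one wrapped message to an in-memory stream and return the output."""
    stream = io.StringIO()
    logger = logging.getLogger("brace_demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    try:
        # No extra positional arguments are passed, so logging applies no
        # %-formatting and simply uses str() of the wrapper.
        logger.debug(BraceMessage("The {} is {}", "answer", 42))
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()
```

With a short alias such as ``from mymodule import BraceMessage as __``
(module name hypothetical), a call reads
``logger.debug(__("The {} is {}", "answer", 42))``.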

-Brett

