[Python-3000] Format specifier proposal

Ron Adam rrr at ronadam.com
Wed Aug 15 08:52:33 CEST 2007



Andrew James Wade wrote:
> On Tue, 14 Aug 2007 21:12:32 -0500
> Ron Adam <rrr at ronadam.com> wrote:

>> What I was thinking of was just a simple left to right evaluation order.
>>
>>      "{0:spec1, spec2, ... }".format(x)
>>
>> I don't expect this will ever get very long.
> 
> The first __format__ will return a str, so chains longer than 2 don't
> make a lot of sense. And the delimiter character should be allowed in
> spec1; limiting the length of the chain to 2 allows that without escaping:
> 
>     "{0:spec1-with-embedded-comma,}".format(x)
> 
> My scheme did the same sort of thing with spec1 and spec2 reversed.
> Your order makes more intuitive sense; I chose my order because I
> wanted the syntax to be a generalization of formatting strings.
>
> Handling the chaining within the __format__ methods should be all of
> two lines of boilerplate per method.

I went ahead and tried this out and it actually cleared up some difficulty
in organizing the parsing code.  That was a very nice surprise. :)

     (actual doctest)

     >>> import time
     >>> class GetTime(object):
     ...     def __init__(self, t=None):
     ...         # Call gmtime() per instance, not once at class definition.
     ...         self.time = time.gmtime() if t is None else t
     ...     def __format__(self, spec):
     ...         return fstr(time.strftime(spec, self.time))

     >>> start = GetTime(time.gmtime(1187154773.0085449))

     >>> fstr("Start: {0:%d/%m/%Y %H:%M:%S,<30}").format(start)
     'Start: 15/08/2007 05:12:53           '

After each term is returned from a __format__ call, the result's
__format__ method is called with the next specifier.  GetTime.__format__
returns a string; str.__format__ then aligns it.  A nice left-to-right
sequence of events.

The chaining is handled before the __format__ methods are called, so each
__format__ method only needs to be concerned with doing its own thing.
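A minimal sketch of that left-to-right chaining, written against today's
built-in format() rather than the fstr prototype above.  The helper name
chain_format and the use of a bare split on commas are assumptions made for
illustration; the str.format that ships with Python does not treat commas as
chain delimiters.

```python
import time


def chain_format(value, spec, delim=","):
    """Apply each delimited sub-spec left to right (hypothetical helper)."""
    for sub_spec in spec.split(delim):
        # format() dispatches to type(value).__format__(value, sub_spec).
        # The returned string becomes the input for the next sub-spec.
        value = format(value, sub_spec)
    return value


class GetTime:
    def __init__(self, t):
        self.time = t

    def __format__(self, spec):
        # Only understands strftime codes; knows nothing about alignment.
        return time.strftime(spec, self.time)


start = GetTime(time.gmtime(1187154773.0085449))
aligned = chain_format(start, "%d/%m/%Y %H:%M:%S,<30")
print(repr("Start: " + aligned))
```

GetTime never sees the "<30" part; str.__format__ handles it on the string
that GetTime.__format__ returned.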

The alignment is no longer special-cased, as it's just part of the string
formatter.  No other types need it as long as their __format__ methods
return strings, which means nobody needs to write parsers to handle field
alignments.

If you had explicit conversions for other types besides !r and !s, it might
be useful to chain them as well.  Suppose you had text data with
floats in it along with some other junk.  You could do the following...

      # Purposely longish example just to show sequence of events.

      "The total is: ${0:s-10,!f,(.2),>12}".format(line)

Which would grab the last 10 characters of the line, convert them to a
float, call the float's __format__ method to format it to 2 decimal
places, and then right-align the result in a field 12 characters wide.

That could be shortened to {0:s-10,f(.2),>12} as long as string types know
how to convert to float.  Or if you want the () to line up on both sides,
you'd probably just use {0:s-10,f(7.2)}.
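For comparison, the same sequence of events can be spelled out by hand with
operations Python already has.  This is only a sketch of the steps, not an
implementation of the proposed specifiers: the function name total_field and
the sample line are made up, and the mapping from each sub-spec to a Python
operation is an assumption about what the longer example is meant to do.

```python
def total_field(line):
    """Mimic the steps of 's-10,!f,(.2),>12' using plain Python."""
    tail = line[-10:]               # "s-10": last 10 characters of the line
    number = float(tail)            # "!f": explicit conversion to float
    text = format(number, ".2f")    # "(.2)": float.__format__, 2 decimals
    return format(text, ">12")      # ">12": str.__format__, right-align


# Hypothetical input: some junk followed by a 10-character numeric tail.
line = "widget order, misc junk" + " 1234.5678"
message = "The total is: $" + total_field(line)
print(message)
```

Each step hands its result to the next one, which is the same left-to-right
flow the chained specifier expresses in a single field.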

Along with the nested substitutions Guido wants, this would be a pretty
powerful mini formatting language, like the one Talin hinted at earlier.

I don't think there is any need to limit the number of terms; that sort of
spoils the design.  The two downsides of this are that it's a bit different
from what users are used to, and we would need to escape commas inside of
specifiers somehow.
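One possible way to handle the comma escaping, sketched under the assumption
that a backslash is the escape character.  Nothing in this thread settles on
an escape scheme, so both the backslash convention and the helper name
split_specs are hypothetical.

```python
def split_specs(spec, delim=",", escape="\\"):
    """Split a chained spec on unescaped delimiters (hypothetical scheme)."""
    parts, current = [], []
    chars = iter(spec)
    for ch in chars:
        if ch == escape:
            # Take the next character literally; a trailing escape
            # character with nothing after it is kept as-is.
            current.append(next(chars, escape))
        elif ch == delim:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


# A strftime spec with an embedded comma, followed by an alignment spec.
specs = split_specs(r"%d %b\, %Y,>20")
print(specs)
```

If the chain were limited to two terms instead, as Andrew suggests, splitting
once at the last comma with spec.rpartition(",") would avoid escaping
altogether.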

It simplifies the parsing and formatting code underneath like I was hoping, 
but it may scare some people off.  But the simple common cases are still 
really simple, so I hope not.

BTW... I don't think I can add anything more to this idea.  The rest is 
just implementation details and documentation. :)

Cheers,
    Ron


