[Python-3000] More PEP 3101 changes incoming

Thu Aug 2 04:01:02 CEST 2007

I had a long discussion with Guido today, where he pointed out numerous 
flaws and inconsistencies in my PEP that I had overlooked. I won't go 
into all of the details of what he found, but I'd like to focus on what 
came out of the discussion. I'm going to be updating the PEP to 
incorporate the latest thinking, but I thought I would post it on Py3K 
first to see what people think.

The first change is to divide the conversion specifiers into two parts, 
which we will call "alignment specifiers" and "format specifiers". So 
the new syntax for a format field will be:

     valueSpec [,alignmentSpec] [:formatSpec]

In other words, alignmentSpec is introduced by a comma, and formatSpec 
is introduced by a colon. This use of comma and colon is taken 
directly from .Net, although our alignment and format specifiers 
themselves look nothing like the ones in .Net.

Alignment specifiers now include the former 'fill', 'align' and 'width' 
properties. So for example, to indicate a field width of 8:

     "Property count {0,8}".format(propertyCount)

The 'formatSpec' now includes the former 'sign' and 'type' parameters:

     "Number of errors: {0:+d}".format(errCount)

In the preceding example, the field is formatted as an integer preceded 
by a sign for both positive and negative numbers.

There are still some things to be worked out. For example, there are 
currently 3 different meanings of 'width': Minimum width, maximum width, 
and number of digits of decimal precision. The previous version of the 
PEP followed the 2.x convention, which was 'n.n' - 'min.prec' for 
floats, and 'min.max' for everything else. However, that seems confusing.
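
To make the ambiguity concrete, here is a small sketch using the 
existing 2.x-style % operator. The variable names are only for 
illustration. The same 'n.n' spelling means "minimum width, decimal 
precision" for a float but "minimum width, maximum width" for a string:

```python
# Same "8.3" spelling, two different meanings of the second number.
as_float = "%8.3f" % 3.14159     # 8 = minimum width, 3 = digits after the point
as_string = "%8.3s" % "abcdef"   # 8 = minimum width, 3 = maximum characters kept

print(repr(as_float))    # '   3.142'
print(repr(as_string))   # '     abc'
```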

(I'm actually still working out the details - and in fact a little bit 
of a bikeshed discussion would be welcome at this point, as I could use 
some help ironing out these kinds of little inconsistencies.)

In general, you can think of the difference between format specifier and 
alignment specifier as:

     Format Specifier: Controls how the value is converted to a string.
     Alignment Specifier: Controls how the string is placed on the line.
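
A rough sketch of that split as two independent stages. The helper 
names convert() and place(), the '<', '^', '>' alignment codes, and the 
default of right alignment with a space fill are all assumptions made 
for this example and are not settled by the proposal. The conversion 
stage borrows the builtin format() purely for convenience:

```python
def convert(value, format_spec):
    # Format stage: turn the value into a string.
    return format(value, format_spec)

def place(s, width, align=">", fill=" "):
    # Alignment stage: only ever sees the finished string.
    if align == "<":
        return s.ljust(width, fill)
    if align == "^":
        return s.center(width, fill)
    return s.rjust(width, fill)

s = convert(42, "+d")        # "+42"
right = place(s, 8)          # padded on the left
left = place(s, 8, "<")      # padded on the right
```

The point of the design is that place() never needs to know what kind 
of object produced the string.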

Another change in the behavior is that the __format__ special method can 
only be used to override the format specifier - it can't be used to 
override the alignment specifier. The reason is simple: __format__ is 
used to control how your object is string-ified. It shouldn't get 
involved in things like left/right alignment or field width, which are 
really properties of the field, not the object being printed.

The __format__ special method can completely change how the format 
specifier is interpreted. So for example, for Date objects you can 
have a format specifier that looks like the input to strftime().
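
As an illustration only, a hypothetical Date wrapper (not part of the 
proposal) could implement the hook by handing the format specifier 
straight to strftime(). The fallback to ISO format for an empty 
specifier is an assumption of this sketch:

```python
import datetime

class Date:
    """Hypothetical date type used only for this example."""

    def __init__(self, year, month, day):
        self._d = datetime.date(year, month, day)

    def __format__(self, format_spec):
        # Interpret the format specifier as a strftime() pattern.
        if not format_spec:
            return self._d.isoformat()
        return self._d.strftime(format_spec)

posted = Date(2007, 8, 2)
text = posted.__format__("%d/%m/%Y")   # hook called directly for clarity
```

Note that the hook returns only the string form of the date. Field 
width and left/right alignment are applied afterwards, outside of 
__format__.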

However, there are times when you want to override the __format__ hook. 
The primary use case is the 'r' conversion specifier, which is used to 
get the repr() of an object.

At the moment I'm leaning towards using the exclamation mark ('!') to 
indicate this, in a way that's analogous to the CSS "! important" flag - 
it basically means "No, I really mean it!" Two possible syntax 
alternatives are:

     "The repr is {0!r}".format(obj)
     "The repr is {0:r!}".format(obj)

In the first option, we use '!' in place of the colon. In the second 
case, we use '!' as a suffix.
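
A small sketch of the intended effect, using the first spelling. That 
spelling is the one that runs in Python as it eventually shipped with 
str.format(), which is why it is used here. The Temperature class is 
hypothetical:

```python
class Temperature:
    """Hypothetical class with its own __format__ hook."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        return "%.1f degrees" % self.degrees

    def __repr__(self):
        return "Temperature(%r)" % self.degrees

t = Temperature(21.5)
via_hook = "{0}".format(t)     # goes through __format__
via_repr = "{0!r}".format(t)   # '!' bypasses the hook and uses repr()
```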

Another change suggested by Guido is explicit support for the Decimal 
type. Under the current proposal, a format specifier of 'f' will cause 
the Decimal object to be coerced to float before printing. That's not 
what we want, because it will cause a loss of precision. Instead, the 
rule should be that Decimal can use all of the same formatting types as 
float, but it won't try to convert the Decimal to float as an 
intermediate step.
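
To show the precision loss being avoided, here is a short sketch. It 
relies on the decimal module's own 'f' formatting as it exists in 
current Python, which is an assumption relative to the proposal text. 
The value is chosen so that the last digit falls far below what a 
float can represent:

```python
from decimal import Decimal

# 1 plus a digit in the 25th decimal place.
d = Decimal(1) + Decimal("1e-25")

via_float = "%.25f" % float(d)    # coerced to float first: the digit is lost
via_decimal = format(d, ".25f")   # formatted as a Decimal: the digit survives

print(via_float)
print(via_decimal)
```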

Here's some pseudo code outlining how the new formatting algorithm for 
fields will work:

     def format_field(value, alignmentSpec, formatSpec):
         if value has a __format__ attribute, and no '!' flag:
             # Bound method call, so value is not passed a second time
             s = value.__format__(formatSpec)
         else:
             if the formatSpec is 'r':
                 s = repr(value)
             else if the formatSpec is 'd' or one of the integer types:
                 # Coerce to int
                 s = formatInteger(int(value), formatSpec)
             else if the formatSpec is 'f' or one of the float types:
                 if value is a Decimal:
                     # No intermediate float, so no loss of precision
                     s = formatDecimal(value, formatSpec)
                 else:
                     # Coerce to float
                     s = formatFloat(float(value), formatSpec)
             else:
                 s = str(value)

         # Now that we have 's', apply the alignment options
         return applyAlignment(s, alignmentSpec)
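
For anyone who wants to experiment, the pseudo code can be 
approximated in runnable Python. This is only a sketch under several 
assumptions that the proposal does not settle: the '!' flag is 
modelled as a bypass_hook argument, the alignment spec is taken to be 
a bare width that right-justifies with spaces, the builtin format() 
stands in for formatInteger/formatFloat/formatDecimal, and only 
non-builtin types are treated as having a __format__ hook (in current 
Python every object has one).

```python
from decimal import Decimal

INT_TYPES = {"d", "x", "o", "b"}
FLOAT_TYPES = {"e", "f", "g"}
BUILTINS = (int, float, str, Decimal)

def apply_alignment(s, alignment_spec):
    # Assumption: the spec is just a minimum width; pad on the left.
    if not alignment_spec:
        return s
    return s.rjust(int(alignment_spec))

def format_field(value, alignment_spec="", format_spec="", bypass_hook=False):
    kind = format_spec[-1:]   # last character selects the conversion type
    has_hook = (not isinstance(value, BUILTINS)
                and type(value).__format__ is not object.__format__)
    if has_hook and not bypass_hook:
        s = value.__format__(format_spec)
    elif format_spec == "r":
        s = repr(value)
    elif kind in INT_TYPES:
        s = format(int(value), format_spec)        # coerce to int
    elif kind in FLOAT_TYPES:
        if isinstance(value, Decimal):
            s = format(value, format_spec)         # no float round trip
        else:
            s = format(float(value), format_spec)  # coerce to float
    else:
        s = str(value)
    # Alignment is applied last and never reaches __format__.
    return apply_alignment(s, alignment_spec)
```

For example, format_field(7, "5", "+d") pads the signed integer to a 
width of five, and passing bypass_hook=True with a format spec of 'r' 
returns the repr() of an object even if it defines __format__.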

My goal is to get a C implementation of just this function working 
some time in the next several weeks. Most of the complexity of the PEP 
implementation is right here, IMHO.

Before I edit the PEP I'm going to let this marinate for a week and see 
what the discussion brings up.

-- Talin