[Python-3000] More PEP 3101 changes incoming

Sun Aug 5 18:33:29 CEST 2007

Ron Adam wrote:
> Talin wrote:
>> Another thing I want to point out is that Guido and I (in a private 
>> discussion) have resolved our argument about the role of __format__. 
>> Well, not so much *agreed* I guess, more like I capitulated.
> 
> Refer to the message in this thread where I discuss the difference 
> between concrete and abstract format specifiers.  I think this is 
> basically where you and Guido are differing on these issues.  I got the 
> impression you prefer the more abstract interpretation and Guido prefers 
> a more traditional interpretation.  We can have both as long as they are 
> well defined and documented as being one or the other.  It's when we try 
> to make one format specifier have both qualities at different times that 
> it gets messy.
> 
> 
> Here's how the apply_format function could look, we may not be in as 
> much disagreement as you think.
> 
> def apply_format(value, format_spec):
>     abstract = False
>     type = format_spec[0]
>     if type in 'rtgd':
>         abstract = True
>         if format_spec[0] == 'r':      # abstract repr
>             value = repr(value)
>         elif format_spec[0] == 't':    # abstract text
>             value = str(value)
>         elif format_spec[0] == 'g':    # abstract float
>             value = float(value)
>         else:                          # 'd': abstract int
>             value = int(value)
>     return value.__format__(format_spec, abstract)
> 
> The above abstract types use duck typing to convert to concrete types 
> before calling the returned type's __format__ method. There aren't that 
> many abstract types needed.  We only need a few to cover the most common 
> cases.
> 
> That's it.  It's up to each type's __format__ method to figure out things 
> from there.  They can look at the original type spec passed to them and 
> handle special cases if need be.

Let me define some terms again for the discussion. As noted before, the 
',' part is called the alignment specifier. It's no longer appropriate 
to use the term 'conversion specifier', since we're not doing 
conversions, so I guess I will stick with the term 'format specifier' 
for the ':' part.

What Guido wants is for the general 'apply_format' function to not 
examine the format specifier *at all*.

The reason is that for some types, the __format__ method can define its 
own interpretation of the format string which may include the letters 
'rtgd' as part of its regular syntax. Basically, he wants no constraints 
on what __format__ is allowed to do.

Given this constraint, it becomes pretty obvious which attributes go in 
which part. Attributes which are actually involved in generating the 
text (signs and leading digits) would have to go in the 
format_specifier, and attributes which are interpreted by 
apply_format (such as left/right alignment) would have to go in the 
alignment specifier.
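
A minimal sketch of that split, written in present-day Python. The name 
apply_format comes from this thread; the alignment syntax used here (an 
optional '-' for left alignment followed by a minimum width) is only an 
assumption for illustration, not a settled part of the proposal.

```python
def apply_format(value, format_spec, align_spec=""):
    # The format specifier is passed through untouched; only the
    # type's own __format__ method interprets it.
    text = value.__format__(format_spec)

    # Hypothetical alignment syntax: optional '-' (left-align),
    # then a minimum field width in decimal digits.
    left = align_spec.startswith("-")
    digits = align_spec.lstrip("-")
    width = int(digits) if digits else 0
    return text.ljust(width) if left else text.rjust(width)
```

Because apply_format never looks inside format_spec, a type is free to 
give any letter it likes its own meaning.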

Of course, the two can't be entirely isolated because there is 
interaction between the two specifiers for some types. For example, it 
would normally be the case that padding is applied by 'apply_format', 
which knows about the field width and the padding character. However, in 
the case of an integer that is printed with leading zeros, the sign must 
come *before* the padding: '+000000010'. It's not sufficient to simply 
apply padding blindly to the output of __format__, which would give you 
'0000000+10'.

(Maybe leading zeros and padding are different things? But the 
__format__ would still need to know the field width, which is usually 
part of the alignment spec, since it's usually applied as a 
post-processing step by 'apply_format')
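
The sign/padding interaction can be shown with a short sketch. Both 
helper names are hypothetical, and the text "+10" stands in for whatever 
an integer's __format__ might return when asked for an explicit sign.

```python
def pad_blindly(text, width, fill="0"):
    # Post-processing step: pad the finished text without knowing
    # that it starts with a sign.
    return text.rjust(width, fill)


def pad_after_sign(text, width, fill="0"):
    # Sign-aware version: keep a leading '+' or '-' in front and put
    # the fill characters between the sign and the digits.
    if text[:1] in ("+", "-"):
        return text[0] + text[1:].rjust(width - 1, fill)
    return text.rjust(width, fill)


formatted = "+10"
print(pad_blindly(formatted, 10))     # sign stranded among the zeros
print(pad_after_sign(formatted, 10))  # +000000010
```

The second helper needs the field width, which is the reason __format__ 
cannot stay completely ignorant of the alignment spec in this case.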

> If the abstract flag is False and the format_spec type doesn't match the 
> type of the __format__ method's class, then an exception can be raised. 
> This offers a wider range of strictness/leniency to string formatting. 
> There are cases where you may want either.
> 
> 
>> But in any case, the deal is that int, float, and decimal all get to 
>> have a __format__ method which interprets the format string for those 
>> types.
> 
> Good, +1
> 
>> There is no longer any automatic coercion of types based on the format 
>> string
> 
> Ever?  This seems to contradict below where you say int needs to handle 
> float, and float needs to handle int.  Can you explain further?

What I mean is that a float, upon receiving a format specifier of 'd', 
needs to print the number so that it 'looks like' an integer. It doesn't 
actually have to convert it to an int. So 'd' in this case is just a 
synonym for 'f0'.
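
As a rough illustration in present-day Python, where the 'f0' of this 
thread would be spelled '.0f': the LenientFloat class below is 
hypothetical and exists only to show a float-like type accepting 'd' 
without ever converting itself to an int.

```python
class LenientFloat(float):
    # Hypothetical float subclass, used only to illustrate the idea.
    def __format__(self, spec):
        if spec == "d":
            # Print with zero decimal places; no int() conversion.
            return float.__format__(self, ".0f")
        # Everything else is handled by the ordinary float rules.
        return float.__format__(self, spec)


print(format(LenientFloat(10.0), "d"))  # 10
```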

>> - so simply defining an __int__ method for a type is insufficient if 
>> you want to use the 'd' format type. Instead, if you want to use 'd' 
>> you can simply write the following:
>>
>>    class MyClass:
>>       def __format__(self, spec):
>>          return int(self).__format__(spec)
> 
> 
> So if an item has an __int__ method, but not a __format__ method, and 
> you tried to print it with a 'd' format type, it would raise an exception?
> 
> From your descriptions elsewhere in this reply it sounds like it would 
> fall back to string output.  Or am I missing something?

Yes, we have to have some sort of fallback if there's no __format__ 
method at all. My thought here is to coerce to str() in this case.
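
A sketch of that fallback, under one assumption worth stating: in 
present-day Python every object inherits object.__format__, so "no 
__format__ method at all" is approximated here as "the type does not 
define its own". The names format_or_str and HasInt are hypothetical.

```python
class HasInt:
    # Defines __int__ but no __format__ of its own.
    def __int__(self):
        return 7

    def __str__(self):
        return "seven"


def format_or_str(value, format_spec):
    # No type-specific __format__: ignore the spec and coerce to str().
    if type(value).__format__ is object.__format__:
        return str(value)
    return value.__format__(format_spec)


print(format_or_str(HasInt(), "d"))  # seven
print(format_or_str(7, "d"))         # 7
```

Note that having __int__ makes no difference here; the 'd' is simply 
ignored and the string form is used.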

>> So for example, in .Net having a float field of minimum width 10 and a 
>> decimal precision of 3 digits would be ':f3,10'.
> 
> It looks ok to me, but there may be some cases where it could be 
> ambiguous.   How would you specify leading 0's?  Or would we do that in 
> the alignment specifier?
> 
>     {0:f3,-10/0}    '000123.000'

I'm not sure. This is the one case where the two specifiers interact, as 
I mentioned above.

>> Now, as stated above, there's no 'max field width' for any data type 
>> except strings. So in the case of strings, we can re-use the precision 
>> specifier just like C printf does: ':s10' to limit the string to 10 
>> characters. So 's:10,5' to indicate a max width of 10, min width of 5.
> 
> I'm sure you meant '{0:s10,5}' here.

Right.

>> -- For the 'repr' override, Guido suggests putting 'r' in the 
>> alignment field: '{0,r}'. How that mixes with alignment and padding is 
>> unknown, although frankly why anyone would want to pad and align a 
>> repr() is completely beyond me.
> 
> Sometimes it's handy for formatting a variable repr output in columns. 
> Mostly for debugging, learning exercises, or documentation purposes.
> 
> Since there is no actual Repr type, it may seem like it shouldn't be a 
> type specifier. But if you consider it as indirect string type, an 
> abstract type that converts to string type, the idea and implementation 
> works fine and it can then forward its type specifier to the string's 
> __format__ method.  (or not)
> 
> The exact behavior can be flexible.
> 
> To me there is an underlying consistency with grouping abstract/indirect 
> types with more concrete types rather than making an exception in the 
> field alignment specifier.
> 
> Moving repr to the format side sort of breaks the original clean idea of 
> having a field alignment specifier and separate type format specifiers.

This follows from the constraint that apply_format never looks at the 
format specifier, so overrides for repr() can only go in the thing that 
it does look at - the alignment spec.
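
One way this could look, as a sketch only. The function name and the 
alignment syntax (an optional leading 'r' followed by an optional minimum 
width) are assumptions made for illustration; how 'r' should really mix 
with alignment and padding is still open.

```python
def apply_format_with_repr(value, format_spec, align_spec=""):
    # Hypothetical alignment syntax: an optional 'r' flag requesting
    # repr(), followed by an optional minimum field width.
    use_repr = align_spec.startswith("r")
    digits = align_spec[1:] if use_repr else align_spec
    width = int(digits) if digits else 0

    if use_repr:
        # The override bypasses __format__ entirely.
        text = repr(value)
    else:
        # The format specifier is still passed through unexamined.
        text = value.__format__(format_spec)
    return text.rjust(width)


print(apply_format_with_repr("hi", "", "r"))  # 'hi'
```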

> I think if we continue to sort out the detail behaviors of the 
> underlying implementation, the best overall solution will sort itself 
> out.  Good and complete example test cases will help too.
> 
> I think we actually agree on quite a lot so far. :-)

Me too.

> Cheers,
>    Ron
>