[Python-3000] Substantial rewrite of PEP 3101

Mon Jun 4 18:34:47 CEST 2007

Eric V. Smith wrote:
>  > Formatter Creation and Initialization
>  >
>  >     The Formatter class takes a single initialization argument, 'flags':
>  >
>  >         Formatter(flags=0)
>  >
>  >     The 'flags' argument is used to control certain subtle behavioral
>  >     differences in formatting that would be cumbersome to change via
>  >     subclassing. The flags values are defined as static variables
>  >     in the "Formatter" class:
>  >
>  >         Formatter.ALLOW_LEADING_UNDERSCORES
>  >
>  >             By default, leading underscores are not allowed in
>  >             identifier lookups (getattr or getitem).  Setting this
>  >             flag will allow this.
>  >
>  >         Formatter.CHECK_UNUSED_POSITIONAL
>  >
>  >             If this flag is set, then any positional arguments which are
>  >             supplied to the 'format' method but which are not used by
>  >             the format string will cause an error.
>  >
>  >         Formatter.CHECK_UNUSED_NAME
>  >
>  >             If this flag is set, then any named arguments which are
>  >             supplied to the 'format' method but which are not used by
>  >             the format string will cause an error.
> 
> I'm not sure I'm wild about these flags which would have to be or'd
> together, as opposed to discrete parameters.  I realize having a single
> flag field is likely more extensible, but my impression of the
> standard library is a move away from bitfield flags.  Perhaps that's
> only in my own mind, though!

Making them separate fields is fine if that's easier.

Another possibility is to make them setter methods rather than 
constructor params.
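
To make the three options concrete, here is a rough sketch of each. The 
class names, the keyword names and the setter are all hypothetical; only 
the three flag names come from the PEP text quoted above.

```python
# Hypothetical sketch: none of these spellings is a settled API.

class FlagFormatter:
    # Option 1: bit flags, combined with | as in the draft PEP text.
    ALLOW_LEADING_UNDERSCORES = 1
    CHECK_UNUSED_POSITIONAL = 2
    CHECK_UNUSED_NAME = 4

    def __init__(self, flags=0):
        self.flags = flags

    def allows_underscores(self):
        return bool(self.flags & self.ALLOW_LEADING_UNDERSCORES)


class FieldFormatter:
    # Option 2: the same switches as discrete keyword parameters.
    def __init__(self, allow_leading_underscores=False,
                 check_unused_positional=False,
                 check_unused_name=False):
        self.allow_leading_underscores = allow_leading_underscores
        self.check_unused_positional = check_unused_positional
        self.check_unused_name = check_unused_name

    # Option 3: a setter method, so the constructor needs no arguments.
    def set_allow_leading_underscores(self, value=True):
        self.allow_leading_underscores = value


a = FlagFormatter(FlagFormatter.ALLOW_LEADING_UNDERSCORES
                  | FlagFormatter.CHECK_UNUSED_NAME)
b = FieldFormatter(allow_leading_underscores=True, check_unused_name=True)
c = FieldFormatter()
c.set_allow_leading_underscores(True)
```

All three carry the same information; the difference is only in how the 
caller spells it.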

> Also, why put this in the base class at all?  These could all be
> implemented in a derived class (or classes), which would leave the
> base class state-free and therefore without a constructor.

My reason for doing this is as follows.

Certain kinds of customizations are pretty easy to do via subclassing. 
For example, supporting a default namespace takes only a few lines of 
code in a subclass.
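
A minimal sketch of what such a subclass might look like. It is written 
against Python's string.Formatter, whose get_value(key, args, kwargs) 
hook stands in here for the draft's get_named; the class name and the 
'namespace' argument are made up for illustration.

```python
import string


class NamespaceFormatter(string.Formatter):
    """Fall back to a default namespace for names not passed to format()."""

    def __init__(self, namespace=None):
        super().__init__()
        self.namespace = dict(namespace or {})

    def get_value(self, key, args, kwargs):
        # Named fields arrive as strings, positional fields as integers.
        if isinstance(key, str):
            if key in kwargs:
                return kwargs[key]
            return self.namespace[key]
        return super().get_value(key, args, kwargs)


fmt = NamespaceFormatter({"greeting": "Hello"})
result = fmt.format("{greeting}, {name}!", name="world")
# result == "Hello, world!"
```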

Other kinds of customization require replacing a much larger chunk of 
code. Changing the "underscores" and "check-unused" behavior requires 
overriding 'vformat', which means replacing the entire template string 
parser. I figured that a lot of people might want these features 
without having to rewrite all of vformat.

Now, some of this could be resolved by breaking up vformat into a set of 
smaller, overridable functions which controlled these behaviors. 
However, I didn't do this because I didn't want the PEP to micro-manage 
the implementation of vformat - I wanted to leave you guys some leeway 
as to design choices.

For example, I had thought perhaps to break out a separate method that 
would just do the parsing of a replacement field (the part inside the 
brackets) - so in other words, you'd have one function that recognizes 
the start of a replacement field, which then calls a method which 
consumes the contents of that field, and so on. You could also break 
that up into two pieces, one which recognizes the field reference, and 
one which recognizes the conversion string.

However, these various parsing functions aren't entirely isolated from 
each other. The various parsers would need to pass the current parse 
position (character iterator or whatever) and other state back and 
forth. Exposing this requires codifying in the API a lot of the internal 
state of parsing.
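
To illustrate the problem, here is a toy version of that split. It is 
only a sketch under simplifying assumptions: the function names are 
invented, and it ignores escaped braces and nested fields entirely. Note 
how the parse position has to appear in the signature and the return 
value of the inner function, which is exactly the kind of internal state 
that would become part of the API.

```python
def parse_field(template, pos):
    """Consume one replacement field, starting just after the '{'.

    Returns (field_name, conversion, new_pos), where new_pos is the
    index just past the closing '}'.
    """
    end = template.index("}", pos)
    body = template[pos:end]
    name, _, conversion = body.partition(":")
    return name, conversion, end + 1


def parse_template(template):
    """Split a template into literal strings and (name, conversion) pairs."""
    parts = []
    pos = 0
    while pos < len(template):
        start = template.find("{", pos)
        if start == -1:
            # No more fields: the rest is literal text.
            parts.append(template[pos:])
            break
        if start > pos:
            parts.append(template[pos:start])
        # Hand the position to the field parser and take it back after.
        name, conversion, pos = parse_field(template, start + 1)
        parts.append((name, conversion))
    return parts


parts = parse_template("x={0:d}, y={name}")
# parts == ["x=", ("0", "d"), ", y=", ("name", "")]
```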

Also, the syntax defining the end of a replacement field is a mirror of 
the syntax that starts one, and conversion specs can contain 
replacement fields too. This means that the various parsing methods 
aren't entirely independent. (Although I think that in your earlier 
proposal, the syntax for 'internal' replacement fields inside conversion 
specifiers was always the same, regardless of the markup syntax chosen.)
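
A short example of a replacement field nested inside a conversion spec, 
using the brace syntax as str.format implements it in Python. The closing 
brace of the inner field sits directly before the closing brace of the 
outer one, so a parser for the spec cannot simply scan for the first '}'.

```python
# The whole spec of field 0 is taken from argument 1.
padded = "{0:{1}}".format("abc", ">6")
# padded == "   abc"

# Only the precision is taken from argument 1.
rounded = "{0:.{1}f}".format(3.14159, 2)
# rounded == "3.14"
```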

What I wanted to avoid in the PEP was having to specify how all of these 
different parts fit together and the exact nature of the parameters 
being passed between them.

And I think that even if we do break up vformat this way, we still end 
up with people having to replace a fairly substantial chunk of code in 
order to change the behaviors represented by these flags.

>  > Formatter Methods
>  >
>  >     The methods of class Formatter are as follows:
>  >
>  >         -- format(format_string, *args, **kwargs)
>  >         -- vformat(format_string, args, kwargs)
>  >         -- get_positional(args, index)
>  >         -- get_named(kwds, name)
>  >         -- format_field(value, conversion)
> 
> I've started a sample implementation to test this API.  For starters,
> I'm writing it in pure Python, but my intention is to use the code in
> the pep3101 sandbox once I have some tests written and we're happy
> with the API.

Cool.

-- Talin