[Python-3000] Proposed changes to PEP3101 advanced string formatting -- please discuss and vote!

Wed Mar 14 16:21:32 CET 2007

On 3/14/07, Nick Coghlan <ncoghlan at gmail.com> wrote:

> Including a very limited set of alternate syntaxes in the base
> formatting operation seems like a bad idea from a readability point of
> view.

The proposal includes the selection of (non-default) format options
via markup in the string itself.  It's easy to see when something
different is going on.

> the fact that the proposed alternatives happen to be easy to
> implement is no real justification at all.

I absolutely agree with this statement.  But sometimes even good ideas
are left unimplemented because the implementation is difficult or
error-prone, so I wasn't presenting this data point as a
justification, just pointing out the lack of an impediment.

> For serious template usage, string.Template is a much stronger
> foundation for modification (as it allows complete alteration of the
> regular expression used to detect fields for substitution).

Agreed, but the real question is whether there are intermediate
use-cases where it would be nice to support some different data domains
without requiring the programmer to learn and understand all the
features of yet another package, especially when there are obvious,
known, simple use-cases for different syntaxes.
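
For reference, here is a minimal sketch of the string.Template route
being discussed.  It changes only the `delimiter` class attribute, which
is the simplest lever; the `pattern` attribute can be overridden for
full control of the regular expression.  The class name PercentTemplate
and the sample data are made up for illustration.

```python
from string import Template


class PercentTemplate(Template):
    """Hypothetical subclass: fields are marked with '%' instead of '$'."""
    delimiter = "%"


# With '%' as the delimiter, '$' is ordinary text, which suits a data
# domain (prices, shell snippets) where '$' occurs naturally.
greeting = PercentTemplate("Dear %name, your total is $%amount")

# The same template object is reused for several substitutions.
first = greeting.substitute(name="Ann", amount="5.00")
second = greeting.substitute(name="Bob", amount="7.50")

print(first)
print(second)
```

This works, but it does mean learning a second mechanism just to change
the field marker.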

> A string.FormattedTemplate class that combined the pattern matching
> flexibility of string.Template with the formatting options of PEP 3101
> would provide a significantly more powerful approach to supporting
> alternate syntaxes (as well as providing a good workout for the
> reusability of the formatting internals).

Agreed.  I view templates as more "heavyweight", though.  In my view,
templates are optimized for being reused many times, and
'somestring'.format() is optimized for single one-off calls.  Another
issue is the cost of learning something for one-time use.  To put it
another way, if I am doing floating point math, I won't bother to learn
and start using numeric python until it really hurts.  The same thing
is probably true of templates.  There is certainly a delicate balance
to be struck in deciding the right amount of functionality for built-in
stuff vs. libraries, and multiple syntaxes might just tip that
balance.

So another question here is, would this extra functionality be as
frowned upon if it were only available in the string module functions?

> I suggest trawling through the py3k archive for a bit before deciding
> whether or not you feel it is worth getting distracted by this argument
> in order to save typing a dozen characters (or writing a two line
> utility function that you put in a support module somewhere).

For me, it's not about either of those things.  (See my earlier post
on the possibility of doing string.expand()).  The stackframe/utility
function you mention might not work on all versions of Python, but it
is easy for a builtin to access the stackframe.  Do you have a real
concern with the "utility function" being built in, like the proposed
expand()?
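
To make the portability concern concrete, here is a rough sketch of
such a utility function.  It assumes CPython, where sys._getframe() is
available (it is documented as an implementation detail and is not
guaranteed elsewhere).  The name expand() is borrowed from the proposal;
the body is only an illustration, not the proposed implementation.

```python
import sys


def expand(template):
    """Format `template` using the caller's global and local variables.

    Illustrative only: relies on sys._getframe(), which is specific to
    CPython and may be missing in other implementations.
    """
    frame = sys._getframe(1)           # the caller's stack frame
    namespace = dict(frame.f_globals)  # start with the caller's globals
    namespace.update(frame.f_locals)   # locals shadow globals
    return template.format_map(namespace)


def demo():
    name = "world"
    count = 3
    return expand("hello {name}, {count} times")


print(demo())
```

A builtin would not need the frame trick to be exposed to user code,
which is the point of asking whether it could live in the core.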

> I detest the name 'flag_format', though - the function doesn't format a
> flag!

I don't like it that much myself, either.  I considered
"extended_format" and a few others, but wasn't happy with any of them.
 Name suggestions are certainly welcome.

> I don't like this - the caller has to provide a template that uses a
> hook specifier in the appropriate place, as well as providing the
> correct hook function. That kind of cross-dependency is ugly.

But the other way, you have a hook function which has to know exactly
which objects it is expected to format.  You've really just moved the
cross-dependency.

> The approach in the PEP is simple - every field is passed to the hook
> function, and the hook function decides whether or not it wants to
> override the default handling.

Which requires a lot of intelligence in the hook function.  If you are
displaying two integers, and you want to display one of them in the
default format, and the other one in base 37, you will need to
specify, in the field specifier of the format string, exactly how you
want a particular integer displayed, in a fashion which is
understandable to the hook function.  So you really haven't even
removed the dependency between the format string and the hook
function.  All you've done is made the hook function more complicated
by forcing it to return None on any object that isn't an integer, or
any integer field specifier that doesn't say "base37", or something.
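
A small sketch of what that looks like in practice.  It uses a
string.Formatter subclass as a stand-in for the hook mechanism, so the
exact calling convention is an assumption rather than what the PEP
specifies; the names hook(), to_base37() and HookFormatter, and the
choice of '@' as the 37th digit, are made up for the example.

```python
import string

# 37 digit symbols: 0-9, a-z, plus '@' as an arbitrary 37th symbol.
DIGITS = string.digits + string.ascii_lowercase + "@"


def to_base37(n):
    """Render an integer in base 37 using DIGITS."""
    if n == 0:
        return "0"
    sign = "-" if n < 0 else ""
    n = abs(n)
    out = []
    while n:
        n, r = divmod(n, 37)
        out.append(DIGITS[r])
    return sign + "".join(reversed(out))


def hook(value, spec):
    """Return a string to override formatting, or None to decline."""
    if isinstance(value, int) and spec == "base37":
        return to_base37(value)
    return None  # not an integer, or not a spec this hook understands


class HookFormatter(string.Formatter):
    """Passes every field to hook() and falls back to the default."""

    def format_field(self, value, format_spec):
        result = hook(value, format_spec)
        if result is None:
            return super().format_field(value, format_spec)
        return result


# The format string still has to say "base37" for the second field,
# so the string and the hook remain coupled.
text = HookFormatter().format("{0} {1:base37}", 74, 74)
print(text)
```

Note that the hook is called for both fields and must decline the first
one itself.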

> Keep the string method simple, leave the flexibility and configurability
> for the underlying functions in the string module.

OK, but it would be extra (and unnecessary IMO) work to allow a
function in the string module to support this particular functionality
but disallow it in the string method.

> This is only an issue if implicit access to locals()/globals() is
> permitted, and is unlikely to help much in that case (underscores are
> rarely used with local variables, and those are the most likely to
> contain juicy information which may be leaked)

Eric already clarified this, but I wanted to reiterate that this is
about attribute lookup as well as variable name lookup (and it's most
consistent and easier to explain in the final docs if we just say that
"identifiers cannot have leading underscores").
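
As an illustration of the rule, here is a sketch that rejects any part
of a field name beginning with an underscore, whether it is the variable
name, an attribute, or an index.  It is built on a string.Formatter
subclass purely for demonstration; the class name NoUnderscoreFormatter
and the splitting regex are assumptions, not the proposed
implementation.

```python
import re
import string


class NoUnderscoreFormatter(string.Formatter):
    """Hypothetical formatter that refuses leading-underscore names."""

    def get_field(self, field_name, args, kwargs):
        # Split "obj.attr[key].other" into its component names.
        for part in re.split(r"[.\[\]]", field_name):
            if part.startswith("_"):
                raise ValueError(
                    "identifiers cannot have leading underscores: %r" % part
                )
        return super().get_field(field_name, args, kwargs)


fmt = NoUnderscoreFormatter()

# Ordinary attribute lookup is allowed.
ok = fmt.format("{0.real}", 3)

# Reaching into private or special attributes is refused.
try:
    fmt.format("{0.__class__}", 3)
    blocked = False
except ValueError:
    blocked = True

print(ok, blocked)
```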

Thanks for the feedback.

Regards,
Pat