[Python-3000] Proposed changes to PEP3101 advanced string formatting -- please discuss and vote!

Wed Mar 14 12:09:48 CET 2007

(I've read the whole thread as it currently stands - responding to the 
initial post to cover the various topics)

Patrick Maupin wrote:
> Feature:  Alternate syntaxes for escape to markup.
> The syntaxes are similar enough that they can all be efficiently
> parsed by the same loop, so there are no real implementation issues.
> The currently contemplated method for declaring a markup syntax is by
> using decorator-style  markup, e.g. {@syntax1} inside the string,
> although I am open to suggestions about better-looking ways to do
> this.

-1

Including a very limited set of alternate syntaxes in the base 
formatting operation seems like a bad idea from a readability point of 
view - the fact that the proposed alternatives happen to be easy to 
implement is no real justification at all.

For serious template usage, string.Template is a much stronger 
foundation for modification (as it allows complete alteration of the 
regular expression used to detect fields for substitution).

A string.FormattedTemplate class that combined the pattern matching 
flexibility of string.Template with the formatting options of PEP 3101 
would provide a significantly more powerful approach to supporting 
alternate syntaxes (as well as providing a good workout for the 
reusability of the formatting internals).
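
To make the idea concrete, here is a minimal sketch of what such a class 
might look like. Everything in it is an assumption for illustration: the 
class names, the 'name' and 'spec' group names, and the use of the 
builtin format() function as a stand-in for the PEP 3101 formatting 
internals. It does not subclass string.Template; it only borrows the 
idea of a replaceable regular expression.

```python
import re


class FormattedTemplate:
    """Hypothetical class: replaceable field pattern plus format specs."""

    # Default syntax: ${name} or ${name:spec}. Subclasses may swap the regex,
    # provided it still defines the 'name' and 'spec' groups.
    pattern = re.compile(r"\$\{(?P<name>[A-Za-z_]\w*)(?::(?P<spec>[^}]*))?\}")

    def __init__(self, template):
        self.template = template

    def substitute(self, **values):
        def replace(match):
            spec = match.group("spec") or ""
            # format() applies the spec to the looked-up value.
            return format(values[match.group("name")], spec)

        return self.pattern.sub(replace, self.template)


class AngleTemplate(FormattedTemplate):
    # An alternate markup syntax is just a different regular expression.
    pattern = re.compile(r"<<(?P<name>[A-Za-z_]\w*)(?::(?P<spec>[^>]*))?>>")


default_text = FormattedTemplate("Total: ${price:.2f}").substitute(price=3.5)
angle_text = AngleTemplate("Total: <<price:.2f>>").substitute(price=3.5)
```

Both templates produce the same text; only the field-detection pattern 
differs, which is the point of keeping alternate syntaxes out of the 
base formatting operation.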

> Feature:  Automatic search of locals() and globals() for name lookups
> if no parameters are given.
> 
> This is contentious because it violates EIBTI.  However, it is
> extremely convenient.  To me, the reasons for allowing or disallowing
> this feature on 'somestring'.format() appear to be exactly the same as
> the reasons for allowing or disallowing this feature on
> eval('somestring').   Barring a distinction between these cases that I
> have not noticed, I think that if we don't want to allow this for
> 'somestring'.format(), then we should seriously consider removing the
> capability in Python 3000 for eval('somestring').

-1

As others have noted, any use of eval() is already enough to raise alarm 
bells for a reviewer. It would be a pain if every use of string 
formatting had to be subjected to the same level of scrutiny.

I suggest trawling through the py3k archive for a bit before deciding 
whether or not you feel it is worth getting distracted by this argument 
in order to save typing a dozen characters (or writing a two line 
utility function that you put in a support module somewhere).

A reasonable place to start might be:
http://mail.python.org/pipermail/python-3000/2006-December/004971.html
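
For reference, the kind of small utility function I have in mind could 
be sketched as follows. The function name is hypothetical, it assumes a 
str.format_map() style method that accepts a single mapping, and 
sys._getframe() is a CPython implementation detail, so treat it as an 
illustration rather than a recommendation.

```python
import sys


def format_with_caller_names(template):
    """Hypothetical helper: format using the caller's globals and locals."""
    frame = sys._getframe(1)          # the caller's frame (CPython-specific)
    namespace = dict(frame.f_globals)
    namespace.update(frame.f_locals)  # locals shadow globals
    return template.format_map(namespace)


def greet():
    user = "nick"
    return format_with_caller_names("hello {user}")
```

The implicit lookup is then visible at the call site through the 
helper's name, rather than being hidden behind every call to format().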

> Feature: Ability to pass in a dictionary or tuple of dictionaries of
> namespaces to search.
> 
> This feature allows, in some cases, for much more dynamic code than
> **kwargs.  (You could manually smush multiple dictionaries together to
> build kwargs, but that can be ugly, tedious, and slow.)
> Implementation-wise, this feature and locals() / globals() go hand in
> hand.

+1
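
As a rough illustration of searching several namespaces without merging 
them, here is a sketch that assumes collections.ChainMap and 
str.format_map(); neither is part of the proposal being voted on, and 
the dictionary contents are made up.

```python
from collections import ChainMap

defaults = {"user": "guest", "lang": "en"}
overrides = {"user": "nick"}

# ChainMap searches each mapping in order without copying them into one dict.
namespaces = ChainMap(overrides, defaults)
message = "{user} ({lang})".format_map(namespaces)
```

Here 'user' is found in the first mapping and 'lang' in the second, so 
the lookup order is explicit in how the namespaces are listed.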

> Feature:  Placement of a dummy record on the traceback stack for
> underlying errors.
> Removed feature:  Ability to dump error information into the output string.

+0

> Feature: Addition of functions and "constants" to string module.
> 
> The PEP proposes doing everything as string methods, with a "cformat"
> method allowing some access to the underlying machinery.  I propose
> only having a 'format' method of the string (unicode) type, and a
> corresponding 'format' and extended 'flag_format' function in the
> string module, along with definitions for the flags for access to
> non-default underlying format machinery.

+1

I significantly prefer this to the approach currently in the PEP - it 
keeps the string type's namespace comparatively clean, while providing 
access to the underlying building blocks when someone needs something 
which is 'similar but different'.

I detest the name 'flag_format', though - the function doesn't format a 
flag!

> Feature: Ability for "field hook" user code function to only be called
> on some fields.
> 
> The PEP takes an all-or-nothing approach to the field hook -- it is
> either called on every field or no fields.  Furthermore, it is only
> available for calling if the extended function ('somestring'.cformat()
> in the spec, string.flag_format() in this proposal) is called.  The
> proposed change keeps this functionality, but also adds a field type
> specifier 'h' which causes the field hook to be called as needed on a
> per-field basis.  This latter method can even be used from the default
> 'somestring'.format() method.

-1

I don't like this - the caller has to provide a template that uses a 
hook specifier in the appropriate place, as well as providing the 
correct hook function. That kind of cross-dependency is ugly.

The approach in the PEP is simple - every field is passed to the hook 
function, and the hook function decides whether or not it wants to 
override the default handling.

Keep the string method simple, leave the flexibility and configurability 
for the underlying functions in the string module.
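
The all-fields approach can be sketched as below. This assumes a 
string.Formatter class with an overridable format_field() method; the 
HookFormatter and upper_strings names, and the convention that the hook 
returns None to decline, are assumptions made for the example.

```python
from string import Formatter


class HookFormatter(Formatter):
    """Hypothetical: offer every field to a hook, fall back to the default."""

    def __init__(self, hook):
        self.hook = hook

    def format_field(self, value, format_spec):
        result = self.hook(value, format_spec)
        if result is None:
            # The hook declined, so use the default handling.
            return super().format_field(value, format_spec)
        return result


def upper_strings(value, format_spec):
    # Override string fields only; return None to leave other types alone.
    if isinstance(value, str):
        return format(value.upper(), format_spec)
    return None


formatter = HookFormatter(upper_strings)
text = formatter.format("{name}: {count:03d}", name="spam", count=7)
```

The template needs no special marker: the decision about which fields 
to override lives entirely in the hook function.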

> Changed feature: By default, not using all arguments is not an exception
>
> Also, it is arguably not Pythonic to require a check that all
> arguments to a function are actually used by the execution of the
> function (but see interfaces!),  and format() is, after all, just
> another function.  So it seems that the default should be to not check
> that all the arguments are used.  In fact, there are similar reasons
> for not using all the arguments here as with any other function.  For
> example, for customization, the format method of a string might be
> called with a superset of all the information which might be useful to
> view.

+1

This seems like a good idea to me.
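
A sketch of how the lenient default and an opt-in strict check could 
coexist, assuming a string.Formatter class with a check_unused_args() 
hook (the StrictFormatter name is hypothetical):

```python
from string import Formatter


class StrictFormatter(Formatter):
    """Hypothetical: raise if any supplied argument is never referenced."""

    def check_unused_args(self, used_args, args, kwargs):
        # used_args holds the positional indexes and keyword names consumed.
        unused = [key for key in kwargs if key not in used_args]
        unused += [i for i in range(len(args)) if i not in used_args]
        if unused:
            raise ValueError("unused arguments: %r" % (unused,))


# The plain string method ignores the extra keyword argument.
lenient = "{name}".format(name="spam", extra="ignored")

try:
    StrictFormatter().format("{name}", name="spam", extra="ignored")
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```

Callers who want the check pay for it explicitly, while the common case 
of passing a superset of the available information keeps working.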

> Feature:  Ability to insert non-printing comments in format strings
> 
> This feature is implemented in a very intuitive way, e.g. " text {#
> your comment here} more text" (example shown with the default
> transition to markup syntax).  One of the nice benefits of this
> feature is the ability to break up long source lines (if you have lots
> of long variable names and attribute lookups).

+0

Also a reasonable idea.
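
To show the line-breaking benefit, here is a sketch that emulates the 
proposed {# ...} comments by stripping them before formatting. The 
strip_comments helper and its regular expression are assumptions for 
illustration only; the proposal would handle comments inside the 
formatting loop itself.

```python
import re

# Matches "{# any text without braces}".
_COMMENT = re.compile(r"\{#[^{}]*\}")


def strip_comments(template):
    """Hypothetical helper: drop {# ...} comments from a format string."""
    return _COMMENT.sub("", template)


template = (
    "Dear {name},{# salutation ends here}"
    " your balance is {balance:.2f}"
)
text = strip_comments(template).format(name="Nick", balance=12.5)
```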

> Feature:  Exception raised if attribute with leading underscore accessed.
> 
> The syntax supported by the PEP is deliberately limited in an attempt
> to increase security.  This is an additional security measure, which
> is on by default, but can be optionally disabled if
> string.flag_format() is used instead of 'somestring'.format().

-0

This is only an issue if implicit access to locals()/globals() is 
permitted, and is unlikely to help much in that case (underscores are 
rarely used with local variables, and those are the most likely to 
contain juicy information which may be leaked).
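
For comparison, the underscore check itself is easy to sketch on top of 
a string.Formatter class with an overridable get_field() method. The 
class name and the regular expression are assumptions, and the check 
below only looks at the first name and at attribute access, not at 
indexing such as {obj[_key]}.

```python
import re
from string import Formatter


class NoUnderscoreFormatter(Formatter):
    """Hypothetical: refuse field names with a leading underscore."""

    def get_field(self, field_name, args, kwargs):
        # Reject "_x" and "obj._secret" before any lookup is performed.
        if re.search(r"(^|\.)_", field_name):
            raise AttributeError(
                "leading underscore in field: %r" % (field_name,)
            )
        return super().get_field(field_name, args, kwargs)


safe = NoUnderscoreFormatter().format("{x.real}", x=3)

try:
    NoUnderscoreFormatter().format("{x.__class__}", x=3)
    blocked = False
except AttributeError:
    blocked = True
```

Note that this does nothing for an ordinary local such as 'password', 
which is the kind of name most likely to hold sensitive data.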

> Feature: Support for "center" alignment.
> 
> The field specifier uses "<" and ">" for left and right alignment.
> This adds "^" for center alignment.

+0
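
A quick sketch of the three alignment characters side by side, assuming 
the {:spec} field syntax and a fill width of 8:

```python
rows = [
    "{:<8}|".format("left"),   # pad on the right
    "{:>8}|".format("right"),  # pad on the left
    "{:^8}|".format("mid"),    # pad on both sides
]
```

With an odd amount of padding, as in the last row, the extra space ends 
up on the right.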

> Feature: support of earlier versions of Python
> Feature: no global state

Both significant improvements, in my opinion.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org