[Python-3000] string.format

Talin talin at acm.org
Sun Apr 9 08:13:21 CEST 2006


I was browsing through the C# documentation for the String.Format
function, because I'd noticed that a number of the suggestions
made in the string formatting thread were syntactically similar
to what is used in .Net.

A quick review of .Net string formatting:

Substitution fields are specified using a number in braces: {0},
where the number indicates the argument number to substitute
(named arguments are not supported).

Formatting options are indicated by putting a colon after the
number, followed by a formatting string: {0:x8}, for example,
to indicate an 8-digit hex number.

Note that the formatting options are *options*; they are not
type specifiers. There is no equivalent to %s or %d, and in a
dynamically typed language, there is little need for such.

One interesting thing is that the formatting options are interpreted
differently for different types. Thus, if printing a date, you can
specify something like {0:dd:MM:yy}, whereas this same formatting
option would be meaningless in the case of an integer. (Note that
only the first colon is interpreted as a separator; the rest are passed
on to the type-specific function uninterpreted.)
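
As a rough Python sketch of the "first colon only" rule, assuming a
hypothetical helper called parse_fields (not an existing API), each field
can be split into an argument number and an uninterpreted option string:

```python
import re

# Hypothetical field syntax: "{index}" or "{index:options}".
# Everything after the FIRST colon is captured as-is, further colons included.
_FIELD = re.compile(r"\{(\d+)(?::([^{}]*))?\}")

def parse_fields(template):
    """Return (argument index, raw option string) for each field found."""
    return [(int(m.group(1)), m.group(2) or "")
            for m in _FIELD.finditer(template)]

print(parse_fields("{0} logged in at {1:hh:mm:ss}"))
# [(0, ''), (1, 'hh:mm:ss')]
```

The option string is handed on untouched, so what 'hh:mm:ss' means is
left entirely to the type of argument 1.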

There are two ways in which you can customize the interpretation
of the formatting string. If you are writing a subclass, you can overload
the ToString() function, which is like Python's __str__ function, except
that it takes an optional format string (which is simply the literal text
of the format options following the colon). The other method is to
supply a 'custom formatter' which is passed in as an argument to
String.Format, and which can override the various objects' decisions
as to how to present themselves.
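
A minimal Python sketch of both customization points might look like
this. The names Temperature, to_string and format_value are made up for
illustration, and the 'f' option is an arbitrary choice, not anything
taken from .Net:

```python
class Temperature:
    """Hypothetical type that interprets its own format options."""

    def __init__(self, celsius):
        self.celsius = celsius

    def to_string(self, options=""):
        # 'f' selects Fahrenheit; any other option falls back to Celsius.
        if options == "f":
            return "%.1fF" % (self.celsius * 9 / 5 + 32)
        return "%.1fC" % self.celsius


def format_value(value, options="", formatter=None):
    """Format one argument, letting a custom formatter override the object."""
    if formatter is not None:
        result = formatter(value, options)
        if result is not None:      # None means "no opinion, ask the object"
            return result
    if hasattr(value, "to_string"):
        return value.to_string(options)
    return str(value)


def redact(value, options):
    # Example custom formatter: hide temperatures, ignore everything else.
    return "<hidden>" if isinstance(value, Temperature) else None


print(format_value(Temperature(100), "f"))           # 212.0F
print(format_value(Temperature(100), "f", redact))   # <hidden>
```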

(I wonder if generic dispatch would be useful here - being able to
override the interpretation of the format string on a per-type basis
might be a cleaner way to do it. Problem is, however, you might want
to have different overrides at different times.)
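
One way to picture the generic dispatch idea is with
functools.singledispatch from the standard library. This is only a
sketch: format_value and the option strings are assumptions, and the
date case simply reuses strftime codes rather than any agreed syntax.

```python
import datetime
from functools import singledispatch


@singledispatch
def format_value(value, options=""):
    # Fallback for types with no registered interpretation.
    return str(value)


@format_value.register(int)
def _(value, options=""):
    # Integers understand a single made-up option: 'x' for hex.
    return "%x" % value if options == "x" else str(value)


@format_value.register(datetime.date)
def _(value, options=""):
    # Dates treat the options as a strftime pattern.
    return value.strftime(options) if options else value.isoformat()


print(format_value(255, "x"))                                # ff
print(format_value(datetime.date(2006, 4, 9), "%d:%m:%y"))   # 09:04:06
```

Note that the registry here is global, so a registered override applies
to every caller. That is exactly the limitation mentioned above when
different overrides are wanted at different times.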

In general, I kind of like the brace syntax, assuming of course that named
arguments would be supported. I agree that $name, while popular,
can be troublesome, and while I'm used to using ${name} with Kid
templates, I wonder if one really needs both the prefix and the braces,
especially when you consider that they are only meaningful when
calling format - it's not like Perl where $name works on every string.
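
For comparison, here is a small sketch of the two spellings side by side.
string.Template is the existing standard library class that accepts both
$name and ${name}; fill() is a hypothetical brace-only helper written
just for this illustration:

```python
import re
from string import Template

values = {"user": "talin", "count": 3}

# Existing stdlib behaviour: $name and ${name} are both accepted.
dollar = Template("$user has ${count} messages").substitute(values)


def fill(template, mapping):
    """Hypothetical brace-only form: the call itself marks the string as a
    template, so the fields need no prefix character."""
    return re.sub(r"\{(\w+)\}", lambda m: str(mapping[m.group(1)]), template)


braces = fill("{user} has {count} messages", values)

print(dollar)   # talin has 3 messages
print(braces)   # talin has 3 messages
```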

Other details:

In .Net, literal braces are escaped with {{ and }}; however, \{ and \}
seem a bit more consistent to me.
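
Both escaping styles are easy to sketch. The substitute() function below
is hypothetical and only handles simple {name} fields; the backslash
flag switches between the doubled-brace and the backslash convention:

```python
import re

# Each pattern matches an escaped brace or a simple {name} field.
_DOUBLED = re.compile(r"\{\{|\}\}|\{(\w+)\}")
_BACKSLASH = re.compile(r"\\\{|\\\}|\{(\w+)\}")


def substitute(template, values, backslash=False):
    """Fill {name} fields, turning escaped braces into literal ones."""
    pattern = _BACKSLASH if backslash else _DOUBLED

    def repl(match):
        if match.group(1) is None:
            # An escape sequence: its last character is the literal brace.
            return match.group(0)[-1]
        return str(values[match.group(1)])

    return pattern.sub(repl, template)


print(substitute("{{ {key} }}", {"key": "a"}))                    # { a }
print(substitute(r"\{ {key} \}", {"key": "a"}, backslash=True))   # { a }
```

The backslash form is written as a raw string here so that Python's own
string-literal escaping stays out of the way.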

I am definitely not in favor of being able to put arbitrary expressions
embedded in strings (it's easy enough to move the expression outside,
and it's a potential security hole), however it seems to me that you might
want to add a few convenience features for cases where you are just
passing in locals() as a dict. Perhaps it would be enough to support
just dot (.) and brackets:

    "{name.name} and {name[index]}"

Of course, there's no guarantee that the __getattr__ and __getitem__
functions aren't going to do something potentially bad when invoked
by a maliciously-written format string, but when it comes to that,
there's no guarantee that __str__ has been written sanely either!
In general, however, it's a good bet that for a given random object,
__getattr__ and __getitem__ have fewer side effects than __call__.
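
A sketch of what such a restricted lookup could look like follows. The
expand() helper is hypothetical, and it makes one assumption the text
above does not settle: a bracketed name such as [index] is looked up in
the same namespace, while bracketed digits are used as a literal integer.
Only getattr and item access are ever performed; nothing is called.

```python
import re

# A field is a name followed by any number of ".attr" or "[key]" steps.
_FIELD = re.compile(r"\{(\w+)((?:\.\w+|\[\w+\])*)\}")
_STEP = re.compile(r"\.(\w+)|\[(\w+)\]")


def expand(template, namespace):
    """Fill fields using attribute and item access only, never calls."""
    def repl(match):
        value = namespace[match.group(1)]
        for attr, key in _STEP.findall(match.group(2)):
            if attr:
                value = getattr(value, attr)
            elif key.isdecimal():
                value = value[int(key)]         # literal index, e.g. [0]
            else:
                value = value[namespace[key]]   # assumed: name in namespace
        return str(value)

    return _FIELD.sub(repl, template)


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


scope = {"origin": Point(3, 4), "items": ["a", "b", "c"], "index": 2}
print(expand("{origin.x} and {items[index]} and {items[0]}", scope))
# 3 and c and a
```

Anything that does not fit the field pattern, such as "{origin.x()}", is
left in the output as plain text rather than being evaluated.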

All right, I'm done. Back to my other Python programming... :)

-- Talin



