[Python-Dev] Binary Operator for New-Style String Formatting

Jerry Chen j at 3rdengine.com
Sun Jun 21 19:36:40 CEST 2009


Hello all,

For better or for worse, I have created a patch against the py3k trunk
which introduces a binary operator '@' as an alternative syntax for
the new string formatting system introduced by PEP 3101 ("Advanced
String Formatting"). [1]

For common cases, this syntax should be as simple and as elegant as
its deprecated [2] predecessor ('%'), while also ensuring that more
complex use cases do not suffer needlessly.

I would just like to know whether this idea will float before
submitting the patch on Roundup and going through the formal PEP
process.  This is my first foray into the internals of the Python
core, and with any luck, I did not overlook any BDFL proclamations
banning all new binary operators for string formatting. :-)

QUICK EXAMPLES

    >>> "{} {} {}" @ (1, 2, 3)
    '1 2 3'

    >>> "foo {qux} baz" @ {"qux": "bar"}
    'foo bar baz'

One of the main complaints about a binary operator raised in PEP 3101
was the inability to mix named and unnamed arguments:

    The current practice is to use either a dictionary or a tuple as
    the second argument, but as many people have commented ... this
    lacks flexibility.

To address this, I introduce a convention: if the last element of the
tuple is a dictionary, it supplies the named arguments.

    >>> "{} {qux} {}" @ (1, 3, {"qux": "bar"})
    '1 bar 3'
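
The convention above can be sketched in pure Python on top of
str.format().  This is only an illustration, not the C code in the
patch: split_operand() and at_format() are hypothetical names, and
treating any other bare value as a single positional argument is an
assumption the proposal does not spell out.

```python
def split_operand(operand):
    """Split the right-hand operand into (args, kwargs).

    Hypothetical helper; the name is not taken from the patch.
    """
    if isinstance(operand, dict):
        # A bare dict supplies named arguments only.
        return (), operand
    if isinstance(operand, tuple):
        if operand and isinstance(operand[-1], dict):
            # Trailing dict: named arguments; the rest are positional.
            return operand[:-1], operand[-1]
        return operand, {}
    # Assumption: any other bare value is one positional argument.
    return (operand,), {}


def at_format(template, operand):
    """Emulate the proposed `template @ operand` with str.format()."""
    args, kwargs = split_operand(operand)
    return template.format(*args, **kwargs)


print(at_format("{} {} {}", (1, 2, 3)))                  # 1 2 3
print(at_format("foo {qux} baz", {"qux": "bar"}))        # foo bar baz
print(at_format("{} {qux} {}", (1, 3, {"qux": "bar"})))  # 1 bar 3
```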

Lastly, to print the repr() of a dictionary as an unnamed argument,
one would have to append an additional dictionary so there is no
ambiguity:

    >>> "{}" @ {"foo": "bar"}
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IndexError: tuple index out of range

    >>> "{}" @ ({"foo": "bar"}, {})
    "{'foo': 'bar'}"

Admittedly, these workarounds are less than clean, but the
understanding is that the '@' syntax would only be an alternative, so
one could easily fall back to the str.format() method or the format()
function.
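
For comparison, the fallback needs no workaround, because str.format()
already keeps positional and named arguments apart.  A short sketch
using only the existing API (the variable names are illustrative):

```python
data = {"foo": "bar"}

# A dict passed positionally is formatted as a single value.
positional = "{}".format(data)

# Positional and named arguments can be mixed freely.
mixed = "{} {qux} {}".format(1, 3, qux="bar")

# The built-in format() handles one value with an optional format spec.
single = format(data)

print(positional)  # {'foo': 'bar'}
print(mixed)       # 1 bar 3
print(single)      # {'foo': 'bar'}
```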

IMPLEMENTATION

Code-wise, the grammar was edited per PEP 306 [3], and a
function was introduced in unicodeobject.c as PyUnicode_FormatPrime
(in the mathematical sense of A and A' -- I didn't fully understand or
want to intrude upon the *_FormatAdvanced namespace).

The PyUnicode_FormatPrime function transforms the incoming arguments,
i.e. the operands of the binary '@', and makes the appropriate
do_string_format() call.  Thus, I have reused as much code as
possible.
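
As a rough analogue of that dispatch, and not the C implementation
itself, the same shape can be tried out on Python 3.5 or later, where
'@' exists as the matrix multiplication operator and calls
__matmul__().  AtStr is a hypothetical stand-in for the patched
built-in str, and self.format() stands in for the
do_string_format() call.

```python
class AtStr(str):
    """Hypothetical str subclass that responds to the '@' operator."""

    def __matmul__(self, operand):
        # Normalise the right operand to a tuple (wrapping a bare
        # non-dict value this way is an assumption).
        items = operand if isinstance(operand, tuple) else (operand,)
        # A trailing dict carries the named arguments.
        if items and isinstance(items[-1], dict):
            args, kwargs = items[:-1], items[-1]
        else:
            args, kwargs = items, {}
        # Reuse the existing formatting machinery.
        return self.format(*args, **kwargs)


print(AtStr("{} {qux} {}") @ (1, 3, {"qux": "bar"}))  # 1 bar 3
```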

I have done my development with git using two branches: 'master' and
'subversion', the latter of which can be used to run 'svn update' and
merge back into master.  This way my code changes and the official
ones going into the Subversion repository stay separate, while
'svn diff' can still produce an accurate patch at any given time.

The code is available at:

    http://github.com/jcsalterego/py3k-atsign/

The SVN patch [4] and the related commit [5] are good starting points.

References:

[1] http://www.python.org/dev/peps/pep-3101
[2] http://docs.python.org/3.0/whatsnew/3.0.html
[3] http://www.python.org/dev/peps/pep-0306/
[4] http://github.com/jcsalterego/py3k-atsign/blob/master/py3k-atsign.diff
[5] http://github.com/jcsalterego/py3k-atsign/commit/5c8bdf72d9252cea78af2b7809613f6530e25db4

Thanks,
-- 
Jerry Chen

