[Python-Dev] PEP 3101: Advanced String Formatting

Talin talin at acm.org
Sun Apr 30 22:33:54 CEST 2006


Zachary Pincus wrote:

> I'm not sure about introducing a special syntax for accessing
> dictionary entries, array elements and/or object attributes *within a
> string formatter*... much less an overloaded one that differs from how
> these elements are accessed in "regular python".
> 
>>      Compound names are a sequence of simple names separated by
>>      periods:
>>
>>          "My name is {0.name} :-\{\}".format(dict(name='Fred'))
>>
>>      Compound names can be used to access specific dictionary entries,
>>      array elements, or object attributes.  In the above example, the
>>      '{0.name}' field refers to the dictionary entry 'name' within
>>      positional argument 0.
> 
> 
> Barring ambiguity about whether .name would mean the "name" attribute
> or the "name" dictionary entry if both were defined, I'm not sure I
> really see the point. How is:
>   d = {'last':'foo', 'first':'bar'}
>   "My last name is {0.last}, my first name is {0.first}.".format(d)
> 
> really that big a win over:
>   d = {'last':'foo', 'first':'bar'}
>   "My last name is {0}, my first name is {1}.".format(d['last'], d['first'])

At one point I had intended to abandon the compound-name syntax, until I 
realized that it had one beneficial side-effect, which is that it offers 
a way around the 'dict-copying' problem.

There are a lot of cases where you want to pass an entire dict as the 
format args using the **kwargs syntax. One common use pattern is for 
debugging code, where you want to print out a bunch of variables that 
are in the local scope:

    print "Source file: {file}, line: {line}, column: {col}"\
         .format( **locals() )

The problem with this is one of efficiency - the interpreter handles **
by copying the entire dictionary and merging it with any keyword arguments.
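
To make the copying concrete, here is a minimal sketch. The function name
show_kwargs and the sample values are made up for illustration; the only
claim is that the callee receives a new dict rather than the caller's own.

```python
def show_kwargs(**kwargs):
    # kwargs is a new dict that the interpreter builds for this call
    return kwargs

big = {"file": "spam.py", "line": 10, "col": 4}
received = show_kwargs(**big)

same_contents = (received == big)   # every entry was copied across
same_object = (received is big)     # but it is not the caller's dict
```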

In most situations this is fine; however, if the dictionary is 
particularly large, it might be a problem. So the intent of the
compound name syntax is to allow something very similar:

    print "Source file: {0.file}, line: {0.line}, column: {0.col}"\
         .format( locals() )
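
A rough sketch of the two call styles side by side. One loud assumption:
this uses the str.format method as it eventually shipped in Python, where a
dictionary entry is spelled {0[file]} and the dotted form is reserved for
attributes, rather than the draft spelling shown above. The variable names
and values are made up.

```python
# Stand-in for locals(); the contents are illustrative only.
scope = {"file": "spam.py", "line": 10, "col": 4}

# Keyword style: ** unpacks (and therefore copies) the whole dict.
by_keyword = "Source file: {file}, line: {line}, column: {col}".format(**scope)

# Positional style: the dict is passed as argument 0 and indexed in place.
# ASSUMPTION: shipped str.format syntax ({0[key]}), not the draft {0.key}.
by_position = (
    "Source file: {0[file]}, line: {0[line]}, column: {0[col]}".format(scope)
)
```

Both calls produce the same text; only the second avoids unpacking the
dict into keyword arguments.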

Now, it's true that you could also do this by passing in the 3 parameters 
as individual arguments; however, there have been some strong proponents 
of being able to pass in a single dict, and rather than restating their 
points I'll let them argue their own positions (so as not to 
accidentally misstate them).

> Plus, the in-string syntax is limited -- e.g. what if I want to call a
> function on an attribute? Unless you want to re-implement all python
> syntax within the formatters, someone will always be able to level
> these sort of complaints. Better, IMO, to provide none of that than a
> restricted subset of the language -- especially if the syntax looks and
> works differently from real python.

The in-string syntax is limited deliberately for security reasons. 
Allowing arbitrary executable code within a string is supported by a 
number of other scripting languages, and we've seen a good number of 
exploits as a result.

I chose to support only __getitem__ and __getattr__ because I felt that 
they would be relatively safe; usually (but not always) those functions 
are written in a way that has no side effects.
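
As a small sketch of what that restriction looks like in practice, again
assuming the str.format method as it later shipped (the Person class and
the sample values are hypothetical): attribute and item lookups work, but
a call written inside the field is not executed. It is treated as a
literal attribute name and the lookup fails.

```python
class Person(object):
    # Hypothetical class used only for this illustration.
    def __init__(self, name):
        self.name = name

fred = Person("Fred")

# Attribute lookup (getattr) and item lookup (getitem) are both supported.
attr_result = "My name is {0.name}".format(fred)
item_result = "My name is {0[name]}".format({"name": "Fred"})

# A method call is not: "upper()" is looked up as an attribute name,
# which does not exist, so nothing is ever called.
try:
    "{0.name.upper()}".format(fred)
    call_allowed = True
except AttributeError:
    call_allowed = False
```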

-- Talin

