[Python-Dev] Subtle difference between f-strings and str.format()

Serhiy Storchaka storchaka at gmail.com
Wed Mar 28 11:27:19 EDT 2018


There is a subtle semantic difference between str.format() and the 
"equivalent" f-string.

     '{}{}'.format(a, b)
     f'{a}{b}'

In the former case b is evaluated before formatting a. This is equivalent to

     t1 = a
     t2 = b
     t3 = format(t1)
     t4 = format(t2)
     r = t3 + t4

In the latter case a is formatted before evaluating b. This is equivalent to

     t1 = a
     t2 = format(t1)
     t3 = b
     t4 = format(t3)
     r = t2 + t4
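
The difference becomes observable as soon as evaluating or formatting a value has a side effect. Here is a minimal sketch, assuming a hypothetical helper class Traced and a hypothetical factory make() (neither is part of any library) that record each step in a shared list:

```python
log = []  # records the order of evaluation and formatting steps


class Traced:
    """Hypothetical helper: logs when it is formatted."""

    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        log.append("format " + self.name)
        return self.name


def make(name):
    """Hypothetical factory: logs when the subexpression is evaluated."""
    log.append("eval " + name)
    return Traced(name)


def order_with_format():
    log.clear()
    result = '{}{}'.format(make('a'), make('b'))
    return result, list(log)


def order_with_fstring():
    log.clear()
    result = f'{make("a")}{make("b")}'
    return result, list(log)


print(order_with_format())
# ('ab', ['eval a', 'eval b', 'format a', 'format b'])
print(order_with_fstring())
# ('ab', ['eval a', 'format a', 'eval b', 'format b'])
```

Both calls produce the same string; only the interleaving of the logged steps differs.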

In most cases this doesn't matter, but when implementing the optimization 
that transforms the former expression into the latter one ([1], [2]) 
we have to decide what to do about this difference.

1. Keep the exact semantics of str.format() when optimizing it. This means 
that it should be transformed into an AST node different from the AST node 
used for f-strings. Either introduce a new AST node type, or add a 
boolean flag to JoinedStr.

2. Change the semantics of f-strings. Make them closer to the semantics of 
str.format(): evaluate all subexpressions first, then format them. This 
can be implemented in two ways:

2a) Add additional instructions for stack manipulations. This will slow 
down f-strings.

2b) Introduce a new complex opcode that will replace FORMAT_VALUE and 
BUILD_STRING. This will speed up f-strings.

3. Transform str.format() into an f-string, changing the semantics, and 
ignore this change. This is not new: the optimizer already changes 
semantics. Non-optimized "if a and True:" would call bool(a) twice, but 
optimized code calls it only once.

[1] https://bugs.python.org/issue28307
[2] https://bugs.python.org/issue28308


