[issue25483] Improve f-string implementation: FORMAT_VALUE opcode

Eric V. Smith report at bugs.python.org
Mon Oct 26 11:44:59 EDT 2015


New submission from Eric V. Smith:

Currently, the f-string f'a{3!r:10}' evaluates to bytecode that does the same thing as:

''.join(['a', format(repr(3), '10')])

That is, it literally calls the functions format() and repr(). The same holds true for str() and ascii() with !s and !a, respectively.

By redefining format, str, repr, and ascii, you can break or pervert the computation of the f-string's value:

>>> def format(v, fmt=None): return '42'
...
>>> f'{3}'
'42'

It's always been my intention to fix this. This patch adds an opcode FORMAT_VALUE, which instead of looking up format, etc., directly calls PyObject_Format, PyObject_Str, PyObject_Repr, and PyObject_ASCII. Thus, you can no longer modify what an f-string produces merely by overriding the named functions.


In addition, because I'm now saving the name lookups and function calls, performance is improved.

Here are the times without this patch:

$ ./python -m timeit -s 'x="test"' 'f"{x}"'
1000000 loops, best of 3: 0.3 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!s}"'
1000000 loops, best of 3: 0.511 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!r}"'
1000000 loops, best of 3: 0.497 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!a}"'
1000000 loops, best of 3: 0.461 usec per loop


And with this patch:

$ ./python -m timeit -s 'x="test"' 'f"{x}"'
10000000 loops, best of 3: 0.02 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!s}"'
100000000 loops, best of 3: 0.02 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!r}"'
10000000 loops, best of 3: 0.0896 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!a}"'
10000000 loops, best of 3: 0.0923 usec per loop


So a 90%+ speedup, for these simple cases.

Also, now f-strings are faster than %-formatting, at least for some types:

$ ./python -m timeit -s 'x="test"' '"%s"%x'
10000000 loops, best of 3: 0.0755 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x}"'
10000000 loops, best of 3: 0.02 usec per loop


Note that people often "benchmark" %-formatting with code like the following. But the optimizer converts this to a constant string, so it's not a fair comparison:

$ ./python -m timeit '"%s"%"test"'
100000000 loops, best of 3: 0.0161 usec per loop


These microbenchmarks aren't the end of the story, since the string concatenation also takes some time. That's another optimization I might implement in the future.

Thanks to Mark and Larry for some advice on this.

----------
assignee: eric.smith
components: Interpreter Core
files: format-opcode.diff
keywords: patch
messages: 253476
nosy: Mark.Shannon, eric.smith, larry
priority: normal
severity: normal
stage: patch review
status: open
title: Improve f-string implementation: FORMAT_VALUE opcode
type: enhancement
versions: Python 3.6
Added file: http://bugs.python.org/file40863/format-opcode.diff

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25483>
_______________________________________


More information about the Python-bugs-list mailing list