[New-bugs-announce] [issue33325] Optimize sequences of constants in the compiler

Serhiy Storchaka report at bugs.python.org
Sat Apr 21 08:09:20 EDT 2018


New submission from Serhiy Storchaka <storchaka+cpython at gmail.com>:

The following PR makes three optimizations in the compiler.

1. A sequence of several LOAD_CONSTs is replaced with a single LOAD_CONST followed by UNPACK_SEQUENCE.

For example, "{'a': 1, 'b': 2, 'c': 3}" is currently compiled to

  1           0 LOAD_CONST               0 (1)
              2 LOAD_CONST               1 (2)
              4 LOAD_CONST               2 (3)
              6 LOAD_CONST               3 (('a', 'b', 'c'))
              8 BUILD_CONST_KEY_MAP      3
             10 POP_TOP
             12 LOAD_CONST               4 (None)
             14 RETURN_VALUE

With this optimization it will be compiled to:

  1           0 LOAD_CONST               5 ((('a', 'b', 'c'), 3, 2, 1))
              2 UNPACK_SEQUENCE          4
              4 BUILD_CONST_KEY_MAP      3
              6 POP_TOP
              8 LOAD_CONST               4 (None)
             10 RETURN_VALUE

2. Optimized building lists and sets of constants. [1, 2, 3, 4, 5] will be compiled to [*(1, 2, 3, 4, 5)], and {1, 2, 3, 4, 5} will be compiled to {*frozenset(1, 2, 3, 4, 5)}, where (1, 2, 3, 4, 5) and frozenset(1, 2, 3, 4, 5) are just constants.

x = [1, 2, 3, 4, 5]
y = {1, 2, 3, 4, 5}

currently is compiled to

  1           0 LOAD_CONST               0 (1)
              2 LOAD_CONST               1 (2)
              4 LOAD_CONST               2 (3)
              6 LOAD_CONST               3 (4)
              8 LOAD_CONST               4 (5)
             10 BUILD_LIST               5
             12 STORE_NAME               0 (x)

  2          14 LOAD_CONST               0 (1)
             16 LOAD_CONST               1 (2)
             18 LOAD_CONST               2 (3)
             20 LOAD_CONST               3 (4)
             22 LOAD_CONST               4 (5)
             24 BUILD_SET                5
             26 STORE_NAME               1 (y)
             28 LOAD_CONST               5 (None)
             30 RETURN_VALUE

With optimization 1 it will be compiled to

  1           0 LOAD_CONST               6 ((5, 4, 3, 2, 1))
              2 UNPACK_SEQUENCE          5
              4 BUILD_LIST               5
              6 STORE_NAME               0 (x)

  2           8 LOAD_CONST               6 ((5, 4, 3, 2, 1))
             10 UNPACK_SEQUENCE          5
             12 BUILD_SET                5
             14 STORE_NAME               1 (y)
             16 LOAD_CONST               5 (None)
             18 RETURN_VALUE

And with optimization 2 it will be compiled to

  1           0 LOAD_CONST               0 ((1, 2, 3, 4, 5))
              2 BUILD_LIST_UNPACK        1
              4 STORE_NAME               0 (x)

  2           6 LOAD_CONST               1 (frozenset({1, 2, 3, 4, 5}))
              8 BUILD_SET_UNPACK         1
             10 STORE_NAME               1 (y)
             12 LOAD_CONST               2 (None)
             14 RETURN_VALUE

3. Remove unused constants.

After folding tuples of constants created at code generation level, eliminating unreachable code, and after the above two optimizations, unused constants are left in the co_consts tuple. The third optimization removes them and reenumerate constants in the order of occurrence. The above example will be compiled to:

  1           0 LOAD_CONST               0 ((1, 2, 3, 4, 5))
              2 BUILD_LIST_UNPACK        1
              4 STORE_NAME               0 (x)

  2           6 LOAD_CONST               1 (frozenset({1, 2, 3, 4, 5}))
              8 BUILD_SET_UNPACK         1
             10 STORE_NAME               1 (y)
             12 LOAD_CONST               2 (None)
             14 RETURN_VALUE

See issue28813 for the implementation of this optimization on the level of raw bytecode (in peephole.c). 

These optimizations are useful not only for initializing collections of constants. Calling function with constant arguments "f(x, a=1, b=2)":

Current:
  1           0 LOAD_NAME                0 (f)
              2 LOAD_NAME                1 (x)
              4 LOAD_CONST               0 (None)
              6 LOAD_CONST               1 (1)
              8 LOAD_CONST               2 (2)
             10 LOAD_CONST               3 (('a', 'b'))
             12 CALL_FUNCTION_KW         4
             14 POP_TOP
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE

Optimized:
  1           0 LOAD_NAME                0 (f)
              2 LOAD_NAME                1 (x)
              4 LOAD_CONST               0 ((('a', 'b'), 2, 1, None))
              6 UNPACK_SEQUENCE          4
              8 CALL_FUNCTION_KW         4
             10 POP_TOP
             12 LOAD_CONST               1 (None)
             14 RETURN_VALUE

This issue depends on issue33318.

----------
components: Interpreter Core
messages: 315563
nosy: benjamin.peterson, brett.cannon, inada.naoki, ncoghlan, pitrou, rhettinger, serhiy.storchaka, yselivanov
priority: normal
severity: normal
status: open
title: Optimize sequences of constants in the compiler
type: enhancement
versions: Python 3.8

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue33325>
_______________________________________


More information about the New-bugs-announce mailing list