[New-bugs-announce] [issue36551] Optimize list comprehensions with preallocate size and protect against overflow

anthony shaw report at bugs.python.org
Sun Apr 7 21:28:21 EDT 2019


New submission from anthony shaw <anthony.p.shaw at gmail.com>:

List comprehensions currently create a series of opcodes inside a code object, the first of which is BUILD_LIST with an oparg of 0, effectively creating a zero-length list with a preallocated size of 0.

If you're doing a simple list comprehension on an iterator, e.g.

def foo():
  a = iterable
  return [x for x in a]

Disassembly of <code object <listcomp> at 0x109db2c40, file "<stdin>", line 3>:
  3           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                 8 (to 14)
              6 STORE_FAST               1 (x)
              8 LOAD_FAST                1 (x)
             10 LIST_APPEND              2
             12 JUMP_ABSOLUTE            4
        >>   14 RETURN_VALUE

The list comprehension will do a list_resize on the 4, 8, 16, 25, 35, 46, 58, 72, 88th iterations, etc.

This PR preallocates the list created in a list comprehension to the length of the iterator using PyObject_LengthHint().

It uses a new BUILD_LIST_PREALLOC opcode which builds a list with the allocated size of PyObject_LengthHint(co_varnames[oparg]).

[x for x in iterable] compiles to:

Disassembly of <code object <listcomp> at 0x109db2c40, file "<stdin>", line 3>:
  3           0 BUILD_LIST_PREALLOC      0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                 8 (to 14)
              6 STORE_FAST               1 (x)
              8 LOAD_FAST                1 (x)
             10 LIST_APPEND              2
             12 JUMP_ABSOLUTE            4
        >>   14 RETURN_VALUE

If the comprehension has ifs, then it will use the existing BUILD_LIST opcode

Testing using a range length of 10000

./python.exe -m timeit "x=list(range(10000)); [y for y in x]"

Gives 392us on the current 3.8 branch
and 372us with this change (about 8-10% faster)

the longer the iterable, the bigger the impact.

This change also catches the issue that a very large iterator, like a range object :
[a for a in range(2**256)]

Would cause the 3.8< interpreter to consume all memory and crash because there is no check against PY_SSIZE_MAX currently.

With this change (assuming there is no if inside the comprehension) is now caught and thrown as an OverflowError:

>>> [a for a in range(2**256)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
OverflowError: Python int too large to convert to C ssize_t

----------
components: Interpreter Core
messages: 339586
nosy: anthony shaw, ncoghlan
priority: normal
severity: normal
status: open
title: Optimize list comprehensions with preallocate size and protect against overflow
versions: Python 3.8

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue36551>
_______________________________________


More information about the New-bugs-announce mailing list