[issue26107] code.co_lnotab: use signed line number delta to support moving instructions in an optimizer

STINNER Victor report at bugs.python.org
Thu Jan 14 04:12:46 EST 2016


New submission from STINNER Victor:

Python doesn't store the original line number of each instruction in the bytecode in the .pyc file. Instead, a compact table is used to find the line number from the current offset in the bytecode: code.co_lnotab.

Basically, it's a list of (offset_delta, line_number_delta) pairs where offset_delta and line_number_delta are unsigned 8-bit numbers. If an offset delta is larger than 255, (offset_delta % 255, line_number_delta) and (offset_delta // 255, 0) pairs are emitted. Same for line_number_delta. (In fact, more than two pairs can be created.)

The format is described in Objects/lnotab_notes.txt.
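
To make the splitting concrete, here is a minimal sketch of how one oversized step could be broken into byte-sized pairs in the current unsigned format. The function name is hypothetical, and the filler pairs (255, 0) and (offset, 255) are my reading of Objects/lnotab_notes.txt, not code taken from the compiler:

```python
def encode_pair_unsigned(offset_delta, line_delta):
    """Split one (offset_delta, line_delta) step into pairs of 0..255 values.

    Hypothetical helper for illustration only.
    """
    if offset_delta < 0 or line_delta < 0:
        raise ValueError("the unsigned format cannot encode negative deltas")
    pairs = []
    # Consume the offset first, using (255, 0) filler pairs.
    while offset_delta > 255:
        pairs.append((255, 0))
        offset_delta -= 255
    # Then consume the line delta; only the first pair carries the offset.
    while line_delta > 255:
        pairs.append((offset_delta, 255))
        offset_delta = 0
        line_delta -= 255
    pairs.append((offset_delta, line_delta))
    return pairs


pairs = encode_pair_unsigned(300, 2)
# The deltas of all emitted pairs add up to the original step.
assert sum(o for o, _ in pairs) == 300
assert sum(l for _, l in pairs) == 2
```

Whatever the exact order of the pairs, the invariant is that the sums of the emitted deltas equal the original step.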

I implemented an optimizer which can generate a *negative* line number delta. For example, the loop:

   for i in range(2):   # line 1
      print(i)          # line 2

is replaced with:

   i = 0      # line 1
   print(i)   # line 2
   i = 1      # line 1
   print(i)   # line 2

The third instruction has a negative line number delta.

I'm not the first one to hit this issue; it's just that no one proposed a patch before. Previous projects bitten by this issue:

* issue #10399: "AST Optimization: inlining of function calls"
* issue #11549: "Build-out an AST optimizer, moving some functionality out of the peephole optimizer"

The attached patch changes the type of the line number delta from unsigned 8-bit integer to *signed* 8-bit integer. If a line number delta is smaller than -128 or larger than 127, multiple pairs are created (as before).
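
As an illustration of the signed variant (a sketch under my own assumptions, not the code from lnotab.patch), the same splitting can be done with the line delta limited to -128..127. The helper names and the bytecode offsets in the example are hypothetical:

```python
import struct


def encode_pair_signed(offset_delta, line_delta):
    """Split one step into (0..255, -128..127) pairs. Hypothetical helper."""
    if offset_delta < 0:
        raise ValueError("offset deltas stay unsigned")
    pairs = []
    while offset_delta > 255:
        pairs.append((255, 0))
        offset_delta -= 255
    while line_delta > 127:
        pairs.append((offset_delta, 127))
        offset_delta = 0
        line_delta -= 127
    while line_delta < -128:
        pairs.append((offset_delta, -128))
        offset_delta = 0
        line_delta += 128
    pairs.append((offset_delta, line_delta))
    return pairs


def pack_pairs(pairs):
    """Pack pairs as bytes: unsigned offset byte, then signed line byte."""
    return b"".join(struct.pack("Bb", o, l) for o, l in pairs)


# Unrolled loop from above: lines 1, 2, 1, 2 at made-up offsets 0, 6, 16, 22.
steps = [(6, +1), (10, -1), (6, +1)]
table = pack_pairs([p for step in steps for p in encode_pair_signed(*step)])
assert table == b"\x06\x01\x0a\xff\x06\x01"
```

The step back from line 2 to line 1 is stored as the byte 0xff, which is -1 when read as a signed 8-bit integer.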

My code in Lib/dis.py is inefficient. Maybe unpack the full lnotab and *then* skip half of the bytes? (instead of calling struct.unpack once for each byte).
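
One possible shape for that, as a sketch only (the function name is hypothetical and this is not the code in the patch): slice the bytes object into offset and line columns, then convert each line byte to a signed value with plain arithmetic, so struct is not needed at all:

```python
def iter_line_starts(lnotab, first_lineno):
    """Yield (offset, lineno) pairs from a signed lnotab bytes object.

    Hypothetical helper for illustration only.
    """
    offset_deltas = lnotab[0::2]   # even bytes: unsigned offset deltas
    line_deltas = lnotab[1::2]     # odd bytes: signed line deltas
    offset = 0
    lineno = first_lineno
    last_lineno = None
    for offset_delta, line_delta in zip(offset_deltas, line_deltas):
        if offset_delta:
            if lineno != last_lineno:
                yield offset, lineno
                last_lineno = lineno
            offset += offset_delta
        if line_delta >= 0x80:
            # Reinterpret the byte as a signed 8-bit integer.
            line_delta -= 0x100
        lineno += line_delta
    if lineno != last_lineno:
        yield offset, lineno


starts = list(iter_line_starts(b"\x06\x01\x0a\xff\x06\x01", 1))
assert starts == [(0, 1), (6, 2), (16, 1), (22, 2)]
```

Indexing a bytes object already gives integers in 0..255, so the only extra work compared to the unsigned format is the subtraction of 0x100.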

The patch also adds "assert(Py_REFCNT(lnotab_obj) == 1);" to PyCode_Optimize(). The assertion never fails, but it's just to be extra safe.

The patch renames variables in PyCode_Optimize() because I was confused between "offset" and "line numbers". IMHO variables were badly named.

I changed the MAGIC_NUMBER of importlib, but it was already changed for f-string:

#     Python 3.6a0  3360 (add FORMAT_VALUE opcode #25483)

Is it worth modifying it again?

You may have to recompile Python/importlib_external.h if it's not recompiled automatically (just touch the file before running make).

Note: this issue is related to PEP 511 (the PEP is not ready for review, but it gives a better overview of the use cases).

----------
files: lnotab.patch
keywords: patch
messages: 258189
nosy: brett.cannon, haypo, rhettinger, serhiy.storchaka
priority: normal
severity: normal
status: open
title: code.co_lnotab: use signed line number delta to support moving instructions in an optimizer
versions: Python 3.6
Added file: http://bugs.python.org/file41613/lnotab.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue26107>
_______________________________________

