Title: Precise line numbers for debugging and other tools.
Author: Mark Shannon <mark at hotpy.org>
Python should guarantee that when tracing is turned on, "line" tracing events are generated for all lines of code executed and only for lines of code that are executed.
The f_lineno attribute of frame objects should always contain the expected line number. During frame execution, the expected line number is the line number of the source code currently being executed. After a frame has completed, either by returning or by raising an exception, the expected line number is the line number of the last line of source that was executed.
A side effect of ensuring correct line numbers is that some bytecodes will need to be marked as artificial, with no meaningful line number. To assist tools, a new co_lines() method will be added that describes the mapping from bytecode to source.
Users of sys.settrace and associated tools should be able to rely on tracing events being generated for all lines of code, and only for actual code. They should also be able to assume that the line number in f_lineno is correct.
The current implementation mostly does this, but fails in a few cases. This requires workarounds in tooling and is a nuisance for alternative Python implementations.
Having this guarantee also benefits implementers of CPython in the long term, as the current behaviour is not obvious and has some odd corner cases.
In order to guarantee that line events are generated when expected, the co_lnotab attribute, in its current form, can no longer be the source of truth for line number information.
Rather than attempt to fix the co_lnotab attribute, a new method co_lines() will be added, which returns an iterator over bytecode offsets and source code lines.
Ensuring that the bytecode is annotated correctly to enable accurate line number information means that some bytecodes must be marked as artificial, and not have a line number.
Some care must be taken not to break existing tooling. To minimize breakage, the co_lnotab attribute will be retained, but lazily generated on demand.
Line events and the f_lineno attribute should act as an experienced Python user would expect in all cases.
Tracing generates events for calls, returns, exceptions, lines of source code executed, and, under some circumstances, instructions executed.
Only line events are covered by this PEP.
When tracing is turned on, line events will be generated when:
- A new line of source code is reached.
- A backwards jump occurs, even if it jumps to the same line, as may happen in list comprehensions.
Additionally, line events will never be generated for source code lines that are not executed.
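As a rough illustration of the rules above, the following sketch records "line" events for a single function using sys.settrace. The helper record_line_events and the function sample are hypothetical names introduced for this example, not part of the proposal. Line numbers are reported relative to the def line so that the result does not depend on where the code sits in a file.

```python
import sys

def record_line_events(func, *args):
    """Call func(*args) and return the relative line numbers of its "line" events."""
    events = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Only trace the frame of the function under observation.
        if frame.f_code is not code:
            return None
        if event == "line":
            events.append(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        # Restore whatever trace function was installed before.
        sys.settrace(previous)
    return events

def sample(a):       # relative line 0
    if a:            # relative line 1
        x = 1        # relative line 2
    else:            # relative line 3 (never produces an event)
        x = 2        # relative line 4
    return x         # relative line 5
```

Calling record_line_events(sample, True) reports only the lines on the taken branch; the lines of the other branch do not appear because they are not executed.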
- When a frame object is created, the f_lineno will be set to the line at which the function or class is defined; that is, the line on which the def or class keyword appears. For modules it will be set to zero.
- The f_lineno attribute will be updated to match the line number about to be executed, even if tracing is turned off and no event is generated.
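The second point can be observed without any trace function installed. The sketch below is only an illustration: caller_line and demo are hypothetical names, and sys._getframe is a CPython-specific way of reaching the calling frame.

```python
import sys

def caller_line():
    """Return the f_lineno of the frame that called this helper."""
    # Depth 1 is the caller's frame; sys._getframe is a CPython detail.
    return sys._getframe(1).f_lineno

def demo():
    first = caller_line()    # f_lineno refers to this line during the call
    second = caller_line()   # and to this line during the second call
    return first, second
```

No tracing is active here, yet the two values differ by exactly one, matching the two consecutive source lines in demo.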
The co_lines() method will return an iterator which yields tuples of values, each representing the line number of a range of bytecodes. Each tuple will consist of three values:
- start -- The offset (inclusive) of the start of the bytecode range
- end -- The offset (exclusive) of the end of the bytecode range
- line -- The line number, or None if the bytecodes in the given range do not have a line number.
The sequence generated will have the following properties:
- The first range in the sequence will have a start of 0.
- The (start, end) ranges will be strictly increasing and consecutive. That is, for any pair of adjacent tuples, the start of the second will be equal to the end of the first.
- No range will be empty, that is end > start for all triples.
- The final range in the sequence will have end equal to the size of the bytecode.
- line will either be a positive integer, or None
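The properties listed above can be expressed as a small checker. This is a sketch under stated assumptions: check_line_ranges is a hypothetical helper, it works on any iterable of (start, end, line) triples rather than on a code object, and treating an empty sequence as a violation is an assumption of this example.

```python
def check_line_ranges(ranges, code_size):
    """Return a list of violated properties; an empty list means all hold."""
    ranges = list(ranges)
    if not ranges:
        # Assumption: a sequence with no ranges cannot start at offset 0.
        return ["sequence is empty"]

    problems = []
    if ranges[0][0] != 0:
        problems.append("first range does not start at 0")
    if ranges[-1][1] != code_size:
        problems.append("final range does not end at the bytecode size")

    previous_end = None
    for start, end, line in ranges:
        if end <= start:
            problems.append(f"empty range ({start}, {end})")
        if previous_end is not None and start != previous_end:
            problems.append(f"range starting at {start} is not consecutive")
        if line is not None and not (isinstance(line, int) and line > 0):
            problems.append(f"invalid line {line!r}")
        previous_end = end
    return problems
```

On an interpreter that provides co_lines(), the same check could be applied as check_line_ranges(code.co_lines(), len(code.co_code)) for some code object code.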
The co_linetable attribute will hold the line number information. The format is opaque, unspecified and may be changed without notice. The attribute is public only to support creation of new code objects.
Historically, the co_lnotab attribute held a mapping from bytecode offset to line number, but it cannot represent bytecodes without a line number. For backward compatibility, the co_lnotab bytes object will be lazily created when needed. For ranges of bytecodes without a line number, the line number of the previous bytecode range will be used.
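The fallback rule for ranges without a line number can be sketched on plain (start, end, line) triples. The name fill_missing_lines is hypothetical, and this is not the actual co_lnotab encoding. What happens when the very first range has no line number is not specified here, so this sketch simply leaves it as None.

```python
def fill_missing_lines(ranges):
    """Replace None line numbers with the line of the preceding range."""
    filled = []
    previous_line = None
    for start, end, line in ranges:
        if line is None:
            # Reuse the previous range's line; stays None if there is none yet
            # (an assumption of this sketch).
            line = previous_line
        else:
            previous_line = line
        filled.append((start, end, line))
    return filled
```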
Tools that parse the co_lnotab table should move to using the new co_lines() method as soon as is practical.
The co_lnotab attribute will be deprecated in 3.10 and removed in 3.12.
Any tools that parse the co_lnotab attribute of code objects will need to move to using co_lines() before 3.12 is released. Tools that use sys.settrace will be unaffected, except in cases where the "line" events they receive are more accurate.
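For a tool that previously decoded co_lnotab to find where each source line begins, a migration might look roughly like the sketch below. The helper line_starts is a hypothetical name; it accepts any iterable of (start, end, line) triples, such as the result of co_lines() on an interpreter that provides it, and skips ranges that have no line number.

```python
def line_starts(ranges):
    """Yield (offset, line) each time the line number changes."""
    last_line = None
    for start, _end, line in ranges:
        # Artificial bytecodes (line is None) never start a new line.
        if line is not None and line != last_line:
            last_line = line
            yield start, line
```

With a real code object this would be used as list(line_starts(code.co_lines())).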
In the following examples, events are listed as "name", f_lineno pairs.
    0.  def spam(a):
    1.      if a:
    2.          eggs()
    3.      else:
    4.          pass
If a is True, then the sequence of events generated by Python 3.9 is:
    "line" 1
    "line" 2
    "line" 4
    "return" 4
From 3.10 the sequence will be:
    "line" 1
    "line" 2
    "return" 2
    0.  def bar():
    1.      pass
    2.      pass
    3.      pass
The sequence of events generated by Python 3.9 is:
    "line" 3
    "return" 3
From 3.10 the sequence will be:
    "line" 1
    "line" 2
    "line" 3
    "return" 3
Access to the f_lineno attribute of frame objects through C API functions is unchanged. f_lineno can be read by PyFrame_GetLineNumber. f_lineno can only be set via PyObject_SetAttr and similar functions.
Accessing f_lineno directly through the underlying data structure is forbidden.
In general, there should be no change in performance. When tracing, programs should run a little faster as the new table format can be designed with line number calculation speed in mind. Code with long sequences of pass statements will probably become a bit slower.
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.