[Python-Dev] about line numbers

Fri, 20 Aug 1999 21:54:10 +0100 (NFT)

I'll try to sketch here the scheme I'm thinking of for the
callback/breakpoint issue (without SET_LINENO), although some
technical details are still missing.

I'm assuming the following, in this order:

1) No radical changes in the current behavior, i.e. preserve the
   current architecture / strategy as much as possible.

2) We don't have breakpoints per opcode, but per source line. For that
   matter, we have sys.settrace (and for now, we don't aim to have a
   sys.settracei that would be called on every opcode, although we might
   want this in the future).

3) SET_LINENO disappears. Actually, SET_LINENO opcodes are conditional
   breakpoints, used for callbacks from C to Python. So the basic problem
   is to generate these callbacks.

If any of the above is not an appropriate assumption and we want a radical
change in the strategy of setting breakpoints / generating callbacks, then
this post is invalid.

The solution I'm thinking of:

a) Currently, we have a function PyCode_Addr2Line which computes the source
   line from the opcode's address. I hereby assume that we can write the
   reverse function PyCode_Line2Addr that returns the address from a given
   source line number. I don't have the implementation, but it should be
   doable. Furthermore, we can compute, having the co_lnotab table and
   co_firstlineno, the source line range for a code object.

   As a consequence, even with the dumbest of all algorithms, by looping
   through this source line range, we can enumerate with PyCode_Line2Addr
   the sequence of addresses for the source lines of this code object.

b) As Chris pointed out, in case sys.settrace is defined, we can allocate
   and keep a copy of the original code string per frame. We can further
   dynamically overwrite the original code string with a new (internal,
   one byte) CALL_TRACE opcode at the addresses we have enumerated in a).

   The CALL_TRACE opcodes will trigger the callbacks from C to Python,
   just as the current SET_LINENO does.

c) At execution time, whenever a CALL_TRACE opcode is reached, we trigger
   the callback and if it returns successfully, we'll fetch the original
   opcode for the current location from the copy of the original co_code.
   Then we directly jump to the arg fetch code (or in case we fetch the
   entire original opcode in CALL_TRACE - we jump to the dispatch code).
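
Steps b) and c) can be illustrated with a toy interpreter. This is only a
sketch under loud assumptions: the opcode numbers, the tiny instruction set
and the run() function are invented for illustration and are not CPython's.
Only the idea of patching a copy of the code string and fetching the
original opcode back comes from the scheme above.

```python
# Toy model of steps b) and c); opcode values are made up, not CPython's.
STOP = 0
ADD = 1
HAVE_ARGUMENT = 90      # opcodes >= this carry a one-byte argument
PUSH = 100
CALL_TRACE = 255        # hypothetical internal one-byte trace opcode


def run(code, line_starts, trace=None):
    """Execute code; call trace(line) at each line start if trace is set.

    line_starts is a list of (address, line) pairs, as enumerated in a).
    """
    original = bytes(code)          # untouched copy of the code string
    patched = bytearray(original)   # per-frame copy that may be overwritten
    lines = dict(line_starts)
    if trace is not None:
        for addr in lines:
            patched[addr] = CALL_TRACE

    stack = []
    pc = 0
    while True:
        start = pc
        op = patched[pc]
        if op == CALL_TRACE:
            trace(lines[pc])        # callback into the tracer
            op = original[pc]       # fetch the opcode that was overwritten
        pc += 1
        arg = None
        if op >= HAVE_ARGUMENT:     # the "arg fetch code"
            arg = patched[pc]
            pc += 1
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == STOP:
            return stack.pop()
        else:
            raise ValueError("unknown opcode %d at address %d" % (op, start))


# Line 1: PUSH 2, PUSH 3 (addresses 0-3); line 2: ADD, STOP (addresses 4-5).
program = bytes([PUSH, 2, PUSH, 3, ADD, STOP])
starts = [(0, 1), (4, 2)]
seen = []
result = run(program, starts, trace=seen.append)
```

Here result is 5 and seen ends up as [1, 2]. With trace left as None
nothing is patched, so the untraced path pays no extra cost in the loop
beyond the one comparison against CALL_TRACE.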

Hmm. I think that's all.

At the heart of this scheme is the PyCode_Line2Addr function, which is
the only part that is still blurry in my head, for now.
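
A brute-force version of such a function can be sketched in Python rather
than C. The sketch assumes that co_lnotab is a string of (address increment,
line increment) byte pairs, both unsigned, and that the line of address 0 is
co_firstlineno. The names line_starts and line2addr, the -1 return value and
the sample table are made up for illustration.

```python
def line_starts(lnotab, firstlineno):
    """Yield (address, line) for the first opcode of each source line."""
    addr = 0
    line = firstlineno
    last_line = None
    for i in range(0, len(lnotab) - 1, 2):
        addr_incr = lnotab[i]
        line_incr = lnotab[i + 1]
        if addr_incr:
            # Code exists at addr for the current line; report it once.
            if line != last_line:
                yield addr, line
                last_line = line
            addr += addr_incr
        line += line_incr
    if line != last_line:
        yield addr, line


def line2addr(lnotab, firstlineno, line):
    """Return the first address for line, or -1 if the line has no code."""
    for addr, start_line in line_starts(lnotab, firstlineno):
        if start_line == line:
            return addr
    return -1


# Hypothetical table: the code object starts at line 10, and line 13
# has no opcodes of its own.
table = bytes([6, 1, 6, 1, 10, 2])
addresses = list(line_starts(table, 10))
```

With this table, addresses is [(0, 10), (6, 11), (12, 12), (22, 14)], and
line2addr(table, 10, 13) returns -1 because no opcode starts on line 13.
A linear scan like this is the "dumbest" approach; it only has to run when
a trace function is installed.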

Christian Tismer wrote:
> 
> I didn't think of this before, but I just realized that
> I have something like that already in Stackless Python.
> It is possible to set a breakpoint at every opcode, for every
> frame. Adding an extra opcode for breakpoints is a good thing
> as well. The former are good for tracing, conditional breakpoints
> and such, and cost a little more time since there is always one extra
> function call. The latter would be a quick, less versatile thing.

I don't think I clearly understand the difference you're talking about,
or why one is better than the other, probably because I'm not very
familiar with Stackless Python.

> I'm going to finish and publish the stackless/continous package
> and submit a paper by end of September. Should I include this debugging
> feature?

Write the paper first, you have more than enough material to talk about
already ;-). Then if you have time to implement some debugging support,
you could always add another section, but it won't be a central point
of your paper.

-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252