Debugging confusion -- too many stacks!

Sun Apr 2 08:14:25 EDT 2000

Tim Peters wrote:
> 
> [Jason Stokes]
> > I'm confused about the difference between the C stack, the Frame
> > stack, and the stack *inside* each execution frame.
> 
> The difference between the first two is an artifact of the current
> implementation; indeed, getting rid of the distinction is the major point of
> Christian Tismer's "Stackless" Python variant (where "less" refers to
> getting rid of the C-stack component).

Yes, at first "less" referred to getting rid of the C stack.
Later, I realized that the resulting tree (see below) is
no stack at all, since it can now be modified at any point.

> > From reading the source, I believe that each new code object gets a new
> > frame allocated on the frame stack,
> 
> Each *invocation* of a code object, yes; and, e.g., if a function calls
> itself recursively, there's only one code object but a separate frame for
> each call level.

Meanwhile, functions feel to me a little like classes, and frames
are their instances.
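
A minimal sketch of the "one code object, one frame per call level"
point, assuming CPython (sys._getframe is a CPython-specific helper).
The function name collect_frames is made up for illustration:

```python
import sys

def collect_frames(depth, seen):
    """Recurse `depth` times, recording the frame of each invocation."""
    seen.append(sys._getframe())      # the frame of *this* call level
    if depth > 1:
        collect_frames(depth - 1, seen)
    return seen

frames = collect_frames(3, [])

# All call levels share a single code object ...
codes = {id(f.f_code) for f in frames}
# ... but every invocation has its own frame object.
frame_ids = {id(f) for f in frames}

print(len(codes), len(frame_ids))     # prints: 1 3
```

In the "functions are like classes" picture, the code object plays
the role of the class and each entry of frames is one instance.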

> Note that in CT's Stackless Python, the frames actually form a tree
> (although at any leaf, the path back to the root is unique and can be viewed
> as a stack).

My current view is of chains (paths) with a common root.
These paths define the default sequence followed when an explicit
or implicit return is executed. "Returning" from a function
instance means moving one step towards the common chain root.
"Calling" a function means extending the current chain by appending
a new function instance.
The other important role of these paths is responsibility
for exceptions. This hierarchy is upside down: the closest
function instance is consulted first about exception handlers.
This has a bit of similarity to inheritance in classes, but just
a little: We have a dynamic chain of instances which propagate
exceptions, but the behavior is only like single inheritance.
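
The chain described above can be observed from Python itself. This is
only a sketch, assuming CPython's sys._getframe and the f_back link
on frame objects; all function names are hypothetical:

```python
import sys

def chain_names():
    """Walk from the caller's frame towards the root via f_back."""
    names = []
    f = sys._getframe(1)              # start at the caller's frame
    while f is not None:
        names.append(f.f_code.co_name)
        f = f.f_back                  # one step towards the chain root
    return names

def inner():
    return chain_names()

def outer():
    return inner()

path = outer()                        # begins with 'inner', 'outer'

# Exceptions travel along the same chain, closest instance first.
log = []

def leaf():
    raise ValueError("boom")

def middle():
    try:
        leaf()
    except ValueError:
        log.append("middle")          # the nearest handler is asked first
        raise                         # pass it on towards the root

def top():
    try:
        middle()
    except ValueError:
        log.append("top")

top()
print(path[:2], log)
```

The rest of path depends on where the script is run from, so only the
first two entries are meaningful here.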

> > within which is stored all kinds of useful information about the
> > context the code is executing in -- the global and local environment,
> > a tuple of constants, a tuple of arguments etc.  *Within* each frame
> > is *another* stack, upon which the Python virtual machine loads and
> > manipulates intermediate values.
> 
> This is another artifact of the current implementation, and is best viewed
> as an internal detail of no visible consequence.

This VM stack could also be replaced by a set of registers, since
the maximum stack size is always known at compile time.
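
The compile-time stack bound is visible on the code object as
co_stacksize. A small sketch; the function add3 is made up, and the
exact number depends on the interpreter version, so none is claimed:

```python
def add3(a, b, c):
    return a + b * c

code = add3.__code__

# The compiler records the largest eval stack depth this code can
# reach, so a frame can reserve exactly that much space up front.
print(code.co_name, code.co_stacksize)
```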

> > However, I'm not sure what the terminology is to refer to this
> > third stack.
> 
> "The frame's eval stack" works for me <wink>.
> 
> > Understanding the distinction seems to be vital to using pdb, though.
> 
> I don't use pdb much, but don't see why this would be true.  The frame's
> eval stack is invisible, and the C and frame stacks happen to be intertwined
> one-for-one today because a Python call happens to invoke the interpreter
> recursively (at the C level) today.  Conceptually, there's only one stack
> involved in calls.
> 
> Perhaps you could be specific about what in pdb is confusing you, and
> someone who uses it could straighten it out.

Personally, I prefer to use the Visual Studio debugger. Debugging
is still not easy with any debugger, since you always have a
hard time seeing what's in an object. Almost all source code uses
the general object interface, and especially the eval loop does
so. Figuring out what type an object actually is and what it
contains is quite difficult. In Visual Studio, you can set up some
watches with typecasts. That helps for frames (but not their stack),
integers, and seeing the size of lists and tuples. To see the contents
of variable sized objects like strings, you need to add watches like
((char*)v+20) or do proper casts like (PyStringObject*)v and then
open the structure (which takes more time). Inspection of other
dynamic objects is much harder, since there is no built-in support
for them like with strings.
One very useful watch for eval_loop is this: Cast the current frame
in a way that you can see the name of the running Python function.

The following watches are always active when I'm debugging:
(are there similar ways to do this with PDB?)

(char*)(f->f_code->co_name)+20       # current function name
(char*)(f->f_code->co_filename)+20   # current file name
f->f_lineno                          # line number

In order to see which frame is calling you, do the same, but
replace "f" by "f->f_back".
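
At the Python level, the same three pieces of information are
attributes of the frame object, so no casts or offsets are needed.
A sketch, assuming CPython's sys._getframe; the function names are
hypothetical. Inside pdb, the "where", "up" and "down" commands walk
the same f_back chain interactively:

```python
import sys

def describe(frame):
    """Return (function name, file name, line number) for a frame."""
    return (frame.f_code.co_name,
            frame.f_code.co_filename,
            frame.f_lineno)

def where_am_i():
    f = sys._getframe()               # counterpart of "f" in the watches
    return describe(f), describe(f.f_back)   # f_back is the caller

def some_caller():
    return where_am_i()

here, there = some_caller()
print(here[0], there[0])              # prints: where_am_i some_caller
```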

Well, as a last hint, how do I find a certain place in my
Python code?
The trick is this: Use a seldom-used opcode and set a breakpoint
on it. I prefer to use "BINARY_LSHIFT". Put a line like "1 << 42"
into your Python script, right before the place you want to debug.
You will find yourself at the BINARY_LSHIFT breakpoint, just
a few opcodes away from the place you want, and it is easy to
single-step until you are there.
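
Before setting the breakpoint, it can help to check which opcode the
marker line really compiles to. The sketch below uses the standard
dis module. Two assumptions to be aware of: newer CPython versions
may fold a constant expression like "1 << 42" at compile time, so
the marker here shifts a variable instead, and recent versions
report the shift as a generic BINARY_OP rather than BINARY_LSHIFT.
The function name marker is made up:

```python
import dis

def marker(x):
    return x << 42                    # shift a variable, not a constant

ops = [ins.opname for ins in dis.get_instructions(marker)]

# Older interpreters name the opcode BINARY_LSHIFT; newer ones
# use BINARY_OP with the operator stored as the argument.
shift_ops = [name for name in ops
             if name in ("BINARY_LSHIFT", "BINARY_OP")]

print(shift_ops)
```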

What I really wish for is a way to extend a given debugger
with your own support code, to make PyObjects more visible.

cheers - chrs

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home