Python "byte code" description

Michael Hudson mwh at python.net
Sat Dec 7 10:46:39 EST 2002


Derek Thomson <derek at wedgetail.com> writes:

> Terry Reedy wrote:
> 
> > It is not a part of the language itself or its
> > definition.
> 
> 
> I think you could argue that it would be useful in some situations if
> it were well defined. Or how else can you be sure other
> interpreters, debuggers, or other tools that operate on the byte code
> (or virtual machine) are correct?

You can't.  But there are really very few tools that operate on
bytecode.  You can expect them to break with every major release of
Python.  I don't personally think that avoiding this is a worthwhile
pursuit.

> Or, what if I wanted to convert Python byte codes to JVM byte codes
> directly? Or Parrot byte codes? Or [insert VM here] byte codes?

This strikes me as a silly thing to do: surely more sensible would be
compiling Python source to these alternative bytecodes.

> > >Is there anything more comprehensive, apart from the Python
> > >implementation itself? Not only do I need the actual file format,
> >
> > The internal .pyc format is an internal implementation detail
> > subject to change with each version.
> 
> Maybe, but that fact *itself* isn't documented.

I'm slightly surprised by that.  Patches welcome :)
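
One visible sign of the version dependence is the magic number at the
start of every .pyc file. A minimal sketch, assuming a recent CPython 3
where importlib.util.MAGIC_NUMBER and py_compile are available (the file
name example.py is purely illustrative):

```python
import importlib.util
import os
import py_compile
import tempfile

# Compile a throwaway module and read back the first four bytes of the .pyc.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example.py")  # hypothetical file name
    with open(src, "w") as f:
        f.write("x = 1\n")
    pyc = py_compile.compile(
        src, cfile=os.path.join(tmp, "example.pyc"), doraise=True
    )
    with open(pyc, "rb") as f:
        header_magic = f.read(4)

# The header must match the running interpreter's own magic number;
# a .pyc written by a different bytecode version would not.
print(header_magic == importlib.util.MAGIC_NUMBER)
```

The interpreter refuses to load a .pyc whose magic number it does not
recognise, which is how the "subject to change" rule is enforced in
practice.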

> Also, that doesn't always have to be true ...  there may be
> advantages in defining it concretely and limiting change, or at
> least managing it.

There may be.  I personally don't think so.  For instance, people have
made occasional noises about rewriting the core VM to be a register
machine, not a stack machine.  If someone ever gets round to finishing
this and it turns out to be significantly faster, or significantly
more comprehensible code, I don't think we should disallow it on the
grounds that bytecode inspecting tools will break.

> A trivial example: how do you know if the current behaviour of the C
> implementation for a particular case is a bug, or the way it's
> supposed to be, unless it's well defined, or you are Guido? ;)

You ask Guido?  (Not an *entirely* facetious answer...)

> > Python grabs what it needs as long as the OS will give it.  Specifics
> > depend on your OS and hardware.  It 'assumes' that the system
> > resources are sufficient for the task you give it.
> 
> 
> That's not what I mean. I'm asking what assumptions the byte code
> makes about its environment (or, its virtual
> machine/interpreter). For example, some of the operators assume the
> existence of a co_varnames variable. Others assume a stack.

The only way you can find this out is to read the code.
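
As a starting point before diving into the C source, the code object
itself exposes some of these assumptions. A minimal sketch, assuming a
current CPython with the standard dis module (the function add is
purely illustrative, and older releases spell __code__ as func_code):

```python
import dis

def add(a, b):
    # Illustrative function: two arguments and one extra local variable.
    total = a + b
    return total

code = add.__code__

# Local variable names: arguments first, then other locals.
# The "fast locals" opcodes refer to entries in this tuple by index.
print(code.co_varnames)

# Maximum evaluation-stack depth the compiler computed for this function.
print(code.co_stacksize)

# Human-readable disassembly; opcode names and layout differ
# between CPython versions, so no fixed output is shown here.
dis.dis(add)
```

This only tells you what one particular interpreter version does, not
what it is required to do.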

> > Perhaps you can restate your question to be more specific, and give a
> > bit of context.
> 
> 
> What if I wanted to implement an interpreter?

Why would you want to use the same bytecode as CPython currently does?
Duplicating the effort to that point seems, well, pointless.

> I'd need to know what the properties of the stack are, and other
> environmental assumptions that the byte code instruction set makes.
> 
> But, I think this answers my question anyway. I'm just going to have
> to reverse engineer it for myself :(

Yes, but

a) between the docs you've found and the code, it's not that hard.
b) if it were documented, you'd probably still have to read bits of
   code to understand what the docs meant, and to check their accuracy
c) there are plenty of far more important things which are lacking
   documentation (new-style classes and descriptors and so on spring
   to mind).

Good luck!

Cheers,
M.

-- 
  The ultimate laziness is not using Perl.  That saves you so much
  work you wouldn't believe it if you had never tried it.
                                        -- Erik Naggum, comp.lang.lisp


