[Tutor] bytecode primer, and avoiding a monster download

eryksun eryksun at gmail.com
Tue May 28 18:48:58 CEST 2013


On Tue, May 28, 2013 at 7:18 AM, Dave Angel <davea at davea.name> wrote:
>
> dis.dis(myfunction)
>
> will disassemble one function.
>
> That's not all that's in the byte-code file, but this is 98% of what you
> probably want out of it.  And you can do it in the debugger with just the
> standard library.

The argument for dis.dis() can be a module, class, function or code
object. It disassembles all the top-level code objects that it finds,
but it doesn't recursively disassemble code objects that are in the
co_consts.
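
To reach the nested code objects that dis.dis() skips, one option is
to walk co_consts yourself. This is only a sketch: iter_code_objects
and the sample source are names made up for illustration, not part of
the dis module.

```python
import types

def iter_code_objects(code):
    """Yield code, then every code object nested in co_consts, depth first."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            for nested in iter_code_objects(const):
                yield nested

# hypothetical module source with one nested function
source = """
def outer():
    def inner():
        return 1
    return inner
"""

module_code = compile(source, '<example>', 'exec')
names = [c.co_name for c in iter_code_objects(module_code)]
```

Passing each yielded object to dis.dis() then disassembles every
level, not just the top one.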

I'm not sure what Dave means by 'byte-code file'. A .pyc? In Python
2.7 that's a marshaled code object preceded by an 8-byte header: a
4-byte magic number and a 4-byte timestamp. Here's an example of
reading the pyc for the dis module itself:

    import dis
    import marshal
    import struct
    import datetime

    # magic number for Python 2.7
    MAGIC27 = 62211 | (ord('\r') << 16) | (ord('\n') << 24)

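    # dis.__file__ names dis.pyc only if the module was loaded from its
    # cached bytecode; otherwise substitute the path to the .pyc
    pyc = open(dis.__file__, 'rb')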
    hdr = pyc.read(8)
    magic, tstamp = struct.unpack('<ll', hdr)
    tstamp = datetime.datetime.fromtimestamp(tstamp)

    # the rest of the file is the code object
    code = marshal.load(pyc)

    >>> magic == MAGIC27
    True
    >>> tstamp
    datetime.datetime(2013, 1, 2, 12, 45, 58)

    >>> code.co_consts[0]
    'Disassembler of Python byte code into mnemonics.'

The code object's co_consts tuple also has the code objects for the
defined functions, plus the anonymous functions that build class
objects. The latter are subsequently discarded, as is the .pyc code
object itself after the module is imported/executed. There's no point
in keeping it around.
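
The same relationship can be seen without touching a .pyc file. The
sketch below compiles a small made-up module source (the Spam class is
purely illustrative), round-trips the code object through marshal, and
then looks at which code objects sit in co_consts.

```python
import marshal
import types

# hypothetical module source: one class with one method
source = "class Spam(object):\n    def eggs(self):\n        return 42\n"

code = compile(source, '<example>', 'exec')

# marshal is what serializes the code object stored after the pyc header
data = marshal.dumps(code)
restored = marshal.loads(data)

# executing the module-level code builds the class in the namespace
namespace = {}
exec(restored, namespace)

# the class body is itself a code object stored in the module's co_consts
nested = [c.co_name for c in restored.co_consts
          if isinstance(c, types.CodeType)]
```

After the exec, namespace holds the finished Spam class, while the
code object for the class body is only reachable through
restored.co_consts.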

My previous post is a light intro to instantiating code and function
objects. I think the arguments are mostly self-explanatory -- except
for co_lnotab, co_flags, and closure support (co_cellvars,
co_freevars, func_closure).
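
For the closure-related attributes, a small sketch may help. The
make_adder function is invented for this example. It uses the
__code__ and __closure__ spellings, which exist in 2.6+ alongside
func_code and func_closure and are the only spellings in Python 3.

```python
def make_adder(n):
    # n is referenced by the inner function, so it becomes a cell variable
    def add(x):
        return x + n
    return add

add5 = make_adder(5)

# the outer code object lists n as a cell variable
outer_cellvars = make_adder.__code__.co_cellvars

# the inner code object lists n as a free variable
inner_freevars = add5.__code__.co_freevars

# the function object carries the cell that holds the current value
captured = add5.__closure__[0].cell_contents
```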

I assembled the bytecode with the help of opcode.opmap, to make it
more readable. But to be clear, CPython bytecode is simply a byte
string stored in the co_code attribute. Disassembling the bytecode
nicely with source line numbers requires co_lnotab. While I did
reference the text file that explains co_lnotab, I neglected to
provide the following link:

http://hg.python.org/cpython/file/687295c6c8f2/Objects/lnotab_notes.txt
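
As a rough illustration of the offset-to-line mapping that co_lnotab
encodes, dis.findlinestarts() decodes it for you. The sample source is
made up, and the exact offsets depend on the interpreter version, so
none are shown here.

```python
import dis

# hypothetical three-line module
source = "x = 1\ny = 2\nz = x + y\n"
code = compile(source, '<example>', 'exec')

# the bytecode itself is just a byte string
raw = code.co_code

# pairs of (bytecode offset, source line number)
line_starts = list(dis.findlinestarts(code))
offsets = [offset for offset, _ in line_starts]
lines = [lineno for _, lineno in line_starts]
```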

co_flags records how the bytecode was compiled (e.g. optimized to use
fastlocals) and which __future__ features were in effect. When you use
exec, eval or compile(), the __future__ flags are inherited from the
calling code. compile() can disable this via the argument
dont_inherit.

The code for a class body or a function requires a new local namespace
(CO_NEWLOCALS). For a function, locals is also optimized
(CO_OPTIMIZED) to use the fastlocals array instead of a dict. On the
other hand, the code that creates a module is evaluated with locals
and globals set to the same namespace, so it won't have the
CO_NEWLOCALS flag.
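
A quick way to check this is to compare co_flags for a function and
for module-level code. This is a sketch: func and the one-line module
source are invented for the example, and the two constants are copied
from the 2.7 flag values listed in this post. Class bodies are left
out because their flags differ between Python 2 and 3.

```python
# values taken from the 2.7 flag list in this post
CO_OPTIMIZED = 0x01
CO_NEWLOCALS = 0x02

def func(a, b):
    return a + b

# code compiled in 'exec' mode is module-level code
module_code = compile("a = 1\n", '<example>', 'exec')

func_flags = func.__code__.co_flags
module_flags = module_code.co_flags

# a function gets a new, optimized local namespace
func_newlocals = bool(func_flags & CO_NEWLOCALS)
func_optimized = bool(func_flags & CO_OPTIMIZED)

# module code runs with locals and globals set to the same dict
module_newlocals = bool(module_flags & CO_NEWLOCALS)
module_optimized = bool(module_flags & CO_OPTIMIZED)
```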

Including the metadata that there are no free variables (CO_NOFREE)
can make a simple function call more efficient, but only if the
function has no default arguments and the call passes exactly
co_argcount positional arguments and no keyword arguments. Refer to
fast_function() in Python/ceval.c.

http://hg.python.org/cpython/file/ab05e7dd2788/Python/ceval.c#l4060

Here are all of the flags for code objects in 2.7:

    CO_OPTIMIZED    0x01
    CO_NEWLOCALS    0x02
    CO_VARARGS      0x04
    CO_VARKEYWORDS  0x08
    CO_NESTED       0x10
    CO_GENERATOR    0x20
    CO_NOFREE       0x40

    /* __future__ imports */
    CO_FUTURE_DIVISION         0x02000
    CO_FUTURE_ABSOLUTE_IMPORT  0x04000
    CO_FUTURE_WITH_STATEMENT   0x08000
    CO_FUTURE_PRINT_FUNCTION   0x10000
    CO_FUTURE_UNICODE_LITERALS 0x20000
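
To read a co_flags value against this list, a small helper is enough.
The sketch below is not part of the standard library: flag_names and
CODE_FLAGS are made-up names, and the bit values are the 2.7 ones
listed above. Other Python versions add flags and renumber the
__future__ bits, so treat the upper entries as 2.7-specific.

```python
# (bit, name) pairs copied from the 2.7 flag list above
CODE_FLAGS = [
    (0x01, 'CO_OPTIMIZED'),
    (0x02, 'CO_NEWLOCALS'),
    (0x04, 'CO_VARARGS'),
    (0x08, 'CO_VARKEYWORDS'),
    (0x10, 'CO_NESTED'),
    (0x20, 'CO_GENERATOR'),
    (0x40, 'CO_NOFREE'),
    (0x02000, 'CO_FUTURE_DIVISION'),
    (0x04000, 'CO_FUTURE_ABSOLUTE_IMPORT'),
    (0x08000, 'CO_FUTURE_WITH_STATEMENT'),
    (0x10000, 'CO_FUTURE_PRINT_FUNCTION'),
    (0x20000, 'CO_FUTURE_UNICODE_LITERALS'),
]

def flag_names(co_flags):
    """Return the names of the listed flags that are set in co_flags."""
    return [name for bit, name in CODE_FLAGS if co_flags & bit]

def gen():
    yield 1

# 0x43 is what 2.7 reports for a plain function with no free variables
plain = flag_names(0x43)
gen_names = flag_names(gen.__code__.co_flags)
```

Here plain comes out as CO_OPTIMIZED, CO_NEWLOCALS and CO_NOFREE, and
gen_names includes CO_GENERATOR because gen contains a yield.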

For a high-level view on scoping, read the section on the execution
model in the language reference:

http://docs.python.org/2/reference/executionmodel

Subsequently, if you want, we can talk about how this is implemented
in the VM, and especially with respect to closures, cellvars,
freevars, and the opcodes LOAD_DEREF, STORE_DEREF, and MAKE_CLOSURE.

