Is it possible to 'compile' a script?

Fri Oct 4 03:57:59 EDT 2002

<posted & mailed>

solosnake wrote:
        ...
>> Note that Python already compiles scripts internally
>> to a bytecode before executing them, and does most
>> things that are reasonably practical to interpret
>> the bytecode as quickly as possible.
> 
> So *in theory* at least what I am asking after is not impossible, and at
> some level is already inside Python. My suggestion to those responsible

And it's exposed to Python programmers as the 'compile' built-in function.

> for maintaining Python is to differentiate between calling a script which
> is then parsed and compiled into bytecode and executed, and calling a
> function which will parse and create the bytecode, but return a handle to
> that (now faster) bytecode and allow it to be executed through this
> handle.

How would that differ from the 'compile' built-in function, which
returns a code-object?

When your script needs to load (once) and execute or evaluate (repeatedly)
a piece of source code obtained dynamically, it's faster to compile the
code once with the built-in 'compile' function.  The resulting code
object can be used with the exec statement or the built-in function eval
just like the original source could be: the security issues are similar,
but execution is faster.  Another advantage is that syntax errors get
diagnosed ASAP, at compile-time, separately from other errors diagnosed
at runtime.
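
A minimal sketch of that compile-once pattern, written for current
Python 3 (where exec is a built-in function rather than a statement).
The expression text and the names source, code, results and
syntax_error are made up for illustration:

```python
# Hypothetical expression text; a real program would read it at runtime.
source = "x * x + 1"

# Compile once.  A syntax error would surface here, before anything runs.
code = compile(source, "<user expression>", "eval")

# Evaluate the same code object repeatedly with different bindings for x.
results = [eval(code, {"x": x}) for x in range(5)]

# A malformed expression (unclosed parenthesis) is rejected at compile time.
try:
    compile("(x + 1", "<user expression>", "eval")
except SyntaxError as err:
    syntax_error = err
else:
    syntax_error = None
```

The second argument to compile is only a label used in error messages;
any string will do.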

One thing I've seen leave beginners perplexed sometimes in this regard
is how to perform the following task: at runtime they read from
somewhere a multiline string such as:

newcode = """def userfunc(x,y):
                 return x+y"""

which, by a convention between their program and its users, must define
a function named userfunc that accepts two arguments (e.g. for plotting
purposes of some kind), and they wonder how best to handle it.

Tip #1: this code is incorrect, or rather incomplete -- it has an
indent but no corresponding dedent.  You need to end the string
with a \n to make it complete.

Tip #2: def is a statement; if you trust your user totally you
COULD just exec this string, ONCE, and thus bind name userfunc
to a function which you can call normally.  *THIS IS NOT WISE*.
Quite apart from users of ill intent, if the user has made a
tiny typo such as defining 'usefrunc' instead of 'userfunc',
your diagnostics of his/her mistake will likely be feeble to
nonexistent.  exec can trample all over your namespace and thus
makes your code too hard to debug, AND much less efficient, as
the compiler must avoid the small but crucial optimization it
does wrt local variables in any function that uses an exec.

This applies to exec _without a specific dictionary_.  WITH a
specific dictionary, you're much better off (though security
can still be an issue -- see the rexec module, but it's not
_strong_ security, alas...).
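
To illustrate the point about a specific dictionary, here is a small
sketch in current Python 3 syntax (exec as a function).  The names
user_source, namespace, bound and total are assumptions of mine, not
anything the original program defines:

```python
# Hypothetical user-supplied source, ending with a newline.
user_source = "def userfunc(x, y):\n    return x + y\n"

# A dedicated dictionary: whatever the user's code binds lands here,
# not in the namespace of the code that calls exec.
namespace = {}
exec(user_source, namespace)

# List what the user's code bound, skipping '__builtins__', which exec
# adds to the dictionary by itself.
bound = sorted(name for name in namespace if not name.startswith("_"))

# The function is fetched from the dictionary and called normally.
total = namespace["userfunc"](2, 3)
```

Inspecting the dictionary afterwards is what makes decent diagnostics
possible: you can see exactly which names the user's code defined.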

So, e.g.:

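def new_userfunc(newcode):
    # Run the user's code in its own dictionary, not in our namespace.
    tempdict = {}
    try: exec newcode in tempdict
    except SyntaxError, err:
        return None, err
    # Always return a (function, error) pair, as the caller expects.
    try: return tempdict['userfunc'], None
    except KeyError:    # a missing dictionary key raises KeyError
        return None, NameError(
          "You defined no 'userfunc' -- the names you bound are %s"
          % [x for x in tempdict if x[:1]!='_'])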

to be called e.g. as follows:

userfunc, err = new_userfunc(newcode)
if userfunc is None:
    diagnose_error(err)
else:
    plot_new_func(userfunc)
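
For readers on current Python 3, where exec is a function and the
"except SyntaxError, err" spelling is gone, the same idea might look
roughly like the sketch below.  It is an adaptation under my own
assumptions, not the one true way: it compiles first so that syntax
errors are reported separately, and it also catches errors raised while
the user's top-level code runs, which the function above does not.

```python
def new_userfunc(newcode):
    """Return (function, None) on success or (None, error) on failure."""
    # Step 1: compile, so that syntax errors are diagnosed on their own.
    try:
        code = compile(newcode, "<user code>", "exec")
    except SyntaxError as err:
        return None, err
    # Step 2: run the compiled code in a dedicated dictionary.
    tempdict = {}
    try:
        exec(code, tempdict)
    except Exception as err:
        return None, err
    # Step 3: check that the agreed-upon name was actually bound.
    try:
        return tempdict["userfunc"], None
    except KeyError:
        names = [x for x in tempdict if not x.startswith("_")]
        return None, NameError(
            "You defined no 'userfunc' -- the names you bound are %s" % names)


# Three hypothetical inputs: correct, misspelled name, missing colon.
good, err1 = new_userfunc("def userfunc(x, y):\n    return x + y\n")
typo, err2 = new_userfunc("def usefrunc(x, y):\n    return x + y\n")
bad, err3 = new_userfunc("def userfunc(x, y)\n    return x + y\n")
```

The misspelled case comes back with a NameError that lists 'usefrunc',
which is the kind of diagnostic a bare exec would not give you.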

Alex