[Python-Dev] Is core dump always a bug? Advice requested

Michel Pelletier michel at dialnetwork.com
Mon May 17 22:07:09 EDT 2004


On Wednesday 12 May 2004 22:30, Tim Peters wrote:
> [Michel Pelletier]
>
> > ...
> > Would there be an interest in at least a PEP to consider a bytecode
> > verifier?
>
> I think so, yes.

I'm working on a rough draft now.  Initially I would like to start small and
easy:  a verifier written in Python that provides a single function, along the
lines of the code provided by Phillip Eby, that accepts the same arguments as
dis(): a class, method, function, or code object.  It raises an exception
with an explanation for invalid bytecode, and it will be up to programmers to
call this function explicitly and choose whether or not to execute the code
object.  This will probably cover almost all uses.
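
A minimal sketch of what such a function might look like, assuming the
two-byte "wordcode" layout used by CPython 3.6 and later.  The names
verify(), check_const_indices() and BytecodeError are hypothetical, not
taken from any draft PEP or from Phillip Eby's code, and only one property is
checked here (constant indices stay inside co_consts).  Class arguments,
which dis() also accepts, are left out for brevity.

```python
import dis


class BytecodeError(ValueError):
    """Raised when a code object fails a static check (hypothetical name)."""


def check_const_indices(raw, nconsts):
    """Raise BytecodeError if a constant-loading opcode in `raw` indexes
    past a co_consts tuple of length `nconsts`.

    Assumes wordcode: one opcode byte followed by one argument byte.
    """
    ext = 0  # high bits accumulated from EXTENDED_ARG prefixes
    for offset in range(0, len(raw) - 1, 2):
        op = raw[offset]
        arg = raw[offset + 1] | ext
        if op == dis.EXTENDED_ARG:
            ext = arg << 8
            continue
        ext = 0
        if op in dis.hasconst and arg >= nconsts:
            raise BytecodeError(
                "offset %d: %s index %d out of range for %d constant(s)"
                % (offset, dis.opname[op], arg, nconsts))


def verify(obj):
    """Check a function, method, or code object; raise on invalid bytecode."""
    obj = getattr(obj, "__func__", obj)    # bound method -> function
    code = getattr(obj, "__code__", obj)   # function -> code object
    if not hasattr(code, "co_code"):
        raise TypeError("cannot verify %r" % (obj,))
    check_const_indices(code.co_code, len(code.co_consts))
    for const in code.co_consts:           # nested functions, lambdas, ...
        if hasattr(const, "co_code"):
            verify(const)
```

A caller would run verify(some_function) before executing untrusted code and
treat BytecodeError as a refusal.  The raw-bytes helper is kept separate so
that it can be exercised on hand-built byte strings without ever constructing
a malformed code object.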

> Python should be easier, in large part because you have to give up sooner.
>
> Checks for type correctness in the PVM are done at runtime instead.  I
> think it's fair to say that a bytecode verifier is overwhelmingly "just an
> optimization":  if bytecode properties can be verified from static code
> analysis, runtime code isn't needed to verify them dynamically, or, in the
> absence of such runtime checks, static analysis plugs ways to provoke
> segfaults.

I think it's likely that both run-time check elimination and segfault plugging
can come out of this.  I'm not sure it's possible for us to know up front what
all the checks are, so I think a good requirement for a verification package
will be a way to register new or project-specific verifications.

Do you think there is a risk of exploitation?  For example, STORE_FAST, which
does a direct set into PyObject **fastlocals, could be used to write
beyond the bounds of the array.  Can this, or a stack over/underflow, be used
to execute arbitrary machine code?

> or check that C-level indexing into the co_consts vector is in bounds.
> Well, in the latter case, that is checked (at runtime, on every co_consts
> access) in a debug build, but not in a release build.

I've investigated that a bit, and STORE_CONST and LOAD_CONST use GETITEM, 
which uses PyTuple_GetItem, which does a bounds check at run time even in a 
production build.  I think.  LOAD_FAST and STORE_FAST seem to do no runtime 
bounds check.

-Michel



