[Python-Dev] Static analysis of CPython using coccinelle/spatch

Brett Cannon brett at python.org
Tue Nov 17 22:03:23 CET 2009


On Mon, Nov 16, 2009 at 12:27, David Malcolm <dmalcolm at redhat.com> wrote:
> Has anyone else looked at using Coccinelle/spatch[1] on CPython source
> code?

Not that anyone has mentioned on the list before.

>
> It's a GPL-licensed tool for matching semantic patterns in C source
> code. It's been used on the Linux kernel for detecting and fixing
> problems, and for autogenerating patches when refactoring
> (http://coccinelle.lip6.fr/impact_linux.php).  Although it's implemented
> in OCaml, it is scriptable using Python.
>
> I've been experimenting with using it on CPython code, both on the core
> implementation, and on C extension modules.
>
> As a test, I've written a validator for the mini-language used by
> PyArg_ParseTuple and its variants.  My code examines the types of the
> variables passed as varargs, and attempts to check that they are
> correct, according to the rules here
> http://docs.python.org/c-api/arg.html (and in Python/getargs.c)
>
> It can detect this old error (fixed in svn r34931):
> buggy.c:12:socket_htons:Mismatching type of argument 1 in ""i:htons"":
> expected "int *" but got "unsigned long *"
>
> Similarly, it finds the deliberate error in xxmodule.c:
> xxmodule.c:207:xx_roj:unknown format char in "O#:roj": '#'
>
> (Unfortunately, when run on the full source tree, I see numerous
> messages, and as far as I can tell, the others are false positives)
>
> You can see the code here:
> http://fedorapeople.org/gitweb?p=dmalcolm/public_git/check-cpython.git;a=tree
> and download using anonymous git in this manner:
> git clone git://fedorapeople.org/home/fedora/dmalcolm/public_git/check-cpython.git
>
> The .cocci file detects invocations of PyArg_ParseTuple and determines
> the types of the arguments.  At each matching call site it invokes
> python code, passing the type information to validate.py's
> validate_types.
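
For readers unfamiliar with the format mini-language, the kind of check
described above can be sketched in a few lines of Python. This is a
minimal illustration, not the code in the check-cpython repository: the
function name check_format_args, the EXPECTED_TYPES table, and the
message wording are assumptions made for this example. Only a handful of
single-character format units are handled; modifiers such as '#' and the
parenthesised tuple units are simply reported as unknown.

```python
# Hypothetical sketch of a PyArg_ParseTuple format checker.
# Covers only a small subset of the real mini-language.

# Format unit -> C type of the pointer that must be passed as a vararg.
EXPECTED_TYPES = {
    "i": "int *",
    "I": "unsigned int *",
    "h": "short *",
    "H": "unsigned short *",
    "l": "long *",
    "k": "unsigned long *",
    "f": "float *",
    "d": "double *",
    "O": "PyObject **",
}


def _normalize(c_type):
    """Drop all whitespace so 'PyObject * *' equals 'PyObject **'."""
    return "".join(c_type.split())


def check_format_args(fmt, arg_types):
    """Return a list of error messages for one call site.

    fmt       -- the format string literal, e.g. "i:htons"
    arg_types -- C types of the varargs, e.g. ["unsigned long *"]
    """
    errors = []
    # Everything after ':' (function name) or ';' (error message)
    # is not part of the format units.
    units = fmt
    for sep in (":", ";"):
        units = units.split(sep, 1)[0]

    index = 0
    for ch in units:
        if ch == "|":  # marks the start of optional arguments
            continue
        expected = EXPECTED_TYPES.get(ch)
        if expected is None:
            errors.append('unknown format char in "%s": \'%s\'' % (fmt, ch))
            return errors  # cannot line up the remaining arguments
        if index >= len(arg_types):
            errors.append('too few arguments for "%s"' % fmt)
            return errors
        actual = arg_types[index]
        if _normalize(actual) != _normalize(expected):
            errors.append(
                'Mismatching type of argument %d in "%s": '
                'expected "%s" but got "%s"'
                % (index + 1, fmt, expected, actual)
            )
        index += 1

    if index < len(arg_types):
        errors.append('too many arguments for "%s"' % fmt)
    return errors
```

Calling check_format_args("i:htons", ["unsigned long *"]) reports the
argument 1 mismatch, and check_format_args("O#:roj", ["PyObject **",
"long *"]) reports the unknown '#' character, mirroring the two messages
quoted earlier. A real checker would need the full table of format units
from the C API documentation and Python/getargs.c.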
>
> (I suspect it's possible to use spatch to detect reference counting
> antipatterns; I've also attempted 2to3 refactoring of C code using
> semantic patches, but so far macros tend to get in the way).
>
> Alternatively, are there any other non-proprietary static analysis tools
> for CPython?

Specific to CPython? No. But I had a chance to run practically every
major commercial static analysis tool over the code base back in 2006.
We also occasionally run valgrind over the code. But thanks to how we
have structured the code and the performance shortcuts we have taken,
static analysis tools easily get tripped up by CPython (as you have
discovered).

>
> Thoughts?

Running the tool over the code base and reporting any bugs it finds
would be appreciated.

-Brett


> Dave
>
> [1] http://coccinelle.lip6.fr/
>