[Python-Dev] [Python-checkins] r79397 - in python/trunk: Doc/c-api/capsule.rst Doc/c-api/cobject.rst Doc/c-api/concrete.rst Doc/data/refcounts.dat Doc/extending/extending.rst Include/Python.h Include/cStringIO.h Include/cobject.h Include/datetime.h Include/py_curses.h Include/pycapsule.h Include/pyexpat.h Include/ucnhash.h Lib/test/test_sys.py Makefile.pre.in Misc/NEWS Modules/_ctypes/callproc.c Modules/_ctypes/cfield.c Modules/_ctypes/ctypes.h Modules/_cursesmodule.c Modules/_elementtree.c Modules/_testcapimodule.c Modules/cStringIO.c Modules/cjkcodecs/cjkcodecs.h Modules/cjkcodecs/multibytecodec.c Modules/cjkcodecs/multibytecodec.h Modules/datetimemodule.c Modules/pyexpat.c Modules/socketmodule.c Modules/socketmodule.h Modules/unicodedata.c Objects/capsule.c Objects/object.c Objects/unicodeobject.c PC/VS7.1/pythoncore.vcproj PC/VS8.0/pythoncore.vcproj PC/os2emx/python27.def PC/os2vacpp/python.def Python/compile.c Python/getargs.c

Larry Hastings larry at hastings.org
Thu Mar 25 18:38:54 CET 2010


M.-A. Lemburg wrote:
> Backporting PyCapsule is fine, but the changes you made to all
> those PyCObject uses does not look backwards compatible.
>
> The C APIs exposed by the modules (e.g. the datetime module)
> are used in lots of 3rd party extension modules and changing
> them from PyCObject to PyCapsule is a major change in the
> module API.

You're right, my changes aren't backwards compatible.  I thought it was 
reasonable for four reasons:

1. The CObject API isn't safe.  It's easy to crash Python 2.6 in just a 
few lines by mixing and matching CObjects.  Switching Python to capsules 
prevents a class of exploits.  I've included a script at the bottom of 
this message that demonstrates three such crashes.  The script runs in 
Python 2 and 3, but 3.1 doesn't crash because it's using capsules.

2. As I just mentioned, Python 3.1 already uses capsules everywhere 
instead of CObjects.  Since part of the purpose of Python 2.7 is to 
prepare developers for the upgrade to 3.1, getting them to switch to 
capsules now is just one more way they are prepared.

3. Because CObject is unsafe, I want to deprecate it in 2.7, and if we 
ever make a 2.8 I want to remove it completely.

4. When Python publishes an API using a CObject, it describes the thing 
the CObject points to in a header file.  In nearly all cases that header 
file also provides a macro or inline function that does the importing 
work for you.  I changed those to use capsules too.  So if the 
third-party code uses the macro or inline function, all you need do is 
recompile it against 2.7 and it works fine.  Sadly I know of one 
exception: pyexpat.expat_CAPI.  The header file just describes the 
struct pointed to by the CObject, but callers
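
As a rough illustration of point 1, here is a toy model in pure Python. It is only a sketch and not the real C API: ToyCObject, ToyCapsule, toy_cobject_get and toy_capsule_get are hypothetical names, the dicts stand in for C structs, and the name strings are illustrative. It shows the idea that a CObject hands back whatever it holds, while a capsule's name lets the consumer reject the wrong object.

```python
class ToyCObject:
    """Toy stand-in for a CObject: wraps a value with no identifying tag."""

    def __init__(self, pointer):
        self.pointer = pointer


class ToyCapsule:
    """Toy stand-in for a capsule: wraps a value together with a name."""

    def __init__(self, pointer, name):
        self.pointer = pointer
        self.name = name


def toy_cobject_get(obj):
    # Nothing to check: whatever was stored is handed back as-is.
    return obj.pointer


def toy_capsule_get(capsule, expected_name):
    # Hand the value back only if the caller asked for the right name.
    if capsule.name != expected_name:
        raise ValueError("capsule name %r does not match %r"
                         % (capsule.name, expected_name))
    return capsule.pointer


# Two unrelated "API tables" (plain dicts standing in for C structs).
datetime_api = {"module": "datetime"}
cstringio_api = {"module": "cStringIO"}

# CObject style: a swapped-in object is accepted without complaint,
# even though the consumer wanted cstringio_api.
swapped = ToyCObject(datetime_api)
unchecked = toy_cobject_get(swapped)

# Capsule style: the same swap is rejected before the value is used.
swapped_capsule = ToyCapsule(datetime_api, "datetime.datetime_CAPI")
try:
    toy_capsule_get(swapped_capsule, "cStringIO.cStringIO_CAPI")
    rejected = False
except ValueError:
    rejected = True
```

In the real interpreter the unchecked case means C code dereferencing a struct of the wrong layout, which is where the crashes in the script at the bottom of this message come from.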


I can suggest four ways to ameliorate the problem.

First, we could do as Antoine Pitrou suggests on the bug (issue 7992): 
wherever the CObject used to be published as a module attribute to 
expose an API, we could provide both a CObject and a capsule; internally 
Python would only use the capsules.  This would allow third-party 
libraries to run against 2.7 unchanged.  The major problem with this is 
that third-party libraries would still be vulnerable to the 
mix-and-match CObject crash.  A secondary, minor concern: obviously we'd 
store the CObject attribute with the existing name, and the capsule 
attribute would have to get some new name.  But in Python 3.1, these 
attributes already expose a capsule.  Therefore, people who convert to 
using the capsules now would have to convert again when moving to 3.1.

Second, we could make CObject internally support unpacking capsules.  If 
you gave a capsule to PyCObject_AsVoidPtr() it would unpack it and 
return the pointer within.  (We could probably also map the capsule 
"context" to the CObject "desc", if any of the Python use cases needed 
it.)  I wouldn't change anything else about CObjects; creating and using 
them would continue to work as normal.  This would also allow 
third-party libraries to run against Python 2.7 unchanged.  The only 
problem is that it's unsafe, as indeed allowing any use of 
PyCObject_AsVoidPtr() is unsafe.
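
The second approach can be sketched with a toy model in pure Python. This is an assumption-laden illustration, not the actual C implementation: ToyCObject, ToyCapsule and toy_cobject_as_void_ptr are hypothetical names that only mimic the proposed behaviour of PyCObject_AsVoidPtr().

```python
class ToyCObject:
    """Toy stand-in for a CObject: a value plus an optional description."""

    def __init__(self, pointer, desc=None):
        self.pointer = pointer
        self.desc = desc


class ToyCapsule:
    """Toy stand-in for a capsule: a value, a name, and an optional context."""

    def __init__(self, pointer, name, context=None):
        self.pointer = pointer
        self.name = name
        self.context = context


def toy_cobject_as_void_ptr(obj):
    # Proposed behaviour: accept a capsule as well as a CObject and
    # return the wrapped value either way.  The capsule name is never
    # consulted, so no safety is gained on this path.
    if isinstance(obj, (ToyCObject, ToyCapsule)):
        return obj.pointer
    raise TypeError("expected a ToyCObject or a ToyCapsule")


api_table = {"module": "datetime"}

# An unchanged legacy caller keeps working when handed a capsule...
from_capsule = toy_cobject_as_void_ptr(
    ToyCapsule(api_table, "datetime.datetime_CAPI"))

# ...but a capsule published under any other name is unpacked just the same.
wrong_table = {"module": "unicodedata"}
from_wrong = toy_cobject_as_void_ptr(
    ToyCapsule(wrong_table, "unicodedata.ucnhash_CAPI"))
```

The sketch makes the trade-off visible: old callers run unchanged, and the mix-and-match problem stays exactly where it was.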

Third, I've been pondering writing a set of preprocessor macros, shipped 
in their own header file distributed independently of Python and 
released to the public domain, that would make it easy to use either 
CObjects or capsules depending on what version of Python you were 
compiling against.  Obviously, using these macros would require a source 
code change in the third-party library.  But these macros would make it 
a five-minute change.  This could complement the first or second approaches.

Fourth, we could back out of the changes to published APIs and convert 
them back to CObjects.  -1.


Your thoughts?


/larry/

-----

import sys
def log(message):
    print(message)
    sys.stdout.flush()

def crash1():
    log("Running crash1...")
    try:
        import datetime
        import cStringIO
        cStringIO.cStringIO_CAPI = datetime.datetime_CAPI

        import cPickle
        s = cPickle.dumps([1, 2, 3])
    except ImportError:
        # This test isn't translatable to Python 3.
        pass
    log("Survived crash1!")


def crash2():
    log("Running crash2...")
    try:
        import unicodedata
        import _socket
        _socket.CAPI = unicodedata.ucnhash_CAPI
        import ssl
    except AttributeError:
        # Congratulations, you didn't crash.
        pass
    log("Survived crash2!")


def crash3():
    log("Running crash3...")
    try:
        import unicodedata
        import _multibytecodec
        _multibytecodec.__create_codec(unicodedata.ucnhash_CAPI)

    except ValueError:
        # Congratulations, you didn't crash.
        pass
    log("Survived crash3!")


if len(sys.argv) > 1:
    if sys.argv[1] == '1':
        crash1()
        sys.exit(0)
    elif sys.argv[1] == '2':
        crash2()
        sys.exit(0)
    elif sys.argv[1] == '3':
        crash3()
        sys.exit(0)

crash1()
crash2()
crash3()

