[Cython] Rewriting/compiling parts of CPython's stdlib in Cython

Robert Bradshaw robertwb at math.washington.edu
Tue Mar 22 08:14:45 CET 2011

On Mon, Mar 21, 2011 at 11:10 PM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Hi,
> there seems to be quite some interest in a project to get parts of CPython
> and specifically its stdlib rewritten in Cython. I've copied the latest
> python-dev mail below. The relevant part of the thread is here:
> http://thread.gmane.org/gmane.comp.python.devel/122273/focus=122798


> In short, we have strong supporters, but Guido has understandable doubts
> about a new (and quite large) dependency and potential semantic
> deviations.

Reading the list, I think others on the list overestimate the semantic
differences. Mostly we're talking about things like is vs. equality
for floating point numbers and tracebacks (at least for un-annotated
code). It's a valid point that Cython is still under such active

> But there seem to be cases where slight changes would be
> acceptable that Cython compiled modules might introduce, such as emitting
> different exception messages, changing Python classes into extension
> classes, or even preventing monkey patching in modules that are backed by C
> modules anyway.
> It would be helpful to get support from the side of external distributors
> that use Cython already, e.g. Sage, Enthought/SciPy, ActiveState, etc. If
> they agreed to test the Cython generated stdlib modules in their
> distributions, we could get user feedback that would allow python-dev to
> take a well founded decision.
> Do we have any volunteers for trying this out? Both on the side of
> distributors and implementors?

I think Sage might be willing to give it a try. I'll ask tomorrow as
part of a talk I'm giving. Note that due to the way Python's import
mechanism works, it would be easy (as a first pass) to make a
"cythonize this Python install" script which would just compile (a
subset of) the .py files and drop .so files next to them. This would
require no messing with the Python build system or distribution, and
it would be easy to test, benchmark, and clean up.
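
A minimal sketch of what such a first pass might look like, assuming a
present-day Cython that ships the `cythonize` command with its `-i`
(build in place) option. The helper names (find_candidates,
build_command, generated_paths, clean_up) are hypothetical, and the
sketch only prints the commands rather than running them:

```python
import shlex
import sysconfig
from pathlib import Path

# Suffix of compiled extension modules on this platform (a ".so" or ".pyd" variant).
EXT_SUFFIX = sysconfig.get_config_var("EXT_SUFFIX") or ".so"


def find_candidates(stdlib_dir, module_names):
    """Return the .py files for the chosen subset of top-level modules."""
    root = Path(stdlib_dir)
    return [root / f"{name}.py" for name in module_names
            if (root / f"{name}.py").is_file()]


def build_command(py_file):
    """Command line that would compile one file in place (not executed here)."""
    return ["cythonize", "-i", str(py_file)]


def generated_paths(py_file):
    """Files an in-place build is assumed to leave next to the .py file:
    the extension module and the intermediate C source."""
    return [py_file.with_name(py_file.stem + EXT_SUFFIX),
            py_file.with_suffix(".c")]


def clean_up(py_files):
    """Undo the experiment by deleting generated files; return what was removed."""
    removed = []
    for py_file in py_files:
        for target in generated_paths(py_file):
            if target.exists():
                target.unlink()
                removed.append(target)
    return removed


if __name__ == "__main__":
    # Dry run against this interpreter's stdlib; the module subset is an example.
    stdlib = sysconfig.get_paths()["stdlib"]
    for path in find_candidates(stdlib, ["difflib", "heapq"]):
        print(shlex.join(build_command(path)))
```

Since the .py files are never touched, deleting the generated files
should restore the original install.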

> At the current state of affairs, the implementation could still be financed
> by a Python backed GSoC project, although it would be cool if more users
> could just step up and simply try to compile and optimise stdlib modules
> (preferably without major changes to the code). It's certainly a great way
> to show off your Cython skills :). I gave it a try with difflib and it
> turned out to be quite easy.
> http://blog.behnel.de/index.php?p=155
> Reimplementing existing C modules in Cython might, however, be more
> interesting for python-dev, but also be a larger undertaking. So a GSoC
> might be worth running on that.

I think that's a great idea. Would you be willing to mentor such a project?

> Note that the latest Cython release does not have generator support yet, and
> Vitja's branch on github is not very stable. We will try to get it up to
> speed and merged during the workshop next week, at which point it will make
> more sense to get this project started than right now.
> Stefan
> Guido van Rossum, 22.03.2011 00:04:
>> On Mon, Mar 21, 2011 at 3:44 PM, "Martin v. Löwis" wrote:
>>> Am 21.03.2011 11:58, schrieb Stefan Behnel:
>>>> Guido van Rossum, 21.03.2011 03:46:
>>>>> Thanks for the clarifications. I now have a much better understanding
>>>>> of what Cython is. But I'm not sold. For one, your attitude about
>>>>> strict language compatibility worries me when it comes to the stdlib.
>>>> Not sure what you mean exactly. Given our large user base, we do worry a
>>>> lot about things like backwards compatibility, for example.
>>>> If you are referring to compatibility with Python, I don't think anyone
>>>> in the project really targets Cython as a drop-in replacement for a
>>>> Python runtime. We aim to compile Python code, yes, and there's a
>>>> hand-wavy idea in the back of our head that we may want a plain Python
>>>> compatibility mode at some point that will disable several important
>>>> optimisations.
>>> I think that's the attitude Guido worries about: if you don't have the
>>> desire to provide 100% Python compatibility under all circumstances
>>> (i.e. including if someone passes parameters of "incorrect" types),
>>> then there is very little chance that we would replace a Python module
>>> with a Cython-compiled one.
>>> The only exception would be cases where the Python semantics is murky
>>> (e.g. where Jython or so actually behaves differently for the same
>>> Python code, and still claims language conformance). E.g. the exact
>>> message on a TypeError might change when compiling with Cython,
>>> but the cases in which you get a TypeError must not change.
>> One other significant use case is the situation where we have an
>> optional replacement module written in C (e.g. heapqmodule.c vs.
>> heapq.py). There are usually many semantic differences between the C
>> and pure-python module that we don't care about (e.g. monkeypatching
>> won't work).
>> The size of Cython as a dependency and its development speed are still
>> problems though. In general for the core I don't think we want the
>> repo to contain generated code that can only be regenerated using a
>> 3rd party dependency. (True, we have a few generated files, e.g.
>> configure; but in that case the generator -- autoconf -- is a
>> standard installed tool on Linux and is used by most open source
>> projects.)
>> Still, I think it would be great if someone tried something like this
>> for a specific stdlib module and came back with a story about the
>> experience, rather than having a theoretical discussion about possible
>> pros and cons.