[pypy-dev] PyPy progress report
Armin Rigo
arigo at tunes.org
Wed Oct 3 20:01:21 CEST 2007
Hi all,
A quick note to tell you some new things that PyPy can do since the 1.0
release: more built-in modules, PyPy-on-Java-VMs, and secure sandboxing
for Python.
In more detail:
* Threads and weakrefs are now working properly. (For now there is a
GIL - a global interpreter lock - just like in CPython.) (As in the
1.0 release, you can also have stackless & greenlets instead of threads.)
* Some missing built-in modules have been written, like zlib.
The complete list of known-to-be-working built-in modules is:
_codecs, gc, _weakref, array, marshal, errno,
math, _sre (regular expressions), operator, symbol, _random,
_socket, unicodedata, mmap, fcntl, time, select,
crypt, signal, readline (only readline.readline()),
termios, zlib, struct, md5, sha
Of course, all the parts of the Python 2.4 standard library that are
written in Python work out of the box. Also, the following modules
have a working pure
Python implementation in PyPy: binascii, cPickle, cStringIO, cmath,
collections, datetime, functional, imp, itertools.
* These two facts together mean that pypy-c can now run some very
serious programs, like the server for the bub-n-bros game :-)
I've also successfully used py.execnet for a while.
* Antonio and Niko are doing some very good work on the Java backend.
With minor tweaks we can now build a pypy-jvm. FWIW it seems to give
better performance than Jython. It's not 100% comparable because
pypy-jvm is not thread-aware or thread-safe yet, while Jython gives
you free threading; OTOH pypy-jvm doesn't generate Java bytecodes
on-the-fly (although it might do so in the future). All in all
pypy-jvm is still experimental, but stay tuned :-)
* Sandboxing: it is possible to compile a version of pypy-c that runs
fully "virtualized", i.e. where an external process controls all
input/output. Such a pypy-c is a secure sandbox: it is safe to run
any untrusted Python code with it. The Python code cannot see or
modify any local file except via interaction with the external
process. It is also impossible to do any other I/O, or to consume more
than a set amount of RAM, CPU time or real time. This works with no
OS support at all - just ANSI C code generated in a careful way. It's
the kind of thing you could embed in a browser plug-in, for example
(it would be safe even if it wasn't run as a separate process,
actually).
For comparison, trying to plug CPython into a special virtualizing C
library is not only OS-specific, but unsafe, because one of the known
ways to segfault CPython could be used by an attacker to trick CPython
into issuing malicious system calls directly. The C code generated by
PyPy is not segfaultable, as long as our code generators are correct -
that's far fewer lines of code to trust. For the paranoid, in
this case we also generate systematic run-time checks against buffer
overflows.
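To make the "external process controls all input/output" idea a bit more
concrete, here is a toy sketch of what the controlling side could look
like. It is not the actual PyPy sandbox protocol: the class name
VirtualIOController, the request tuples and the max_requests budget are
all made up for illustration, and the "sandboxed" side is simulated by
plain function calls instead of a real pipe to a separate process.

```python
class VirtualIOController(object):
    """Hypothetical controller: the only place where 'files' really exist."""

    def __init__(self, virtual_files, max_requests=1000):
        self.virtual_files = dict(virtual_files)  # path -> contents (read-only)
        self.open_files = {}                      # fd -> [contents, position]
        self.next_fd = 3
        self.remaining = max_requests             # crude stand-in for a resource limit

    def handle(self, request):
        """Answer one (name, args) request coming from the untrusted side."""
        self.remaining -= 1
        if self.remaining < 0:
            return ("error", "request budget exhausted")
        handlers = {"open": self.do_open,
                    "read": self.do_read,
                    "close": self.do_close}
        try:
            name, args = request
            if name not in handlers:
                # Anything not explicitly whitelisted is refused.
                return ("error", "operation not permitted: %s" % (name,))
            return handlers[name](*args)
        except (TypeError, ValueError):
            return ("error", "malformed request")

    def do_open(self, path, mode):
        if mode != "r":
            return ("error", "read-only file system")
        if path not in self.virtual_files:
            return ("error", "no such file")
        fd = self.next_fd
        self.next_fd += 1
        self.open_files[fd] = [self.virtual_files[path], 0]
        return ("ok", fd)

    def do_read(self, fd, size):
        if fd not in self.open_files:
            return ("error", "bad file descriptor")
        if size < 0:
            return ("error", "invalid size")
        entry = self.open_files[fd]
        data, pos = entry
        chunk = data[pos:pos + size]
        entry[1] = pos + len(chunk)
        return ("ok", chunk)

    def do_close(self, fd):
        if self.open_files.pop(fd, None) is None:
            return ("error", "bad file descriptor")
        return ("ok", None)


# The untrusted side only ever sees what the controller chooses to expose.
controller = VirtualIOController({"/tmp/hello.txt": "hello sandbox\n"})
status, fd = controller.handle(("open", ("/tmp/hello.txt", "r")))
print(controller.handle(("read", (fd, 5))))
print(controller.handle(("open", ("/etc/passwd", "r"))))
print(controller.handle(("unlink", ("/tmp/hello.txt",))))
```

The point of the sketch is only the shape of the design: every operation
goes through one small whitelist, so the policy lives entirely in the
controller and not in the code being run.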
For more information on this topic please see:
http://codespeak.net/pypy/dist/pypy/doc/sandbox.html
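Going back to the thread, weakref and built-in module items above: if you
want to try them on your own build, a small smoke test along these lines
might be a starting point. It is only a sketch, written in present-day
Python syntax; the function names are invented here, and it exercises
just threading, weakref, zlib and struct rather than the whole list.

```python
import gc
import struct
import threading
import weakref
import zlib


class Node(object):
    pass


def check_weakref():
    node = Node()
    ref = weakref.ref(node)
    alive_before = ref() is node
    del node
    # Collect explicitly so the check does not depend on reference counting.
    gc.collect()
    return alive_before and ref() is None


def check_threads(num_threads=4, increments=1000):
    counter = [0]
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            lock.acquire()
            try:
                counter[0] += 1
            finally:
                lock.release()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0] == num_threads * increments


def check_zlib_struct():
    # "<IH" is a little-endian 4-byte unsigned int plus a 2-byte unsigned short.
    payload = struct.pack("<IH", 123456, 789) * 50
    restored = zlib.decompress(zlib.compress(payload))
    return restored == payload and struct.unpack("<IH", restored[:6]) == (123456, 789)


print(check_weakref(), check_threads(), check_zlib_struct())
```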
Thanks,
Armin & cfbolz around