[pypy-dev] PyPy progress report

Armin Rigo arigo at tunes.org
Wed Oct 3 20:01:21 CEST 2007


Hi all,

A quick note to tell you some new things that PyPy can do since the 1.0
release: more built-in modules, PyPy-on-Java-VMs, and secure sandboxing
for Python.

In more detail:


* Threads and weakrefs are now working properly.  (For now there is a
  GIL - global interpreter lock - just like in CPython.)  (As in the
  1.0 release, you can also have stackless & greenlets instead of
  threads.)
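
  As a rough illustration of what "working properly" means here, the
  sketch below starts a few threads that share a counter and then checks
  that a weak reference goes dead once its target is gone.  The names
  (Node, worker, counter) are made up for this example, and the explicit
  gc.collect() is there because a non-refcounting GC may only clear the
  weakref at the next collection.

```python
import gc
import threading
import weakref

class Node(object):
    """Hypothetical object used only as a weakref target."""
    pass

counter = {"value": 0}
lock = threading.Lock()

def worker(n):
    # Increment the shared counter n times, holding the lock each time.
    for _ in range(n):
        lock.acquire()
        try:
            counter["value"] += 1
        finally:
            lock.release()

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

node = Node()
ref = weakref.ref(node)
alive_before = ref() is node      # the target is still strongly referenced
del node
gc.collect()                      # needed when the GC is not refcounting
alive_after = ref() is not None   # the weakref should now be dead
```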


* Some missing built-in modules have been written, like zlib.
  The complete list of known-to-be-working built-in modules is:

     _codecs, gc, _weakref, array, marshal, errno,
     math, _sre (regular expressions), operator, symbol, _random,
     _socket, unicodedata, mmap, fcntl, time, select,
     crypt, signal, readline (only readline.readline()),
     termios, zlib, struct, md5, sha

  Of course, all the parts of the Python 2.4 standard library that are
  written in Python work out of the box.  Also, the following modules
  have a working pure Python implementation in PyPy: binascii, cPickle,
  cStringIO, cmath, collections, datetime, functional, imp, itertools.
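
  A quick way to exercise a few of the modules listed above is a
  round-trip check like the following.  This is only a sketch, not part
  of PyPy's test suite, and the variable names are arbitrary.

```python
import marshal
import struct
import zlib

# Pack three unsigned 32-bit little-endian integers: 12 bytes in total.
record = struct.pack("<3I", 1, 2, 3)
blob = record * 100

# zlib: compressing and decompressing must give back the original data.
packed = zlib.compress(blob)
restored = zlib.decompress(packed)

# struct: unpacking must give back the original integers.
values = struct.unpack("<3I", record)

# marshal: a dumps/loads round trip must preserve a simple list.
copy = marshal.loads(marshal.dumps(list(values)))
```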


* These two facts together mean that pypy-c can now run some very
  serious programs, like the server for the bub-n-bros game :-)
  I've also successfully used py.execnet for a while.


* Antonio and Niko are doing some very good work on the Java backend.
  With minor tweaks we can now build a pypy-jvm.  FWIW it seems to give
  better performance than Jython.  It's not 100% comparable because
  pypy-jvm is not thread-aware or thread-safe yet, while Jython gives
  you free threading; OTOH pypy-jvm doesn't generate Java bytecodes
  on-the-fly (although it might do so in the future).  All in all
  pypy-jvm is still experimental, but stay tuned :-)


* Sandboxing: it is possible to compile a version of pypy-c that runs
  fully "virtualized", i.e. where an external process controls all
  input/output.  Such a pypy-c is a secure sandbox: it is safe to run
  any untrusted Python code with it.  The Python code cannot see or
  modify any local file except via interaction with the external
  process.  It is also impossible to do any other I/O or consume more
  than some amount of RAM or CPU time or real time.  This works with no
  OS support at all - just ANSI C code generated in a careful way.  It's
  the kind of thing you could embed in a browser plug-in, for example
  (it would be safe even if it wasn't run as a separate process,
  actually).

  For comparison, trying to plug CPython into a special virtualizing C
  library is not only OS-specific, but unsafe, because any one of the
  known ways to segfault CPython could be used by an attacker to trick
  CPython into issuing malicious system calls directly.  The C code
  generated by PyPy is not segfaultable, as long as our code generators
  are correct - that's fewer lines of code to trust.  For the paranoid,
  in this case we also generate systematic run-time checks against
  buffer overflows.

  For more information on this topic please see:
  http://codespeak.net/pypy/dist/pypy/doc/sandbox.html
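
  To make the "external process controls all input/output" idea more
  concrete, here is a toy model of the two sides.  It is an assumption
  for illustration only: the request names, the tuple format and the
  VIRTUAL_FILES table are invented, and a plain function call stands in
  for the pipe that would connect the two processes.  It is not the
  actual protocol used by the sandboxed pypy-c.

```python
import marshal

# Hypothetical virtual file system that the controller chooses to expose.
VIRTUAL_FILES = {"/tmp/hello.txt": "hello from the controller\n"}

def handle_request(raw):
    """Controller side: decode one request, apply the policy, encode a reply."""
    name, args = marshal.loads(raw)
    if name == "read_file":
        path = args[0]
        if path in VIRTUAL_FILES:
            return marshal.dumps(("ok", VIRTUAL_FILES[path]))
        return marshal.dumps(("error", "no such file"))
    # Anything that is not explicitly allowed is refused.
    return marshal.dumps(("error", "operation not permitted"))

def sandboxed_call(name, *args):
    """Sandboxed side: never touches the OS, only sends a serialized request."""
    reply = handle_request(marshal.dumps((name, args)))
    return marshal.loads(reply)

allowed = sandboxed_call("read_file", "/tmp/hello.txt")
missing = sandboxed_call("read_file", "/etc/passwd")
refused = sandboxed_call("unlink", "/tmp/hello.txt")
```

  The point of the design is that the policy lives entirely in the
  controller: the sandboxed side can ask for anything, but it only ever
  gets back what the controller decides to return.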


Thanks,

Armin & cfbolz around


