[pypy-dev] about bz2 and more

Armin Rigo arigo at tunes.org
Wed Jul 26 11:18:25 CEST 2006


Hi Lawrence,

On Wed, Jul 26, 2006 at 10:50:40AM +0200, Lawrence Oluyede wrote:
> Bz2 is kinda a beast module doing heavy operations with function
> pointers, buffering and so on. The problem I encountered lies in the
> initialization of the bz2 file object. It has support for buffering,
> universal newlines and more. All this stuff is exposed in CPython via
> file()/open() factory functions. Those are not supported at RPython
> level and I can't write the class at app-level.

I think the first thing to do is support a "raw" interface that gives
access to unbuffered binary data.  Then for PyPy we can simply plug in
the existing buffering and universal-newline code, which is already
written at app-level in pypy/lib/_sio.py.

It means that if you write a class with the correct interface, and
expose it to app-level in a mixed-module, then at least in PyPy you can
write the rest at app-level: in bz2.open(), instantiate the "raw" class
from the interp-level part of the module, then import the _sio module
and build the correct stack of buffering and translating classes on top
of your "raw" class.

See _inithelper() in _file.py for how to build such a stack from the
standard mode/buffering arguments.  Maybe abstracting some of this code
out into a helper function in _sio or _file would be useful.
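
A minimal sketch of such a helper, for illustration only.  It does not
use the real _sio or _file code: MemoryRaw, BufferedReader,
UniversalNewlineReader and build_stack are hypothetical names, only
reading is handled, and the only interface assumed of the "raw" object
is a read(n) method returning bytes (empty at end of file).

```python
class MemoryRaw:
    """Hypothetical "raw" layer: unbuffered binary reads from a bytes object."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def read(self, n):
        chunk = self._data[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk


class BufferedReader:
    """Fetches large chunks from the layer below and serves reads from a buffer."""

    def __init__(self, base, bufsize=8192):
        self._base = base
        self._bufsize = bufsize
        self._buf = b""

    def read(self, n):
        while len(self._buf) < n:
            chunk = self._base.read(self._bufsize)
            if not chunk:
                break  # end of file: return whatever is left
            self._buf += chunk
        result, self._buf = self._buf[:n], self._buf[n:]
        return result


class UniversalNewlineReader:
    """Translates "\\r\\n" and "\\r" to "\\n"; a read may return fewer than n bytes."""

    def __init__(self, base):
        self._base = base
        self._skip_lf = False  # True if the previous chunk ended with "\r"

    def read(self, n):
        data = self._base.read(n)
        if self._skip_lf and data.startswith(b"\n"):
            # Second half of a "\r\n" pair split across two reads: drop it,
            # and read again if nothing else is left in this chunk.
            data = data[1:] or self._base.read(n)
        self._skip_lf = data.endswith(b"\r")
        return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def build_stack(raw, mode="r", buffering=-1):
    """Wrap a raw reader according to file()-style mode/buffering arguments."""
    if "w" in mode or "a" in mode or "+" in mode:
        raise ValueError("this sketch only supports reading")
    stream = raw
    if buffering != 0:  # 0 means unbuffered
        # Negative or 1 (line-buffered in CPython) falls back to a default size.
        bufsize = 8192 if buffering < 0 or buffering == 1 else buffering
        stream = BufferedReader(stream, bufsize)
    if "U" in mode:
        stream = UniversalNewlineReader(stream)
    return stream
```

With this, build_stack(MemoryRaw(data), "rU") gives a buffered,
newline-translating reader, while build_stack(raw, "rb", 0) hands back
the raw object unchanged.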

An example of a "raw" class is _sio.DiskFile, which uses os.read()
internally.  For bz2 you'd build a class with the same interface, based
on ctypes calls instead.
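
As a rough illustration of the shape such a class could take, here is a
sketch that runs on a current CPython.  It is not the PyPy module: the
standard library's bz2.BZ2Decompressor stands in for the ctypes calls
into libbz2, RawBZ2Reader is a hypothetical name, and the interface is
reduced to an assumed read(n)/close() pair rather than the full set of
methods DiskFile provides.

```python
import bz2
import os


class RawBZ2Reader:
    """Hypothetical "raw" reader: decompressed bytes from a file descriptor.

    No line handling and no newline translation happen here; those are
    left to the layers stacked on top.
    """

    def __init__(self, fd):
        self._fd = fd
        # Stand-in for the ctypes-based decompression state.
        self._decomp = bz2.BZ2Decompressor()
        # Decompressed bytes not yet handed out (one compressed chunk
        # can expand to more than the caller asked for).
        self._pending = b""

    def read(self, n):
        """Return up to n decompressed bytes, or b"" at end of stream."""
        while not self._pending and not self._decomp.eof:
            chunk = os.read(self._fd, 4096)
            if not chunk:
                break  # underlying file ended (possibly truncated)
            self._pending = self._decomp.decompress(chunk)
        result, self._pending = self._pending[:n], self._pending[n:]
        return result

    def close(self):
        os.close(self._fd)
```

An object like this only has to deliver binary data; buffering and
universal newlines can then be layered on top of it without the class
knowing about them.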

IMHO this all shows that writing buffering and universal newlines
explicitly and at app-level is a great idea :-) Essentially, with almost
no extra effort we could provide users with a PyPy-specific module that
plugs buffering and translating on top of any "raw" reader/writer that
they provide.


A bientot,

Armin


