[pypy-dev] change of strategy for the py3k branch?

Wed May 30 10:07:46 CEST 2012

Hi all,

after some months of work on the py3k branch, I realized that the current
strategy/workflow does not scale well and thus I'd like to change it.

For those who are not aware, currently we have the default branch where the
main development is done, and which includes code for both the rpython
translator toolchain and the python 2 interpreter.

The py3k branch does not touch the translator toolchain, but modifies the
python 2 interpreter to make it a py3k interpreter. These changes actually
destroy the python 2 semantics, which means that the long-term plan is to
never merge py3k into default, but to keep the development in parallel and
regularly merge default into py3k, so that py3k benefits from the various
improvements in the JIT, GC, etc.

In the past months, I ended up spending a considerable amount of time
resolving merge conflicts. This happens every time someone modifies
something in the python 2 interpreter, for example to apply a cool new
optimization.  While on the one hand it is cool to automatically get new
optimizations on py3k, on the other hand it is a blocker which stops me from
working on it effectively.

After a bit of discussion on IRC, I propose to solve this problem by detaching
the development of the py3k interpreter from the development of the python 2 one.

Pros:
  - faster development of py3k
  - lower entry barrier for new contributors, because the relationship between
the various parts will be much simpler
  - it will be straightforward to apply the new features of the translator
toolchain to the py3k branch
  - it will be easier to split the toolchain from the actual interpreter the
day we finally decide to do it

Cons:
  - we will need to manually port the optimizations done in the interpreter on
the default branch to py3k. Note that right now it's not really "automatic"
anyway, because merging is painful.

If we decide to go this route, the next question is: where to store the
code? I think there are two main solutions:

1) add a new "pypy/py3k" directory into which we copy all the relevant
modules, e.g. pypy/py3k/interpreter, pypy/py3k/objspace/std,
pypy/py3k/modules.

2) start a completely new repository which contains only the code for py3k.

Solution (2) is better and cleaner in theory. However, I fear it would soon
become a mess to handle, because every change in the translator toolchain
would potentially break py3k. I don't want a situation in which we say "yes,
you can build py3k but only if you take revision XXX and you use revision YYY
of the toolchain, unless the phase of the moon is empty".

Solution (1) is more practical and it would probably lead to fewer problems in
the short term.  For now, I would still keep the code in the py3k branch, so
the normal development of pypy would not be affected.

Before doing it, I'd like to hear opinions and comments, in particular from
people who have already worked on py3k and/or are generally interested in it.
Please be constructive :-).

ciao,
Anto