[pypy-dev] request for sprint planning discussion

Mon Dec 1 17:24:54 CET 2003

Hello pypy and especially hello Amsterdam sprinters (apparently 13),

the Amsterdam sprint is only two weeks off and i think we need some more
discussion and overview about what we intend to do in Amsterdam. First
let me note that while Armin and I were hoping to make a first public
release in Amsterdam, this is not of paramount importance. I think that
it's more important to have a sprint that everybody enjoys. 

Judging from IRC discussions and feedback from some sprinters this may
imply that we take a somewhat different approach than announced before:
revisiting various parts of the current architecture, learning about and
improving/documenting them. It's probably more fun to hack together at
some documentation than doing it alone :-) Please feel encouraged
(especially sprint participants) to ask questions, comment or suggest
different/more ideas, goals and wishes for our sprint. 

Let me present some ideas that are currently on the table:

- revisit/complete support for interpreter<->applevel interactions [1]
  This code (interpreter/gateway.py and friends) was mostly hacked by
  Armin and me and would enjoy wider involvement and improvement.  For one,
  interpreter-level objects like frame and function objects need to be
  correctly exposed at application level (some bits and pieces missing). 
  The "import dis; dis.dis(dis.dis)" goal finally needs to work!
  (So far only "import dis; dis.dis(dis.dis.func_code)" works.)

- currently you can mix interp-level code and app-level code (see e.g.
  module/builtin.py).  While you can access app-level objects from
  interp-level the reverse is not true: there is no general way to access
  interp-level objects directly from app-level -- unless the interp-level
  objects provide specific hooks (e.g. pypy_getattr() in pyframe.py). 

  Accessing interpreter-level objects from app-level would e.g. be useful
  if we want to *define* the complete python type hierarchy including
  __new__ methods in our standard "types.py" file.  Note that e.g.
  StdObjSpace doesn't really care about this: most of it would work
  just fine without having such a type hierarchy.  However, types.py would
  then *define* the actual types and their relations completely at app-level.
  The interp-level objects would of course need to correlate to this
  definition. Along the way, it might be nice if we could access
  interp-level objects directly from app-level:

  class int(type):
    def __new__(cls, ...):
        from ... import W_IntObject  # our stdobjspace-implementation 
        return W_IntObject(arg)      # gets invoked here ... 

  this obviously needs more thought but i hope the idea is understandable. 

- Of course, there is still a lot of work with Annotation/Translation 
  but this has been mentioned in previous postings. Note that this part 
  of the source tree is pretty independent from the rest of pypy. Recently 
  Armin and I have started to refactor annotation code after we also 
  implemented the "Berlin model" of the flowgraph-structures 
  (objspace/flow/model.py). The idea is to have a rather 
  general and somewhat efficient annotation/query engine. 
  The beginnings are in the new pypy/annotation directory, including
  an (incomplete) README.txt.

- frontends: i don't know how many people have experience hacking with
  pygame and/or game architectures.  It probably doesn't make much sense
  if only Armin and I want to or can do it. Of course, Michael Hudson has 
  done some stuff in that area, too, but he had to cancel his participation.
  However, we can start with writing tools that e.g. list all 
  space operations for a given python function. There also is 
  tool/methodChecker.py which tries to list the "implementedness"  
  of app-visible functions/methods of types.  Both writing and using 
  tools like this helps in understanding how pypy works. 

  These tools could easily be reused from whatever frontend with the
  following approach: Let PyPy run in a most flexible, insecure but 
  simple 15-liner application-server that simply receives and executes 
  remote python code/string objects.  Thus you can send the 
  "methodChecker/showspaceops" cmdline tools to a remote server
  and receive the results (e.g. over a redirected sys.stdout/err).
  (i might commit some simple code for this mechanism before the 
  sprint to a src/pyappserver directory and notify the list). 

- completing stdobjspace and builtins: there is still a lot to do.  
  Rocco Moretti recently worked in the direction of (and suggested 
  as a goal) getting 'regrtest.py' to pass on PyPy as much as possible. 
  An important missing piece probably is getting a PyPy implementation 
  of the cpython import mechanism (our current module/builtin module
  __import__ implementation is just a simplistic hack). 
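
The "15-liner application-server" from the frontends item above could look
roughly like the following. This is only a minimal sketch, assuming modern
Python 3 and plain TCP sockets; the names run_source, handle_connection,
serve and remote_exec are made up for illustration and nothing here is
existing PyPy code. It is deliberately insecure, as described above: whatever
source string arrives gets executed.

```python
import contextlib
import io
import socket
import traceback


def run_source(source):
    """Execute a Python source string and return everything it printed."""
    out = io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(out):
        try:
            exec(compile(source, "<remote>", "exec"), {"__name__": "__remote__"})
        except Exception:
            # The traceback goes to the redirected stderr, so the client sees it.
            traceback.print_exc()
    return out.getvalue()


def read_all(conn):
    """Read bytes until the peer shuts down its sending side."""
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks)


def handle_connection(conn):
    """Receive one source string, run it, send back the captured output."""
    try:
        source = read_all(conn).decode("utf-8")
        conn.sendall(run_source(source).encode("utf-8"))
    finally:
        conn.close()


def serve(host="127.0.0.1", port=8888):
    """Accept connections forever, one request per connection (port is arbitrary)."""
    with socket.create_server((host, port)) as server:
        while True:
            conn, _addr = server.accept()
            handle_connection(conn)


def remote_exec(conn, source):
    """Client side: send source over a connected socket and return the reply."""
    conn.sendall(source.encode("utf-8"))
    conn.shutdown(socket.SHUT_WR)  # tells the server the request is complete
    return read_all(conn).decode("utf-8")
```

A frontend would then connect with socket.create_connection((host, port)) and
pass the source text of a tool to remote_exec(); the reply is whatever the
tool wrote to stdout/stderr on the server side.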

Basically i suggest we try to plan the sprint so that everybody 
gets educated enough to feel at home with code and concepts of PyPy. 
Ideally, everyone feels able to take initiatives of their own. 
It probably would make sense to prepare some introductory talks about

  the interpreter (byte code dispatching/implementation/exceptions ...)
  the stdobjspace (multimethods, and type implementations ...)
  annotation/translation (flowgraphs/annotations via space-operations)

on the first two days.  At the very least we can hold a few question/answer 
screen sessions on these topics.  At the moment, there are very
few people who know most/all of the areas of PyPy and can take
initiatives to fix/improve stuff.  This needs to change and is
more important than getting a release out (although we need not
give up this idea, yet). 

please comment away,

    holger

[1] application level objects are the usual objects/structures you
    see in/from a python program. Under the hood, these objects
    have interpreter-level implementations. In interp-level source
    code you see lots of w_* names indicating that they reference
    a 'wrapped object' aka an application-level object.  These
    wrapped objects are manipulated by object space operations
    like objspace.getitem/getattr/type/add and are opaque to the 
    interpreter.  IOW, an objectspace is usually in complete 
    control of object layout, app-level representation and other
    details.  Only the interpreter frame/function/code/... objects can 
    control their app-level representation themselves by defining
    certain hooks like 'pypy_getattr' which the objectspace 
    dispatches to if it encounters a getattr operation on an internal 
    object. The nice effect is that objectspaces don't need to reimplement 
    e.g.  function/generator/code/module types with all the app-level
    representation again and again.
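
    The division of labour described in this footnote can be sketched as
    follows. This is a toy illustration only, written for modern Python 3:
    ToyObjSpace, W_Int and Frame are invented names, and the hook signature
    pypy_getattr(space, name) with a plain string name is an assumption made
    for brevity, not the actual signature used in pyframe.py.

```python
class W_Int:
    """Wrapped integer: its layout is owned by the object space."""

    def __init__(self, value):
        self.value = value


class Frame:
    """Stand-in for an interpreter-level object (hypothetical, not PyPy's frame)."""

    def __init__(self, lineno):
        self.lineno = lineno

    def pypy_getattr(self, space, name):
        # The interpreter object controls its own app-level representation.
        if name == "f_lineno":
            return space.wrap(self.lineno)
        raise AttributeError(name)


class ToyObjSpace:
    """Toy object space: the only place that looks inside wrapped objects."""

    def wrap(self, obj):
        if isinstance(obj, int):
            return W_Int(obj)
        raise TypeError("cannot wrap %r" % (obj,))

    def unwrap(self, w_obj):
        return w_obj.value

    def add(self, w_a, w_b):
        return W_Int(w_a.value + w_b.value)

    def getattr(self, w_obj, name):
        # Internal interpreter objects provide a hook; dispatch to it.
        hook = getattr(w_obj, "pypy_getattr", None)
        if hook is not None:
            return hook(self, name)
        # Otherwise the space answers for its own wrapped objects.
        if isinstance(w_obj, W_Int) and name == "real":
            return w_obj
        raise AttributeError(name)


space = ToyObjSpace()
w_sum = space.add(space.wrap(2), space.wrap(3))
print(space.unwrap(w_sum))    # prints 5
w_line = space.getattr(Frame(lineno=10), "f_lineno")
print(space.unwrap(w_line))   # prints 10
```

    Because Frame answers getattr through its hook, a second object space
    with a different W_Int layout could reuse Frame unchanged.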