[pypy-dev] Restricted language

Armin Rigo arigo at tunes.org
Tue Jan 21 01:30:31 CET 2003


Hello Holger,

On Mon, Jan 20, 2003 at 04:40:04AM +0100, holger krekel wrote:
> using Bengt's suggestion we could say that an "immediate"
> C-program-level value is represented by a C-compiler-level value.

It feels like you made exactly the confusion that I was trying to prevent
people from making.  A C-program-level value is *not* *just* a
C-compiler-level value.  I mean, your "immediate value" in your C program
might be in a variable 'x', but in the C compiler it is not in any
variable.  Instead, it is embedded within a complex structure, e.g.

struct program_variable {
  bool is_constant;
  union {
    struct {    // non-constant case
      int in_which_register;
      ...
    };
    struct {    // constant case
      long immediate_value;
      ...
    };
  };
};

> ImmediateObjects would be tagged CPython objects.  
> I don't see immediate use, though :-)

You cannot use real CPython objects to represent application-level objects.  
If you do, you are confusing the two levels.  It will lead you into tons of
problems; for example, you can only use immediates for the application-level
objects.  It is similar to a C compiler in which the above struct
program_variable would not exist; all program variables would be represented
as a long.  You can only represent immediates like this.

For example, if you want later to design your own "class MyList" implementing
lists, and use this instead of real list objects to represent
application-level lists, it looks fine; you change the implementation of
BUILD_LIST to create an instance of MyList, and work with that instead of a
real list.  But then, the class must work *exactly* like a list for this to
work.  The problem is that it cannot.  If you want to give another
implementation you cannot inherit from the built-in "list" type.  You are
stuck.  At best you will have to change your *entire* interpreter to contain
tests like "if isinstance(x, MyList)" to tell whether the object 'x' is an
immediate object or something that needs special care.
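
A minimal sketch of the problem, in present-day Python.  The class MyList
below is hypothetical and deliberately does not inherit from the built-in
list; the point is only that a few simple operations behave the same while
others immediately reveal that it is not a real list.

```python
class MyList:
    """Hypothetical alternative list implementation (not a subclass of list)."""

    def __init__(self, items):
        self._items = tuple(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


real = [1, 2, 3]
mine = MyList([1, 2, 3])

# Simple operations look identical...
same_basics = (len(mine) == len(real)) and (mine[0] == real[0])

# ...but the host language can still tell the two apart.
leaks = []
if not isinstance(mine, list):
    leaks.append("isinstance")
if type(mine) is not list:
    leaks.append("type")
try:
    real + mine          # list concatenation insists on a real list
except TypeError:
    leaks.append("concat")
```

Each entry in leaks is a place where an interpreter that stores MyList
instances directly would need a special case.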

Hence the class ImmediateObject is absolutely essential.  If you get confused,
think about implementing a Python interpreter not in Python but in another
similar language.  You will be forced to define classes like lists, tuples,
dicts and so on, and give direct mappings between what the interpreted
application wants to do on instances of these classes and what the interpreter
itself can do with the underlying implementing object.

The case of Python-in-Python gets confusing because these mappings are trivial
in the case of the above class ImmediateObject.  But you cannot remove them.
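
The two kinds of mapping could be sketched as follows.  This is an
illustration only: the method name length(), the class ConsList and the
function builtin_len() are assumptions made for the example, not names from
the design discussed here.

```python
class PyObject:
    """Interpreter-level representation of an application-level object."""

    def length(self):
        raise NotImplementedError


class ImmediateObject(PyObject):
    """Application-level object represented by the host object itself."""

    def __init__(self, ob):
        self.ob = ob

    def length(self):
        # Trivial mapping: ask the host object, then wrap the answer again.
        return ImmediateObject(len(self.ob))


class ConsList(PyObject):
    """Hypothetical custom list built from (item, rest) cells."""

    def __init__(self, w_items):
        self.cells = None
        for w_item in reversed(w_items):
            self.cells = (w_item, self.cells)

    def length(self):
        # Non-trivial mapping: the interpreter walks its own data structure.
        n, cell = 0, self.cells
        while cell is not None:
            n += 1
            cell = cell[1]
        return ImmediateObject(n)


def builtin_len(w_obj):
    """What the application's len() does: it never touches the host object."""
    return w_obj.length()


w_a = builtin_len(ImmediateObject([1, 2, 3]))
w_b = builtin_len(ConsList([ImmediateObject(1), ImmediateObject(2)]))
```

The mapping in ImmediateObject is a one-liner, but it is still there;
ConsList shows what the same slot looks like when the representation differs.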

You have a similar problem with exceptions: you must not confuse the
exceptions that the application wants to see and the exceptions that are used
internally in the interpreter because it is a nice programming technique to
use in your interpreter.  Here again, think about
Python-in-some-similar-language.  You will see that you need a way to say
which Python exception must be thrown in the application.  Then you need a way
for the interpreter to come back from a couple of nested calls to the main
loop, and so you create an "EPython" exception to raise for this purpose.  As
above it is easy to get confused because ImmediateObjects have a trivial
mapping between what the application wants to see and what the operation on
the underlying implementing object actually raises.

In other words, if you implement a list with a CPython list 'a', then when you
try to do the PyObject_GetItem operation, you will end up doing this:

  try:
    return a[index]
  except Exception, e:
    SetException(ImmediateObject(e))
    raise EPython

Note how 'e' is embedded inside an ImmediateObject().
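
For readers who want to run this, here is a self-contained sketch in
present-day Python syntax.  The module-level variable used by SetException()
and the helper name getitem_on_host_list() are assumptions made for the
example; only EPython, SetException and ImmediateObject come from the text
above.

```python
class EPython(Exception):
    """Interpreter-internal signal: an application-level exception is pending."""


class ImmediateObject:
    """Application-level object represented by the host object itself."""

    def __init__(self, ob):
        self.ob = ob


_pending_exception = None   # assumed storage for the application-level exception


def SetException(w_exc):
    global _pending_exception
    _pending_exception = w_exc


def getitem_on_host_list(a, index):
    try:
        return a[index]
    except Exception as e:
        # The host exception becomes an application-level object...
        SetException(ImmediateObject(e))
        # ...and a separate, internal exception carries control back.
        raise EPython


try:
    getitem_on_host_list([10, 20], 5)
    caught = False
except EPython:
    caught = True

w_exc = _pending_exception
```

After this runs, the interpreter has seen only EPython, while the IndexError
that the application should see sits wrapped in w_exc.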

> But i'd like to implement BINARY_SUBSCR like this: 
> 
>     def BINARY_SUBSCR(self):
>         w = self.valuestack.pop()
>         v = self.valuestack.pop()
>         self.valuestack.push(w.__getitem__(v))

Here we are using Python's __getitem__ protocol to implement Python's
__getitem__ protocol.  I see nothing wrong in that, but it is easy to get
confused.  To make things clearer I would say:

    self.valuestack.push(w.getitem(v))

where all non-abstract classes inheriting from PyObject should have a
getitem() method; for example, in class ImmediateObject:

    def getitem(self, index):
      try:
        return self.ob[index.ob]
      except Exception, e:
        SetException(ImmediateObject(e))
        raise EPython

This level of indirection is quite necessary.  Put another way, you cannot
store arbitrary CPython objects directly into the self.valuestack list: if you
did, you could only store CPython objects representing themselves, and you
would be stuck as soon as you wanted to represent things differently.  Think about
the type() function; it could not return the real type of the implementing 
object, because you couldn't implement lists or ints with a custom class.  And
it cannot call a new method __type__() of the object, because you cannot add
such a new method to all already-existing built-in objects.  You could hack
something that calls __type__() if it exists and returns the real type
otherwise, but you are running into trouble when interpreting programs that
define __type__() methods for their own purpose.  This is the kind of
confusion we are bound to run into if we are not careful.
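
A small sketch of both the hack and the alternative.  Everything here is
illustrative: hacked_type(), the application class Invoice, the method name
gettype() and the class MyList are assumptions made for the example.

```python
def hacked_type(obj):
    """The hack: call __type__() if it exists, else use the real host type."""
    hook = getattr(obj, "__type__", None)
    return hook() if callable(hook) else type(obj)


class Invoice:
    """Application class that uses __type__ for its own, unrelated purpose."""

    def __type__(self):
        return "credit-note"


class PyObject:
    def gettype(self):
        raise NotImplementedError


class ImmediateObject(PyObject):
    def __init__(self, ob):
        self.ob = ob

    def gettype(self):
        # Trivial mapping: the host type, wrapped as an application-level object.
        return ImmediateObject(type(self.ob))


class MyList(PyObject):
    """Hypothetical custom list that still presents itself as 'list'."""

    def __init__(self, w_items):
        self.w_items = tuple(w_items)

    def gettype(self):
        return ImmediateObject(list)


confused = hacked_type(Invoice())               # a string, not a type
w_t1 = ImmediateObject(Invoice()).gettype()     # wraps the class Invoice
w_t2 = MyList([]).gettype()                     # wraps the built-in list
```

Because gettype() lives on the wrapper and not on the wrapped object, it
cannot collide with any attribute the application chooses to define.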

With everything correctly set up, it is trivial to catch the EPython exception in the
main loop and, in its exception handler, unwind the block stack just like
CPython does in its main loop.  If another exception (not EPython) is raised
in the interpreter, it will not be caught by default; it is the normal
behavior of Python programs and means there is a bug (in this case, a bug in
the interpreter).
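
Such a main loop might look roughly like the sketch below.  The toy bytecode
(a list of (opname, argument) pairs), the block stack layout and the operand
order of BINARY_SUBSCR (index on top of the stack, as in CPython) are all
assumptions made for the example, as is rewrapping the result in getitem().

```python
class EPython(Exception):
    """Interpreter-internal signal: an application-level exception is pending."""


_pending = None   # assumed storage for the application-level exception


def SetException(w_exc):
    global _pending
    _pending = w_exc


class ImmediateObject:
    def __init__(self, ob):
        self.ob = ob

    def getitem(self, w_index):
        try:
            # Assumes the container holds unwrapped host values.
            return ImmediateObject(self.ob[w_index.ob])
        except Exception as e:
            SetException(ImmediateObject(e))
            raise EPython


def run(code):
    valuestack, blockstack, pc = [], [], 0
    while True:
        op, arg = code[pc]
        pc += 1
        try:
            if op == "LOAD_CONST":
                valuestack.append(ImmediateObject(arg))
            elif op == "SETUP_EXCEPT":
                # Remember where the handler is and how deep the stack was.
                blockstack.append((arg, len(valuestack)))
            elif op == "BINARY_SUBSCR":
                w_index = valuestack.pop()
                w_container = valuestack.pop()
                valuestack.append(w_container.getitem(w_index))
            elif op == "RETURN_VALUE":
                return valuestack.pop()
            else:
                # An interpreter bug: deliberately not an EPython.
                raise ValueError("unknown opcode %r" % (op,))
        except EPython:
            if not blockstack:
                raise   # no handler in this frame
            handler_pc, depth = blockstack.pop()
            del valuestack[depth:]        # unwind the value stack
            valuestack.append(_pending)   # hand the exception to the handler
            pc = handler_pc


program = [
    ("SETUP_EXCEPT", 5),       # 0: handler starts at index 5
    ("LOAD_CONST", [10, 20]),  # 1
    ("LOAD_CONST", 7),         # 2
    ("BINARY_SUBSCR", None),   # 3: fails at application level
    ("RETURN_VALUE", None),    # 4
    ("RETURN_VALUE", None),    # 5: handler returns the exception object
]
w_result = run(program)
```

Only EPython is caught by the loop; the ValueError raised for an unknown
opcode passes straight through, which is what signals a bug in the
interpreter itself.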


A bientôt,

Armin.


