CRAZY: First class namespaces

Amit Patel amitp at Xenon.Stanford.EDU
Sat May 27 20:14:52 EDT 2000


I have to warn you.  This is a bit wacky, and I have no illusion that
any of this will go into Python.  It's just some stuff that Python has
inspired me to think about, mainly because Python exposes namespaces
to the programmer, and programmers do cool things with it.  How far
can that go?  (You're supposed to run away yelling when you hear
someone say "How far can something go?")  This idea touches on
stackless (because stack frames become objects, which are on the heap)
and nested scopes, too, if that'll help you run screaming.  But if
you're ready to read some crazy stuff, here it is:


==

Python and many other OO languages have a nice way of looking up field
names in objects -- they look in the object, then in its class, then in
the superclass, and so on.  It's a chained lookup system.  Python also
uses a chained lookup for variable names.  It first looks in locals()
and then looks in globals().  If we have nested scopes in P3k, it will
look in the local namespace, then in the parent namespace, then in its
parent, and so on until we reach the global namespace.  Jim Fulton's
ExtensionClass implements a chained lookup system that can go from one
object to its "container" to find values.
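
For concreteness, here's roughly what those two existing chains look
like today (nothing new yet, just current behavior):

  # Attribute lookup already chains: instance, then class, then base class.
  class Base:
      greeting = "hello"

  class Derived(Base):
      pass

  obj = Derived()
  assert obj.greeting == "hello"   # not on obj or Derived; found on Base

  # Name lookup already chains too: locals first, then the module's globals.
  x = 5
  def show():
      return x                     # no local x, so the global x is used

  assert show() == 5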

I'd like to see all of the chained namespace implementations unified.
Objects, classes, modules, activation records ("stack frames") -- all
are namespaces underneath.

In this proposal, the __parent__ name is treated specially in an
object.  For looking up a name, you first check the namespace and then
if you don't have the name, you ask your parent.  If you have no
__parent__, you give a NameError.  For assigning to a name, you always
assign the name in the current namespace, and you never look at the
parent.

Note that this is just like what we do now, except we call the special
field "__class__" if it's in an object and "__bases__" if it's a
class.  In this system, "ordinary" objects would simply have their
class as the __parent__.  An ordinary class would have its base class
as its __parent__, as long as there's no multiple inheritance [1].
Environmental acquisition (e.g., ExtensionClass) would have its
container as its __parent__.  At this point, an ordinary class or an
acquisition class is no different than a "normal" object, so we may
not need classes.  (Already in Python, classes and objects aren't so
different.)
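
Using the illustrative Namespace sketch above, an instance, its class,
and its base class are just three namespaces chained through __parent__:

  Animal = Namespace(legs=4)                    # plays the role of a class
  Dog = Namespace(parent=Animal, sound='woof')  # "subclass": parent is Animal
  rex = Namespace(parent=Dog, name='Rex')       # "instance": parent is its class

  assert rex.lookup('sound') == 'woof'          # found on the class
  assert rex.lookup('legs') == 4                # found on the base class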


The lookup semantics (chaining for getattr and not chaining for
setattr) almost match both how fields in objects work and how local
variables work [2].  So we could make namespaces into objects too.  In
the following example:

  # module spam.py

  x = 5

  def foo(y):
     print x
     x = 8
     print x

     def cheese(z):
        print x, y, z
     return cheese

  bar = foo(10)
  bar(35)

The global (module) namespace is an object.  (In fact, it's the module
object.)  It's the object <x = 5, foo = function....>.  The function
foo needs to remember its scope, so its .co_namespace points to the
spam module object.

When we *call* foo(10), we need to create a new namespace for the local
variables.  This is an object <y=10, __parent__ = spam-module>.
When we print x, we look for x in this object, and it isn't there.  So
we look for x in the __parent__, which is the spam module object.
It's there, so we return 5 and print it.  The next thing we do is
assign x to 8.  Since there's no chaining on assignment, we create an
entry x=8 in the namespace object, which becomes <y=10, x=8, __parent__ =
spam-module>.  Now we try to print x again.  We look in the
namespace object and find x is 8, so that's what we print.  Next we
define a function cheese, which gets its .co_namespace set to the
current namespace object.  Then we return that function object.

The module's global namespace now gets bar=function... added to it.

And we call bar(35).  What happens here?  When we call a function, we
have to create a new namespace object where the parent is set to the
co_namespace.  So that's <z=35, __parent__ = <y=10, x=8, __parent__ =
spam-module>>.  So when we try to print x, y, z, we find 8, 10, 35.
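
Traced with the illustrative Namespace sketch from earlier (the names
spam, frame_foo, and frame_cheese are made up for the walkthrough):

  # The module namespace: <x=5, foo=function..., no __parent__>.
  spam = Namespace(x=5)

  # Calling foo(10): a fresh namespace whose __parent__ is the module.
  frame_foo = Namespace(parent=spam, y=10)
  assert frame_foo.lookup('x') == 5        # first print: found in the module

  frame_foo.assign('x', 8)                 # x = 8 binds locally, no chaining
  assert frame_foo.lookup('x') == 8        # second print: found locally

  # Calling bar(35), i.e. cheese: __parent__ is foo's namespace.
  frame_cheese = Namespace(parent=frame_foo, z=35)
  assert (frame_cheese.lookup('x'),
          frame_cheese.lookup('y'),
          frame_cheese.lookup('z')) == (8, 10, 35)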


Does that all make sense?

Summary: rules for variable lookup look and smell a lot like rules for
object lookup.  If we combine the two, we can also put in classes,
acquisition classes, and modules.  It's "everything's an object"
pushed a bit farther, into the area of local variables.  I would keep
the syntax that distinguishes classes from objects, but it'd be just
syntax -- nothing deeply different underneath.

I have a few more notes:


* Methods on objects:

  We have to resolve the issue of "self" being a magic argument.  If
  objects and classes are the same, then def foo(self, x) in a class and
  self.foo = lambda x: .. on an object aren't consistent about whether
  'self' gets listed.

  We might want to say "method" introduces an unbound method and
  "def" introduces a bound method (i.e., a function that isn't going
  to take 'self').

* A magic "self" variable:

  We could have it so that when you invoke a method, it sets not only
  __parent__ to the co_namespace, but also sets self to the object.
  That way, you'd define foo() without listing "self", but you'd still
  have a "self" object and you'd access fields with "self.f".

     method foo(x):
         print self.f + x

  I'm not convinced this is the right thing to do.

* Inline objects:

  It'd be nice to have a lightweight way of declaring objects, like
  the <name=value, name=value, ...> syntax I used in the examples
  above.  If objects are everything, you want to be able to make them
  all the time.  It's possible to use {} even though {} is used for
  dictionaries.  If it's {key:value, ..} then it's a dictionary and if
  it's {name=value, ..} then it's an object.  But that may be too
  confusing.
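
  For comparison, here's roughly what a tiny helper can do today; the
  name Record is made up and nothing here is new syntax:

  class Record:
      """Hypothetical stand-in for an <name=value, ...> object literal."""
      def __init__(self, **fields):
          self.__dict__.update(fields)

  point = Record(x=3, y=4)
  assert point.x == 3 and point.y == 4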

* The "global" keyword:

  We still want some way of getting to the next level of scope.  Since
  objects and namespaces are combined, and the object has a special
  __parent__, the __parent__ is exposed as a local variable.  So we
  could do:

  x = 5
  def foo():
     __parent__.x = 10

  to assign to the global x.  If it's nested more:

  x = 5
  def foo():
     def cheese():
        __parent__.__parent__.x = 5

  Not ideal, but then, I don't deal with globals too much anyway.
  Maybe 'global' can be a magic variable (not keyword) that gets set
  to the last non-empty __parent__.  Then you'd end up doing:

  x = 5
  def foo():
     def cheese():
        global.x = 5
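
  Modelled with the earlier Namespace sketch, assigning through
  __parent__ would look like this:

  module = Namespace(x=5)
  frame = Namespace(parent=module)     # pretend this is foo's frame

  # "__parent__.x = 10" under the sketch: assign in the parent, not locally.
  frame.parent.assign('x', 10)
  assert module.lookup('x') == 10
  assert 'x' not in frame.bindings     # nothing got bound locally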

* Cycles and refcounting

  If we do nested scopes, we already get the cycles problem.  I don't
  think making namespaces into objects makes anything worse.

* Modules:

  Modules already exhibit this unification of namespaces and objects.
  From inside the module, you can refer to variables.  From outside
  the module, you refer to fields.  Unifying namespaces and objects
  wouldn't really do anything to modules.  Instead, it'd make
  all namespaces behave like a module in that they're objects.

* Methods on modules:

  I think you could get methods (including things like __repr__) on
  modules free here.  I haven't thought about it enough, but my
  intuition tells me that since classes have gone away, and you can put
  methods on objects, you should be able to put methods on modules too.
  We just have to resolve the bound/unbound method issue.

* The "with" statement from Pascal:

  I suspect you can do "with" pretty easily but I'm not sure about the
  details.  It may require multiple inheritance (aiee) if you want to
  be able to see local variables at the same time you see an object's
  fields.
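
  Here's a very rough approximation with the earlier Namespace sketch.
  It just copies the object's fields into a throwaway namespace, which
  sidesteps the harder question of also chaining through the object's
  own __parent__ -- that's where the multiple-inheritance worry comes in:

  frame = Namespace(x=1)                   # pretend these are the locals
  obj = Namespace(f=42)                    # the object named in the "with"

  with_frame = Namespace(parent=frame, **obj.bindings)
  assert with_frame.lookup('f') == 42      # the object's field, as a variable
  assert with_frame.lookup('x') == 1       # the locals are still visible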

* Assignment to __parent__:

  So far I've been assuming that __parent__ is something set by Python
  for internal use.  For objects, it's set when you call a class
  constructor.  For classes, it's set when you declare the class using
  the "class" construct.  For namespaces, it's set when you call a
  function.

  But the cool stuff really starts when the user can assign to
  __parent__!

  For example, __parent__ = self would make it so that you can assign
  to fields in your object without using "self.".  (You'd also lose
  your local *and* global variables, which would discourage anyone
  from doing this.)  The environmental acquisition aspect of
  ExtensionClass is pretty easy now -- you just create a raw object
  and set its __parent__ to its container.  You could "reparent" an
  object on the fly to change its class.  Or you could use
  prototype-based programming techniques, as in Self.
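
  With the earlier Namespace sketch, acquisition-style reparenting is a
  single assignment (illustrative only):

  container = Namespace(timeout=30)        # some "container" object
  thing = Namespace(name='handler')

  thing.parent = container                 # reparent on the fly
  assert thing.lookup('timeout') == 30     # acquired from the container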

* Efficiency:

  It'll probably be harder to optimize name lookup with something like
  this (especially if you can change __parent__).  However ..  with
  only one underlying implementation for namespaces, objects, modules,
  and classes, any optimization work you perform here will benefit all
  of those language features.

* Related language features:

  Simula's objects were originally activation records / stack frames.
  So making the two similar or even the same is a really old idea.
  Smalltalk makes everything an object, so my guess is that activation
  records too were objects.  Smalltalk blocks are objects too.
  JavaScript tries to do something that unifies the two, but gets many
  aspects totally wrong, by mixing up static scopes and dynamic
  activation records.  (For example, functions get an object with
  local variables when they're *declared*, even though they haven't
  been *called* yet!)  Pascal and JavaScript have a "with" statement
  that lets you view an object's fields as variables.



       -crazily-yr's - Amit


FOOTNOTES

[1]  One snag is multiple inheritance.  I don't think we really
     need multiple inheritance once we have nested scopes, but MI is
     a really controversial issue.  Here's an example of MI:

     class TCPServer:
        ...
     class ForkingMixIn:
        ...
     class ForkingTCPServer(ForkingMixIn, TCPServer):
        pass

     What I'd do instead is write ForkingMixIn this way:

     def ForkingMixIn(AnyServer):
        class ForkingMixIn(AnyServer):
           ...
        return ForkingMixIn

     [Note: I wouldn't want to use this syntax, but this illustrates
     the underlying construct.]

     Now I can write ForkingTCPServer this way:

     ForkingTCPServer = ForkingMixIn(TCPServer)

     The advantage of this over MI (in Python) is that you can call
     the methods from the base class.  Right now, ForkingMixIn can't
     easily define method foo() to call TCPServer's foo() and then do
     one more thing.  With the above setup, ForkingMixIn has a handle
     to the base class, and can call AnyServer.foo(self).
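
     Here's a rough sketch of that advantage; the serve method and its
     return values are made up:

     class TCPServer:
        def serve(self):
           return "serving"

     def ForkingMixIn(AnyServer):
        class Forking(AnyServer):
           def serve(self):
              # ...fork here..., then defer to the base class we were handed
              return "forked " + AnyServer.serve(self)
        return Forking

     ForkingTCPServer = ForkingMixIn(TCPServer)
     assert ForkingTCPServer().serve() == "forked serving"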

     Alternatively, a namespace could have a __parents__ tuple, and it
     could search each one of them.  However, this would be
     unnecessary for variable scoping (unless your mind is really
     twisted!).  It'd be useful for the "with" statement, but it seems
     confusing.

[2]  One thing I find confusing in Python is that a local name 
     definition applies even before it's been executed:

     x = 5
     def foo():
        print x
        x = 3
        print x

     IMO, this should print 5 and then 3.  The unified namespace 
     proposal would do this.  However, Python currently gives a 
     NameError.
-- 
Amit J Patel, Computer Science Department, Stanford University
http://www-cs-students.stanford.edu/~amitp/


