GC in OO (was Re: Python 1.6 The balanced language)

Darren New dnew at san.rr.com
Tue Sep 5 13:45:24 EDT 2000


Alex Martelli wrote:
> That's the function of analysis and design: "defining the right `should`"
> for each program.  Preconditions, postconditions, invariants.

Right. But not the job of a language designer. That's my point.
 
> You can set things up (depending on the environment) so that a
> program survives 'crashes' (ideally, with a rollback of whatever
> atomic transactions are in progress, of course); whether you do

Then it isn't a crash. It's a caught exception. But then you knew that, or
you wouldn't have put "crash" in quotes there. :-)

> I reiterate my claim that it makes no significant difference
> whether the cause of the crash (whether it's caught or not)
> lies in violating a language spec (that isn't enforced before
> runtime), or in a precondition which the language did not
> specify (e.g. because it's inherent in a certain FP unit, but
> not all FP hardware can enforce it, and the language's specs
> chose to let the language be run decently on a variety of
> hardware).

I already answered this, for the OO case. You chose to ignore it. 

> > Well, I'd say that one of the basic tenets of OO programming is that the
> > only operations that change the private data of a class are the methods of
> > that class, yes?
> 
> No, that's only one style of OO programming.  In some OO styles, data
> never changes (O'Haskell, for example: it's both OO and pure functional,
> so all data are immutable -- let's *please* forget monads for the moment,
> as they're quite something else again).

Which meets my criteria, then. The point was that a programming error should
not corrupt unrelated data; doing so breaks the data encapsulation, the
association of data with objects.

> In others, encapsulation is not enforced.

But you still need (supposedly) a reference to the object to access it. It
can be viewed as the object exporting setter and getter methods by default.
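
To make that concrete in Python (the Account class and its attribute are
just mine for illustration): encapsulation isn't enforced, but you still
can't touch the state without a reference to the object, which is morally
the same as the object exporting default getters and setters.

  class Account:
      def __init__(self, balance):
          self.balance = balance       # "private" only by convention

  acct = Account(100)
  print(acct.balance)                  # default "getter": plain attribute read
  acct.balance = 250                   # default "setter": anyone holding the
                                       # reference can write the state

  # Without a reference to `acct`, there is no way to reach that state at
  # all, which is the encapsulation-by-reference point above.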

> In particular, it's very common for object-persistence
> frameworks (and similar doodads) to be allowed special licence to
> persist _and depersist_ data of any object whatsoever 

Which, of course, breaks all talk about OO to start with, especially if
you're going to start invoking DbC and such.

> Anyway, it's a popular style in OO to have this encapsulation, whether
> enforced or by convention: an object 'owns' some data (state) and
> that data is only accessed by going through the object.

Yes. And I expect that Alan Kay saw that as one of the fundamental
properties of OO back when he said that in the early '70s.

> > Or at least that you need to operate on a reference of an
> > instance in order to change the value of the instance, which definition
> > allows for things like Python and Java and such.
> 
> You need to obtain an accessor from the instance-reference (if
> encapsulation is to hold)

Right. And my point is that if you allow one routine to free memory that
another routine still holds a reference to, you've lost the accessor behind
that instance-reference and possibly replaced it with something entirely
different, and potentially illegal according to the language specs.
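
For contrast, here's a little Python sketch (the names are mine) of what GC
buys you here: dropping one binding can't yank the object out from under a
routine that still holds a reference to it.

  class Bitmap:
      def __init__(self, pixels):
          self.pixels = pixels

  def make_refresher(bmp):
      # The closure keeps its own reference to the bitmap.
      def refresh():
          return len(bmp.pixels)
      return refresh

  image = Bitmap([0] * 16)
  refresh = make_refresher(image)

  del image         # only this binding goes away; the object survives
  print(refresh())  # still 16; nobody "freed" the refresher's reference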

> > Now, if sending a "divide" message to the integer "5" causes it to corrupt
> > the "Oracle_system_table_interface" object, I'd say you've got a bit of a
> > data hiding problem.
> 
> You have a problem, maybe, but it need not be one of data-hiding.  Of
> course, if anything is "corrupted", then, by definition of "corruption",
> it's a problem.  But consider the unreliable-timing of (asynchronous)
> floating-point exceptions: if one gets triggered, after you have executed
> a few more (you don't know how many more...) operations, some
> of those operations (who have now left the system in an invalid
> overall state) may have been in methods of whatever other instance.
> This has little to do with data hiding...

Yes? And?

> > Yes. That semantics, however, is allowed to be throwing an exception, or
> > "returns a random number", or "exits the program", just as rand() and
> > exit() are allowed to do. What it's not allowed to do is "changes private
> > memory of other instances" or "modifies control flow in ways disallowed
> > by the language specification."
> 
> Then, by your (absurd, I surmise) definition, no "object oriented
> language" can ever run effectively on an architecture whose floating
> point exceptions are asynchronous and non-deterministic -- unless
> it forces execution to terminate on any such exception (how could
> allowable control flow be specified under such circumstances?).

Nope. It might not be as efficient as you'd like, but you can do it. You just
have to have the language runtime handle the interrupt and do the right
thing. It's no more difficult than any other asynchronous interrupt in
something like C. When I program in C, I don't sit there worrying about
hardware interrupts making me jump out of a for loop in the middle, because
the OS and such takes care of putting me back where I started. If your
floating-point chip can generate interrupts at various times, you need to
build the libraries to handle that without scribbling over *whatever*
floating-point number my code *happens* to be using at the time.

What would you suggest instead?  That if I have a piece of code like

  a := b * c
  d := e * f
  i := 8
  print i

that's allowed to print 27?  That doesn't seem like a very useful language.

> What about other exceptions yet -- must a language fully specify
> behaviour on floating-point underflow, overflow, etc etc, to be
> "object oriented" by your definition? 

Nope. It just has to stick within the language definition. Namely, if
there's an overflow, underflow, etc., it has to affect only the operands that
overflowed. If it's allowed to affect other random components of the system,
then it's not very OO.

> What if somebody (or some accident) functionally removes a
> piece of writable disk on which your program is currently paging --
> must your very hypothetical language, to gain the coveted title
> of "object-oriented", fully specify THAT, too, and NOT allow
> any recovery...?  There is really no difference between such
> occurrences and other kinds of asynchronous exceptions...

Now you're just being silly. Does "x += 1" really increment x if the CPU is
on fire?

> *REAL* languages, fully including any object-oriented ones, are
> fortunately designed (most of the time) with some input from
> people who know a little about such issues.  As a consequence,
> they _explicitly_ allow *UNDEFINED BEHAVIOUR* under these
> and similar kinds of circumstances: specific implementations of
> the language are allowed to do *WHATEVER IS MOST USEFUL*
> in the context of their environment when such-and-such things
> happen.  This lets effective, speedy implementations exist _as
> much as feasible given hardware, OS, etc_.  Even Java, who
> opts for full specification most of the time (cutting itself off
> from hardware that won't satisfy that spec), has a bit more
> flexibility than you allow -- as it is, after all, a real-world language.

As *I* allow?  You're setting up all kinds of strawmen yourself. :-)

I'm only talking about operations that happen with functioning hardware
inside the allowable semantics of the language. You're the one asking
whether a program running on a flaming CPU is capable of being OO.

> No.  It's about the *CONTRACT*: the SEMANTIC SPECIFICATION of
> the receiving class.  Is it ALLOWED to make a copy of the object
> I'm passing to it?  Is it REQUIRED to? 

We'll have to agree to disagree. If the purpose of the routine is "keep this
bitmap refreshed on the screen", then whether the routine is allowed to make
a copy of the bitmap would be irrelevant, I would think. (Whether it keeps a
copy and uses that in future refreshes would perhaps be relevant.) In a
language with Observers and weak pointers, the semantic specification of
whether we're allowed to hold a pointer to the bitmap, or to an observer who
holds a pointer to the bitmap, or whatever, becomes irrelevant to the
implementation. With weak pointers, one could even have a contract that says
"I'll keep the bitmap refreshed until you drop your last reference to it."

> Get it?  _Specification_ constrains _implementation possibilities_.

And vice versa, which is the part you're ignoring. It's rather pointless to
specify something that the implementation possibilities disallow, is it not?

> And, specification constrains *how the so-specified class can be
> used*.  *NOT* the other way around!  The implementation does
> NOT constrain anything.

Yes, it does, in that wonderful real world you're talking about. Take, for
example, your FPU. The implementation of the hardware constrains the
specification of languages that can run on it efficiently. Isn't that what
all your examples are about?

Ideally, the implementation would simply match the nice specification. In
reality, some specifications are too strict to be implemented, so the
implementation changes the specification.

Indeed, cryptography is to a great extent all about specifying things in a
way that makes it impossible to do other things based on implementation
constraints.

> Specification always constrains both uses and implementation.

And vice versa. You try not to specify things you can't implement. You try
not to use things that aren't specified. I've never successfully used *any*
program that's not implemented.

> changes that aren't contractually allowed.  And I strongly doubt
> Mr Kay could have failed to see this obvious point, which has
> nothing to do with 'leaking information'.

Then perhaps you'd care to offer an alternative reason why Mr Kay thought
that GC was one of only two fundamental properties of OO? Seriously, if you
answer any of this, answer this one. I only have the quote. He didn't
include his reasoning, and it took me a few weeks before I figured out *why*
I thought he thought GC was fundamental. 

> Many things in life are very nice, but having or lacking them is an
> orthogonal issue to "being object oriented". 

I would suspect that the variety of things that "being object oriented" can
mean has evolved considerably in the last 30 years. Everyone has their own
definition. If it bugs you that someone famous for OO language design
thought that GC was a fundamental part of OO-ness, then... like... use C++
or something. :-) I personally would prefer to say that C++ is, say, loosely
OO, whereas Smalltalk is, say, strictly OO.  By which I mean that C++ has OO
concepts, but does not enforce them, while Smalltalk has OO concepts and
does enforce them. 

> > If you free the reference, and later attempts to use it cause a
> > "ReferenceFreedException" (for example), then I think I'd agree. But that
> > isn't how C or C++ (for example) works.
> 
> It causes *undefined behaviour*, because of performance
> considerations: see above.

Right. That makes it unsafe. Nothing wrong with that in the right
situations. It lets you violate most or all precepts of OO, just as it lets
you violate most or all precepts of structured programming. 

> Does Smalltalk specify *synchronous* exceptions for every possible
> occurrence?

As far as I know, yes. I haven't looked at Smalltalk in a while, either. If
it allows asynchronous events, I'd expect it would do it in a safe and OO
way.

For example, upon catching a floating-point overflow, the interrupt handler
could send a FloatingPointOverflow message to the value that is the result of
the overflow. That lets anyone using the value in a later step know it
overflowed. Of course, if you use the value *before* the overflow interrupt
is handled, you're pretty screwed anyway, I'd think.
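
A hypothetical sketch in Python (the names are mine, not Smalltalk's) of
what I mean: the handler poisons the result value, and the overflow surfaces
at the point where somebody later tries to use that value.

  class OverflowedValue:
      """Stand-in for the value the FP interrupt marked as overflowed."""
      def __init__(self, operation):
          self.operation = operation

      def _fail(self, *args):
          raise OverflowError("result of %s overflowed" % self.operation)

      # Any arithmetic use of the poisoned value surfaces the error.
      __add__ = __radd__ = __mul__ = __rmul__ = __float__ = _fail

  def on_fp_overflow(operation):
      # What the (hypothetical) interrupt handler hands back in place of
      # whatever garbage bits the hardware produced.
      return OverflowedValue(operation)

  x = on_fp_overflow("b * c")
  try:
      y = x + 1.0          # the *later* use is where the overflow shows up
  except OverflowError as e:
      print(e)             # result of b * c overflowed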

> > But the *contract* is based on the *implementation*. I believe that you'll
> 
> ***NOOOOO***!!!!  This is utterly absurd.  Do you REALLY program
> that way -- you choose an implementation, then you build your specs
> around it?!  *SHUDDER*.

No. But when I find that my spec is impossible to implement or too
inefficient to be useful, I change the specification and make it a bit less
aggressive. Just like you do. Note that I didn't say that "all contracts are
derived from implementations". I said "the particular aspect of the contract
for this particular hypothetical routine under consideration is a result of
a particular implementation choice." Trying to generalize that to a universal
quantifier and then saying "See? Here's a counter-example" doesn't disprove
the existence proof.

> > find *some* cases where the only reason that information is exposed in the
> > contract is because of your implementation choice.
> 
> A *WELL-DESIGNED* specification will carefully leave things
> *EXPLICITLY UNDEFINED* where this is the best compromise
> between freedom for the client-code and for the implementer.

So "explicitly undefined" is not part of the contract? If it *is* part of
the contract, then you're exposing information because of the implementation
choices/limitations. 
 
> The "immutable" tidbit here is the key: you've rigged the dice by
> specifying that there is exactly one mutation possible, 'freeing'.

I've simplified. In any case, if the bitmap is mutable, then it's allowed to
control whether (say) the refresh routine sees the changes, etc. I.e., all
the OO stuff is still there. If you've discarded the bitmap, then there's no
OO, no polymorphic dispatch on the reference to that object, etc.
 
> I know of no language that works that way; functional languages,
> with immutable data, invariably have no concept of 'freeing'.

Uh... "const" anyone?  Or maybe strings in numerous languages? Tuples in
Python, just to get things back on topic? Or bitmaps that are explicitly
coded with only a getter and a creation routine?
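
To keep the Python example concrete: tuples and strings simply export no
mutating operations, "freeing" or otherwise.

  t = (1, 2, 3)
  try:
      t[0] = 99            # tuples have no mutators at all
  except TypeError as e:
      print(e)             # 'tuple' object does not support item assignment

  s = "immutable"
  try:
      s[0] = "I"           # same story for strings
  except TypeError as e:
      print(e)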

> So, say that data ARE mutable, as in almost all of today's languages.

Of course *some* data are mutable. 

Look, I am giving an example of where the ability to free memory explicitly
means you have to expose an implementation detail at the specification level,
where with GC you wouldn't need to do so. You can't go changing the example
to prove that in other situations you wouldn't need to, and then draw
conclusions from that. That just isn't how logic works. I'm saying "there
exists X" and you're arguing "No it doesn't, because not all X."

> If it's compatible with OO to have to specify this when data are
> mutable, it surely doesn't suddenly become incompatible with OO
> when you decide the only mutation is 'freeing'; just as it would
> not if the only mutation was 'rotation', say.  It's exactly the same
> kind of issue, after all; whether all mutations are allowed, or
> just some specific subset, is quite clearly secondary.

I disagree. Because if I can mutate the bitmap, the code will still
correctly do what I programmed it to do, even if that puts undesired results
on the screen. If I free the bitmap, the code will *not* continue to
correctly do what I told it to do, as there isn't anything I could tell it
to do that would be correct.
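
A small Python sketch of the difference (the explicit free() is hypothetical,
standing in for C-style deallocation): after a mutation the refresh routine
still does exactly its job; after a "free" there is no correct job left for
it to do.

  class Bitmap:
      def __init__(self, pixels):
          self.pixels = pixels

      def free(self):
          # Hypothetical explicit 'free': afterwards this is no longer a
          # bitmap in any meaningful sense.
          del self.pixels

  def refresh(bmp):
      return "drew %d pixels" % len(bmp.pixels)

  image = Bitmap([0] * 16)

  image.pixels[0] = 255    # mutation: maybe the "wrong" picture, but...
  print(refresh(image))    # ...refresh() still does exactly what it was told

  image.free()             # "freeing": nothing correct left for refresh() to do
  try:
      print(refresh(image))
  except AttributeError as e:
      print("after free: %s" % e)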

> > form the same basis for extensions and inheritance and such that classes
> > do.  Not in the same way, of course.
> 
> _Definitely_ "not in the same way"...

Right, not in the same way. OK, to clarify: by "some sort of class concept"
I meant a mechanism whereby code can be associated with more than one
instance and get to the proper instance data automatically. Fair? Sheesh.

I.e., "OO" code where you can only have one instance that uses each method
is something I'd have to think about (and use) before I could decide whether
I'd call it OO; it clearly could have polymorphic dispatch.  Of course,
since *you* are already the definitive source of knowledge on what is and is
not OO, I'd just have to ask you I guess. ;-)

-- 
Darren New / Senior MTS & Free Radical / Invisible Worlds Inc.
San Diego, CA, USA (PST).  Cryptokeys on demand.
"No wonder it tastes funny. 
            I forgot to put the mint sauce on the tentacles."


