GC in OO (was Re: Python 1.6 The balanced language)

Darren New dnew at san.rr.com
Mon Sep 4 13:38:26 EDT 2000


Alex Martelli wrote:
> > Depends on what you mean by "safe". If by "safe" you mean "according to
> > specification of the problem", then yes. If by "safe" you mean what
> > language designers mean, which is "defined behavior according to the
> > language specification", then no.
> 
> "safe" as in "will not make the program do things it shouldn't". 

That's begging the question. Now you have to define "should". My payroll
program "shouldn't" pay out a million bucks to the janitor, but it certainly
will if that's what I typed in the source code.

> Whether
> a program crashes because the language specification is violated, or
> because a semantic constraint (precondition) is, it's still a crash 

Err, no. Throwing an exception is not a crash, any more than running
"shutdown" crashes a UNIX computer. On the other hand...

> (and
> not quite as bad as a program NOT crashing but corrupting persistent
> data in subtly-wrong ways...).

... that would be a "crash".

> > In other words, if you change an object to which I'm holding a reference,
> > I'll still do what you told me to. If you GC an object to which I'm still
> > holding a reference, you can't define the semantics of that operation.
> 
> But neither can I change the semantics of what happens in other
> cases, e.g. a divide by zero; whether that raises a catchable exception
> or not depends on the language (and possibly on the implementation,
> if the language leaves it as implementation-defined).

Well, I'd say that one of the basic tenets of OO programming is that the
only operations that change the private data of a class are the methods of
that class, yes? Or at least that you need to operate on a reference to an
instance in order to change the value of that instance, a definition which
allows for things like Python and Java and such.

Now, if sending a "divide" message to the integer "5" causes it to corrupt
the "Oracle_system_table_interface" object, I'd say you've got a bit of a
data hiding problem.

> Will you therefore argue that a language, to be "object-oriented", must
> give certain fixed semantics to divide-by-zero...?

Yes. That semantics, however, is allowed to be "throws an exception", or
"returns a random number", or "exits the program", just as rand() and exit()
are allowed to do. What it's not allowed to do is "changes private memory of
other instances" or "modifies control flow in ways disallowed by the
language specification."
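
For instance, in Python (a quick sketch in present-day syntax; the
Oracle_system_table_interface here is just a made-up stand-in), dividing by
zero has exactly that kind of defined semantics: you get a catchable
exception, and nobody else's private data gets touched:

class Oracle_system_table_interface:           # made-up stand-in object
    def __init__(self):
        self.__rows = ["important", "data"]    # private-by-convention state
    def rows(self):
        return list(self.__rows)

oracle = Oracle_system_table_interface()

try:
    5 / 0                                      # sending "divide" to 5 with argument 0
except ZeroDivisionError as e:
    print("well-defined failure:", e)

assert oracle.rows() == ["important", "data"]  # the other object is untouched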
 
> Note that this ties back to the issue about whether I can mutate a
> reference after giving it out -- i.e. whether the receiving class will
> hold to that reference (and if so whether it will rely on it to stay
> as it was) or make its own copies.  That's part of the receiving
> class's semantics; I need to know the contract obligations of the
> receiving class to perform my part of the bargain.  It's NOT an issue
> of having to know the _implementation_ of the receiving class,

Well, yes, kind of. It is.  If I write a class that you pass a bitmap and I
draw it on the screen (say), I can no longer subclass that class to cache
that bitmap. I think Kay's point was that without some sort of automatic
memory control, you have to keep track of who is using which object, which
means you're leaking information about the instance variables. *You* call it
fundamental semantics, but it isn't always fundamental semantics. Sometimes
it's just an implementation detail.

> but one of design-by-contract... and please don't tell me, or B.M.,
> that dbc is not OO:-).

Given that Eiffel is safe and has GC, this is just a strawman. Unless you'd
like to show me how to declare "argument A does not get garbage collected
before you call procedure B" as a postcondition. Or would that be a
precondition? Probably a class invariant. But you still can't do it, as
Eiffel doesn't have any sort of temporal mechanisms in the assertions.
("old" doesn't count for a number of reasons.)

Look at it this way: What does Eiffel do when you run off the end of an
array? It raises an exception. Perfectly defined, no memory corrupted. The
fact that some compilers let you bypass this check for performance reasons
is irrelevant for a number of reasons.

> 
> Consider:
> 
> class datastuff:
>     def __init__(self,denominator, formatstring="%s"):
>         self.denominator=denominator
>         self.formatstring=formatstring
> 
> class divider:
>     def __init__(self,datastuff):
>         self.__denom=getattr(datastuff,'denominator',0)
>         if not self.__denom:
>             raise PreconditionViolation
>         self.format=validateformat(getattr(
>             datastuff,'formatstring',"%s"))
>         self.theds=datastuff
>     def oper1(self,numer):
>         return self.format%(numer/self.__denom)
>     def oper2(self,numer):
>         return self.format%(numer/self.theds.denominator)
> 
> Can I rely on divider.oper1 not dividing-by-zero?  Yes,
> presumably (assuming I haven't violated contracts e.g.
> by messing with the private __denom field): __init__
> tests to ensure the __denom it saves is not zero, and
> that's what oper1 divides by.  But can I rely on oper2?
> No, that's subject to what I may have been doing to
> the .denominator field of the datastuff -- because the
> divider class is keeping a reference to that specific
> instance and using it without re-testing.

This is perfectly safe code. Exceptions don't make code "unsafe".

And anyway, you *can* rely on not dividing by zero. An exception will be
raised if you try it, so no, you can't divide by zero. Just like you can't
run off the end of a list in Python, unlike how you can run off the end of
an array in C. (Although later standards made it illegal, nobody really treats
it that way. *In theory* C is now a pretty safe language. In practice,
nobody actually implements any of the checks, except maybe in Purify or
something like that.)
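
A tiny sketch of the difference (present-day Python syntax): the bad index is
a defined, catchable error, and the program carries on with nothing corrupted:

xs = [10, 20, 30]

try:
    xs[99]                          # "running off the end" of the list
except IndexError as e:
    print("defined behaviour:", e)  # execution continues; nothing was overwritten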

> So, I need to know what divider's contract says about
> keeping a reference to the datastuff instance passed to
> __init__, or not.  If divider is free to keep such a
> reference, and to assume it stays valid by its semantic
> criteria (non-zero denominator), then the client code is
> not free to muck with that field; and vice versa.  The
> situation is not in the least different if the "mucking" is
> the 'freeing' of the instance, or if it's changing the
> nature of the object or its fields (a del on the
> denominator field could cause an AttributeError, etc).

If you free the reference, and later attempts to use it cause a
"ReferenceFreedException" (for example), then I think I'd agree. But that
isn't how C or C++ (for example) works.
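
For what it's worth, later Pythons grew something close to that idea in the
weakref module (so this sketch is anachronistic for 1.6, and the Bitmap class
is made up): touching a weak proxy whose referent has been collected raises
ReferenceError instead of handing you dangling memory:

import weakref

class Bitmap:
    pass

bm = Bitmap()
proxy = weakref.proxy(bm)   # a reference that does not keep bm alive

del bm                      # CPython's reference counting collects bm here

try:
    proxy.width             # any use of the dead proxy...
except ReferenceError as e:
    print("a 'ReferenceFreedException', more or less:", e)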

> It's sure nice if all errors are specified to cause some
> trappable exception rather than an outright crash, but
> surely that's not a part of "object-orientedness" 

See above.  Of course there are lots of parts of OOness.  If everything is
an object, and the only way to interact with objects is via messages (as in
Smalltalk, say), then yes, it is impossible to violate the semantics of the
language. The semantics of an object might not be what you want; the
semantics of passing "0" as the argument to the "/" message sent to "5"
might include exiting the program, but it would be well defined.

> -- it's
> just a nice feature that a language can specify (with a
> cost, that can be huge, for its ports to platforms without
> good hardware/OS support for trapping certain kinds of
> invalid operations, to be sure).

Again, the cost would depend on the semantics of the messages.

> But the point is that "having to know about how a
> class is implemented" is not connected with the ability
> to free/dereference/change an object, its fields, etc.
> Rather, such freedom is ensured to client-software by
> a mix of language-behaviour *and semantic contract
> of the class being used*.  Not *implementation*, note:
> *contract*.  An important difference.

But the *contract* is based on the *implementation*. I believe that you'll
find *some* cases where the only reason that information is exposed in the
contract is your implementation choice. 

Take, for example, a class that puts a bitmap up on the screen. Assume,
also, that bitmaps are immutable, but freeable. If my class does not save a
copy of the bitmap, you can't free it as long as I may need to refresh the
screen. If my class *does* save a copy, you don't need to worry about what
you do with your copy. If there's GC, you don't need to worry either way.
Hence, the lack of GC causes you to have to expose implementation details in
your contract, contrary to how ADTs work. ;-)
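
Here it is as a small Python sketch (everything hypothetical: the class names,
the draw() stand-in). Without GC the contract has to say which of these two
you are holding before you may free your bitmap, and that is exactly the
implementation leak; with GC the caller's obligations are identical either way:

def draw(bitmap):                       # stand-in for real screen output
    print("drawing", len(bitmap), "rows")

class DirectDisplay:
    """Keeps the caller's bitmap around to redraw from later."""
    def __init__(self, bitmap):
        self._bitmap = bitmap           # without GC: caller must not free this
    def refresh(self):
        draw(self._bitmap)

class CachingDisplay:
    """Takes a private copy; the caller's bitmap is never needed again."""
    def __init__(self, bitmap):
        self._bitmap = list(bitmap)     # without GC: caller may free its copy
    def refresh(self):
        draw(self._bitmap)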

> > like the difference between saying "indexing off an array throws an
> > exception" and "indexing off an array overwrites the interrupt vectors
> > with random garbage". The first is still "safe", while the second isn't.
> 
> But neither issue defines whether a language is OO.

I believe it does, in part.  See above.  In any case, I don't believe there
*is* a "definition" of what makes a language OO, or we'd not be having this
thread. :-)

> > > but tying the "OO"ness
> > > of a language to it seems to be a typical case of "hidden agenda"
> >
> > Uh... As I said, that was an Alan Kay quote.  If you don't know, Alan Kay
> > was arguably the guy who *invented* OO. It's hard to see how he could have
> > a hidden agenda against languages that weren't invented at the time.
> 
> He co-invented Smalltalk, but Simula had been around for a while (and
> Kay had used it at University).  And, is the quote dated from before the
> birth of yet other languages which I'd call object-oriented? 

1971 copyright. Hence, probably written early on.  And while Simula used OO
concepts, I don't think anyone was talking about OO as such at the time,
were they? I could be wrong on that.

In any case, Kay certainly popularized it, and did so early enough on that
claiming he did it to bash other OO languages is kind of silly. 

> You may disagree, but I think the onus of proof is definitely on you.

Fair enough. Prototypes also work. I was more thinking along the lines of
objects without any sort of language support for inheritance or automation.
Classes, prototypes, all that stuff counts. 

> I don't mind the class-ic approach to OO, but prototype-based OO
> would also have its advantages, and it seems peculiar to rule it
> out as being OO at all because it lacks 'some sort of class concept'.

I would say prototypes are "some sort of class concept". I.e., prototypes
form the same basis for extensions and inheritance and such that classes do.
Not in the same way, of course.
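
To make that concrete, here's a toy Python sketch of what I mean (purely
illustrative; no real prototype language is implemented this way): objects
delegate missing slots to a parent object, and extension is just adding slots:

def make_object(parent=None, **slots):
    obj = {"__parent__": parent}
    obj.update(slots)
    return obj

def lookup(obj, name):
    while obj is not None:              # walk the prototype chain
        if name in obj:
            return obj[name]
        obj = obj["__parent__"]
    raise AttributeError(name)

point = make_object(x=0, y=0)
point3d = make_object(parent=point, z=0)    # "inherits" x and y from point

print(lookup(point3d, "x"))                 # 0, found via the prototype chain
print(lookup(point3d, "z"))                 # 0, its own slot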

> Dynamic dispatch, polymorphism, is what I'd call *the* one and
> only real discriminant of 'OO'.

I think you would need to define this more strictly, or you'll wind up with
BASIC's "ON GOTO" statement being object-oriented. :-)

Anyway, to slide this a little back on target, I just started learning
Python, and it really is one of the nicer languages I've seen for doing what
it does. :-) Now I need to go buy more books and such.

-- 
Darren New / Senior MTS & Free Radical / Invisible Worlds Inc.
San Diego, CA, USA (PST).  Cryptokeys on demand.
"No wonder it tastes funny. 
            I forgot to put the mint sauce on the tentacles."


