destructors order not guaranteed?

Alex Martelli aleaxit at yahoo.com
Wed Nov 1 14:11:03 EST 2000


"William S. Lear" <rael at see.sig.com> wrote in message
news:878zr5ez0i.fsf at lisa.zopyra.com...
    [snip]
> the problem I have is with the behavior of the test() function itself
> in the original example:
>
>     def test():
>        a = foo(1)
>        if 1:
>            b = foo(2)
>
>     if __name__ == '__main__':
>        test()
>        ...
>
> What I dislike is that test() builds a, then b, but a is then
> destroyed before b.

There may be some confusion here between objects and
references.  a and b are just references.  There are two
instances of class foo, to which said references (and
others yet) may be bound at a given point in time.

There is no "building" or "destroying" of the references,
but rather of the objects.  You may think of a reference
as close to a C++ pointer (not to a C++ reference, as
that cannot be re-bound, but rather is only bound once).
    a=foo(1)
has similar semantics to C++'s
    a=new foo(1)
(plus implicit reference-counting on the object).
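A quick runnable sketch of that pointer-like behaviour (the class here is illustrative): rebinding a name leaves the object it used to refer to untouched, as long as some other reference survives.

```python
# Sketch: a Python name is rebindable, like a C++ pointer; rebinding
# it does not destroy the object it previously referred to.
class foo:
    def __init__(self, n):
        self.n = n

a = foo(1)
also_a = a        # a second reference to the *same* instance
a = foo(2)        # rebinds the name 'a'; the first instance survives
assert also_a.n == 1
assert a.n == 2
```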

> I prefer the C++ behavior in this case:
    [snip]
>     void test() {
>         foo a(1);
>         if (true) {
>             foo b(2);
>         }
>     }

This is very different, as C++ is giving you objects
that live on the stack, rather than having dynamic
lifetimes.  I think it's the only OO language that does
that, btw -- OO languages tend to be based on the
paradigm that an object 'lives in the heap', with some
provision (reference counting or other forms of
garbage collection) to end its lifetime implicitly.

The actual C++ equivalent would be closer to:

    void test() {
        foo* a = new foo(1);
        if (true) {
            foo* b = new foo(2);
        }
    }

except that this C++ code would "leak" the two
instances of class foo thus generated -- you would
have to arrange for their explicit deletion at need.

The price that C++ programmers have to pay for the
admittedly great speed (and sometimes convenience)
of stack-allocated ("auto") objects is large.  The
object ceases to exist _whether or not references to
it still exist_ -- the dangling-pointers problem.

Suppose there was some other function fun (that,
in C++ terms, takes as its argument a foo*; in
Python, you just need to know it takes 1 arg).  Can
you safely call from test, passing it a, or b?  In
Python (or Java, Modula-3, etc), the answer is
"why, of course".  If function fun just uses its
argument and then it's done with it, fine.  If it
chooses to cache a reference to its argument for
later use, fine anyway; that's _fun_'s business.

The object is not going to go away as long as
references to it still exist, anyway (and, in the
current C-Python version, it IS going to get
destroyed _exactly_ when the last reference to
it goes -- the _big_ advantage of reference
counting vs other, 'advanced' forms of GC...).

In C++, *you cannot be sure*.  You can only call
fun if it *does NOT* cache a copy of its pointer
argument for later use; if it does, then passing it
the address of a stack-allocated object will later
cause a dangling-pointer bug -- a frequent chance
of hard-to-debug problems.


This is, in good part, why 'stack-resident' objects
are almost unknown outside of C++: other
languages consider the bug-risk engendered by
such stack residence to be far too high.  Thus,
it is and remains C++'s hallmark to give the
programmer total control, and total responsibility,
over the lifetimes of objects.  It's quite debatable
whether this is an optimal decision for system-level
work (I would claim it is, but I admit it's definitely
not an open-and-shut case!).

For a higher-level language, such as Python, it
would be folly to assign such responsibility to the
programmer; it would, in fact, go totally against
the grain of _being_ a VHLL....!  Reference counting,
or other forms of garbage collection, are really a
must here.


Within this framework, your (aesthetic?) preference
for 'LIFO order' needs to be assessed.  LIFO of
_what_?  _Earliest_ binding of a reference...? Why?

I.e., consider:
    a=foo(1)
    b=foo(2)
    a,b=b,a
now do you want the instance that was generated
by foo(2) [and is currently bound to a] to be
destroyed before, or after, the one that was
generated by foo(1) [and is currently bound to b]?
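Concretely (using id() to identify the instances), the swap rebinds the two names but touches neither object:

```python
# Sketch: swapping two names exchanges the bindings only; the two
# objects themselves are unaffected by the swap.
class foo:
    def __init__(self, n):
        self.n = n

a = foo(1)
b = foo(2)
id_a, id_b = id(a), id(b)
a, b = b, a                # only the names are rebound
assert id(a) == id_b       # 'a' now names the instance from foo(2)
assert id(b) == id_a       # 'b' now names the instance from foo(1)
```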

How much overhead and other difficulties would
you be willing to pay to get (whatever specific
meaning of 'LIFO' you specify)?  I've already shown
in another post how to implement both "LIFO of
earliest-binding" and "LIFO of latest-rebinding"
by just switching to the syntax:
    k.a=foo(1)
    k.b=foo(2)
    k.a,k.b=k.b,k.a
for some suitably defined k.  Of course, it slows
things down too -- inevitably, since somebody
must be keeping track of the temporal ordering
of earliest or latest bindings -- pure overhead,
since absolutely nothing else in Python has any
need for such ordering.
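One possible sketch of such a k (hypothetical, not the code from that other post): a holder whose __setattr__ records the temporal order of latest rebindings, so that its references can be dropped LIFO on request.

```python
# Hypothetical sketch: a holder that tracks the temporal order of
# latest rebindings, so its references can be dropped LIFO.
class LifoHolder:
    def __init__(self):
        object.__setattr__(self, '_order', [])
    def __setattr__(self, name, value):
        if name in self._order:
            self._order.remove(name)   # re-binding: move to the end
        self._order.append(name)
        object.__setattr__(self, name, value)
    def release_all(self):
        for name in reversed(self._order):
            object.__delattr__(self, name)   # newest binding first
        del self._order[:]

k = LifoHolder()
k.a = 'first'
k.b = 'second'
k.a, k.b = k.b, k.a        # both names are re-bound, 'b' last
assert k._order == ['a', 'b']
k.release_all()            # drops k.b, then k.a
```

The bookkeeping in __setattr__ is exactly the "pure overhead" mentioned above: every binding pays for an ordering nothing else in Python needs.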

But would you want that performance price to
be paid by *every* Python program, including
those written by the 99% of programmers that
couldn't care less about the esthetics of LIFO?
Or are you willing to switch to the k.a syntax
and be the only one to pay (with a slightly less
handy syntax sugar)...?


> I agree with you that the order of construction/destruction in the
> global namespace is (and probably should be) indeterminate.

For C++, it's actually an issue of storage class.  Only auto
(stack-resident) variables are destroyed in predictable order
(static-lifetime and dynamic-lifetime variables are not --
except in as much as the destruction of dynamic lifetime
variables, i.e. those allocated with new, must explicitly
be requested with a corresponding delete).  It's NOT an
issue of 'global namespace' -- just place some static variables
inside a few functions/blocks, and there are no guarantees
any more about their order of destruction at program end,
which depends on the runtime order of their construction
(any program erroneously relying on such ordering is at
the mercy of a specific release of a given C++ compiler...:-).

Python doesn't give you different storage classes -- everything
lives 'on the heap' (i.e., is dynamic) in C++ terms.  Not
having 'auto' storage class, you never risk 'dangling pointers'
or the equivalent -- but you don't automatically get any 'LIFO'
behaviour either, of course.
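For instance, a Python object created inside a function can simply be returned; with no 'auto' storage class, nothing can dangle:

```python
# Sketch: every Python object is (in C++ terms) heap-allocated, so it
# can safely outlive the function that created it.
class foo:
    def __init__(self, n):
        self.n = n

def make():
    local = foo(42)
    return local       # safe: no stack-resident object to dangle

f = make()
assert f.n == 42       # still perfectly alive after make() returned
```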


> > Are you talking about _the class_, or _a specific instance_ of that
> > class?  If the object being destroyed has a reference to another
> > object ("a specific instance of class D", or "class D itself" -- do
> > remember that classes ARE objects in Python), that other object
> > WILL remain available as long as the reference exists.  Period.
> > (This also holds for a *module* object, which is more likely to
> > be "providing services").
>
> Yes, I realize that classes ARE objects in Python.  What I don't like
> is having to add references in an ad-hoc manner, when I would prefer,
> within function scope, to rely on LIFO construction/destruction.

A reference to an object has to be available if any method of that
object is to be called.  We do agree on this, yes?

So, there is nothing "ad hoc" about this: the reference to object D
must exist if anything is to be done with that object.  Well, if
the reference exists, so does the object.  It's as simple as this:
if you can try to do anything with the object, then the object still
exists; in NO case will it disappear 'from under you' if you still
can access a reference to it (and if no references to it exist, then
how did you propose to ask the object for any service...?)

I realize you have a strong esthetic preference for LIFO.  But,
as you had expressed as a motivation for it "being able to use
the services provided by object A in the destructor of object
B", I'm trying to clarify that getting the LIFO you want would
buy you ***absolutely nothing*** in that regard: there is
*nothing* you would become able to do, that you can't do
now, in terms of object B (in any of its methods, __del__
included) using services provided by object A.


> > Perhaps you can sketch a sample case that worries you, where
    [snip]
>     class D:
>         def __init__(self, id):
>            self.id=id
>            print 'Constructing D(%d)' % id
>         def __del__(self):
>            print 'Destroying D(%d)' % self.id
>         def foo(self, msg):
>            print msg

So far, so good.  Each instance x of class D provides the
service x.foo(msg).

>     class C:
>         def __init__(self, id, a_d):
>            self.id=id
>            self.d=a_d
>            print 'Constructing C(%d)' % id
>         def __del__(self):
>            print 'Destroying C(%d)' % self.id
>            self.d.foo("Au revoir")

And any instance x of class C needs to use that service
in its destructor.  So it keeps a reference to a D instance
around, as self.d.  Excellent; that D instance is NOT
going to go away prematurely, _because a reference to
it exists_...


>     def test():
>         aD = D(1)
>         aC = C(1, aD)
>         if 1:
>             a_D2 = D(2)
>             a_C2 = C(2, a_D2)

Good!  Of course, the named references aD and a_D2
are actually superfluous here -- the more concise:

def test():
    aC = C(1, D(1))
    if 1:
        a_C2 = C(2, D(2))

is equivalent (and the "if 1" of course is also a no-op,
but I'm sure this is well understood:-).

> Which is what I would like, but I guess this will take some getting
> used to ... I really do prefer LIFO order.

It's a perfectly understandable preference -- stack semantics
_are_ quite nice (when there is no interference with desires
to have objects outlive the function that generated them, &c).
But, esthetics apart, there is really no added value of giving
LIFO constraints in ordinary Python usage.

Harder to get used to, for experienced C++'ers: since no
"stack-allocated, LIFO-lived, destructor execution is
guaranteed even in the presence of exceptions" objects
exist, you cannot (safely and portably) rely on __del__
for general-purpose finalization operations -- it may work
on C-Python right now, but it may break on Jython, or
Python .NET, or other future Python version -- or, right
now if by mistake you end up 'leaking' a reference to
your meant-as-generic-finalizer object.

So, the beautiful C++ idiom...:

    template <class Resource>
    struct Locker {
        Resource& pr;
        void (*closer)(Resource&);
        Locker(Resource& pr, void (*closer)(Resource&)):
            pr(pr), closer(closer) {}
        ~Locker() { closer(pr); }
    };

has no appropriate Python translation (nor Java, etc).

The try/finally construct (less elegant, though no doubt
more direct and explicit) is what takes its place.  I.e.,
the C++ code...:

    void funz() {
        // (pre-C++17, the template arguments must be spelled out;
        // Database and Resource stand in for the actual resource types)
        Locker<Database> lock1(myDatabase.acquire(), dbRelease);
        Locker<Resource> lock2(myResource.getit(), dropIt);

        fonz();
    }

becomes something like...:

    def funz():
        lock1 = myDatabase.acquire()
        try:
            lock2 = myResource.getit()
            try:
                fonz()
            finally:
                dropIt(lock2)
        finally:
            dbRelease(lock1)

(the nesting matters: each resource is released only if its own
acquisition actually succeeded).

Yeah, it WOULD be lovely to have a special _object_ with the
kind of semantics try/finally (or C++'s autos' dtors) ensure...
but I can't find an all-Pythons-portable way to architect it!-)
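As a runnable sketch of the pattern (all names here are hypothetical stand-ins), nested try/finally releases the resources LIFO, even when the body raises -- the guarantee C++ auto destructors give:

```python
# Sketch: nested try/finally releases resources LIFO, exception or not.
released = []

def acquire(name):
    return name        # stand-in for a real acquisition

def release(name):
    released.append(name)

def funz(fail=False):
    lock1 = acquire('db')
    try:
        lock2 = acquire('res')
        try:
            if fail:
                raise RuntimeError('boom')
        finally:
            release(lock2)
    finally:
        release(lock1)

funz()
assert released == ['res', 'db']   # LIFO release, like C++ dtors

del released[:]
try:
    funz(fail=True)
except RuntimeError:
    pass
assert released == ['res', 'db']   # released even on exception
```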


Alex





