fork()

Hisao Suzuki suzuki611 at okisoft.co.jp
Sun Jun 13 19:03:54 EDT 1999


In article <199906110443.AAA02408 at eric.cnri.reston.va.us>,
Guido van Rossum <guido at CNRI.Reston.VA.US> wrote:
> Could we get away with not calling (user) finalizers on objects in
> trash cycles at all?  Since the finalization order is problematic at
> best, this almost seems acceptable: in the formal semantics, objects
> that are part of cycles would live forever, and as an invisible
> optimization we recycle their memory anyway if we know they are
> unreachable.  (I've got the feeling that I've seen this rule before
> somewhere.)  We'd still get complaints "my __del__ doesn't get called"
> from some users, and the answer would still be "it's in a cycle -- use
> an explicit close"; but those who know what they are doing can be
> guaranteed that their memory cycles get recycled.  In other words it
> would be no worse than today and for some people it would be better.

In article <Pine.SUN.3.95-heb-2.07.990611075346.4942G-100000 at sunset.ma.huji.ac.il>,
Moshe Zadka <moshez at math.huji.ac.il> wrote:
| No, that can't be. If we did, it were in Scheme, and after you put it in,
| you'd be morally obliged to add full-fledged lambda, tail-recursion and 
| user visible continuations. 

Certainly the R5RS (1998) says:
   "All objects created in the course of a Scheme computation,
    including procedures and continuations, have unlimited
    extent.  No Scheme object is ever destroyed.  The reason
    that implementations of Scheme do not (usually!) run out of
    storage is that they are permitted to reclaim the storage
    occupied by an object if they can prove that the object
    cannot possibly matter to any future computation. ..."

However, (strangely enough :-) this idea is also found in the
C++ literature.  B. Stroustrup wrote in The C++ Programming
Language, 2nd Ed. (1991):
   "Garbage collection can be seen as a way of simulating an
    infinite memory in a limited memory.  With this in mind, we
    can answer a common question:  Should a garbage collector
    call the destructor for an object it recycles?  The answer
    is no, because an object placed on free store and never
    deleted is never destroyed. ..." (13.10.1 Garbage
    Collection)

and in the C++ 3rd Ed. (1997):
   "When an object is about to be recycled by a garbage
    collector, two alternatives exist:
      [1] Call the destructor (if any) for the object.
      [2] Treat the object as raw memory (don't call its
          destructor).
    By default, a garbage collector should choose option (2)
    because objects created using _new_ and never _delete_d are
    never destroyed.  Thus, one can see a garbage collector as a
    mechanism for simulating an infinite memory." (C.9.1.3
    Destructors)

Python's __del__ is conceptually equivalent to C++'s destructor.
(Otherwise we would not rely on self.close() in __del__ in
Lib/tempfile.py!)  The above phrase `(be) never deleted' for C++
can be safely translated as `(be) part of cycles' for Python.
Thus, Guido's idea is quite consistent and orthodox in this
regard.  It will be acceptable to most of the current working
Pythoneers, including me, even if they are not fans of the Lots
of Intolerable Stupid Parentheses ;-)

By the way, C++'s destructor and Python's __del__ have proved
useful in practice, particularly in conjunction with the
`resource acquisition is initialization' technique.  On the
other hand, a finalizer such as the one found in Java is almost
useless and unreliable because it is unpredictable.  I have not
made use of finalize() in years of Java, and I know of no
skilled Java programmer who makes substantial use of it.
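
A minimal sketch of the `resource acquisition is initialization'
style with __del__ might look as follows.  The class TempResource,
its log list and the function use() are made up for illustration
(this is not the code in Lib/tempfile.py), and the prompt release
relies on CPython's reference counting.

```python
class TempResource:
    """Hypothetical resource wrapper, for illustration only."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.closed = False
        log.append("open " + name)       # acquire in the constructor

    def close(self):
        if not self.closed:              # safe to call more than once
            self.closed = True
            self.log.append("close " + self.name)

    def __del__(self):
        self.close()                     # release when the object dies


log = []

def use():
    res = TempResource("scratch", log)
    return res.name.upper()              # res dies when use() returns

result = use()
```

The explicit close() stays available for objects that end up in a
cycle, where relying on __del__ alone is not enough.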

Further, the C++ 3rd Ed. says:
   "It is possible to design a garbage collector to invoke the
    destructors for objects that have been specifically
    `registered' with the collector.  However, there is no
    standard way of `registering' objects.  Note that it is
    always important to destroy objects in an order that ensures
    that the destructor for one object doesn't refer to an
    object that has been previously destroyed.  Such ordering
    isn't easily achieved by a garbage collector without help
    from the programmer."

All in all, we need some control or programmability over the
behavior of our programs when a fixed, built-in mechanism of
the language cannot address the problem well on its own.
Suppose that Python recycles the memory of unreachable cycles of
objects without calling __del__.  What if cycles that contain
`registered' objects were saved from deallocation so that
clean-up or other arbitrary actions could be performed?

Such an action would break, say, an unreachable doubly-linked
tree into separate nodes.  The nodes would then be destroyed by
the normal process, including invocations of __del__.  For this
to happen,
 (a) the programmer must register the root of the tree with a
     function to break up the tree, which will be invoked later
     automatically by the garbage collector, or

 (b) the programmer must register the root of the tree, and at
     a certain point in the program she/he must explicitly invoke
     a function which retrieves the trees that are unreachable
     except from the `registration' server and breaks them up.

 (In any case, unreachable trees that contain no `registered'
  objects will be recycled with no invocations of __del__.)
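
The break-up step common to (a) and (b) can be sketched as below.
The names Node, break_up and finalized are assumptions made for
this illustration.  No registration API is shown, because none
exists; the sketch only demonstrates that once the parent/child
links are cut, plain reference counting (as in CPython) destroys
the nodes and runs their __del__.

```python
finalized = []   # names of nodes whose __del__ has run


class Node:
    """Tree node linked both downwards (children) and upwards (parent)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)   # creates a parent<->child cycle

    def __del__(self):
        finalized.append(self.name)


def break_up(node):
    """Hypothetical break-up function: cut every link below node."""
    for child in node.children:
        break_up(child)
        child.parent = None    # remove the upward link
    node.children = []         # remove the downward links


root = Node("root")
Node("left", root)
Node("right", root)

break_up(root)   # no cycles remain; the children lose their last referrer
del root         # the root now goes away by reference counting alone
```

With the links cut, no cycle collector is needed for these nodes
at all.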

I am not sure which option is more useful in practice, but
either of them would be acceptable since they are entirely
optional and compatible with the current semantics of Python.
With no zapping in the formal semantics, they are conceptually
clean and their implementations will be straightforward (thus
they are also very Pythonic ;-)

For a reference on option (b), see, say, R. Kent Dybvig's
ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/guardians.ps.gz
which I have mentioned before in this newsgroup.  It is written
for Scheme, and the coming version of GNU guile implements
guardians, i.e. the `registration' servers described in it.
Note that guardians could be implemented in some form in 100%
pure Python even if Python does not have a garbage collector,
by using the sys.getrefcount built-in function.
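
One way such a pure-Python guardian might look is sketched below.
Everything except sys.getrefcount is an assumption made for this
illustration: the Guardian class, its register() and retrieve()
methods, and the internal_refs argument, by which the programmer
states how many references the structure holds to its own root.
The sketch only counts references, so it cannot notice an outside
reference that reaches the root through an inner node, and it
depends on CPython's reference counts.

```python
import sys


class Guardian:
    """Hypothetical `registration' server built on sys.getrefcount."""

    def __init__(self):
        self._entries = []     # [object, internal_refs] pairs
        # Calibrate: the count reported for an object that nothing
        # but this guardian refers to.
        self._entries.append([object(), 0])
        self._baseline = self._count(0)
        self._entries.clear()

    def _count(self, i):
        return sys.getrefcount(self._entries[i][0])

    def register(self, obj, internal_refs=0):
        """Guard obj; internal_refs counts links from its own structure."""
        self._entries.append([obj, internal_refs])

    def retrieve(self):
        """Remove and return guarded objects no outsider refers to."""
        dead, alive = [], []
        for i, entry in enumerate(self._entries):
            if self._count(i) <= self._baseline + entry[1]:
                dead.append(entry[0])
            else:
                alive.append(entry)
        self._entries = alive
        return dead


class Node:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


guardian = Guardian()
root = Node()
Node(root)
Node(root)
guardian.register(root, internal_refs=2)   # two parent links point at root

still_reachable = guardian.retrieve()      # empty: the name root is alive
del root
unreachable = guardian.retrieve()          # the tree, held only by itself

for tree in unreachable:                   # break the retrieved trees up
    for child in tree.children:
        child.parent = None
    tree.children = []
```

The calibration in __init__ avoids hard-coding how many temporary
references sys.getrefcount itself reports.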

--===-----========------------- Sana esprimo naskas sanan ideon.
SUZUKI Hisao            suzuki611 at okisoft.co.jp, suzuki at acm.org.



