Python's "only one way to do it" philosophy isn't good?

John Nagle nagle at animats.com
Thu Jun 28 02:41:54 EDT 2007


Douglas Alan wrote:
> "Chris Mellon" <arkanes at gmail.com> writes:
>>On 6/27/07, Douglas Alan <doug at alum.mit.edu> wrote:

>>This totally misrepresents the case. The with statement and the
>>context manager is a superset of the RAII functionality.
> 
> 
> No, it isn't.  C++ allows you to define smart pointers (one of many
> RAII techniques), which can use refcounting or other tracking
> techniques.  Refcounting smart pointers are part of Boost and have
> made it into TR1, which means they're on track to be included in the
> next standard library.  One need not have waited for Boost, as they can
> be implemented in about a page of code.
> 
> The standard library also has auto_ptr, which is a different sort of
> smart pointer, which allows for somewhat fancier RAII than
> scope-based.

    Smart pointers in C++ never quite work.  In order to do anything
with the pointer, you have to bring it out as a raw pointer, which makes
the smart pointer unsafe.  Even auto_ptr, after three standardization
attempts, is still unsafe.

    Much handwaving around this problem comes from the Boost crowd, but
in the end, you just can't do safe reference counted pointers via
C++ templates. It requires language support.

    This is off topic, though, for Python.  If anybody cares,
look at my postings in comp.lang.c++.std for a few years back.

    Python is close to getting it right, but not quite.  Python destructors
aren't airtight; you can pass the "self" pointer out of a destructor, which
"re-animates" the object.  This generally results in undesirable behavior.

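    The re-animation described above can be sketched in a few lines.
This is only an illustration, assuming CPython's reference counting; the
names Resurrecting and survivors are made up for the example.

```python
survivors = []  # hypothetical global list that outlives the object


class Resurrecting:
    def __del__(self):
        # Handing "self" to something still reachable pushes the
        # reference count back above zero: the object is re-animated.
        survivors.append(self)


obj = Resurrecting()
del obj  # count reaches zero, so CPython runs __del__ right away

# The "destroyed" object is alive again, reachable through the list.
print(len(survivors))  # 1 on CPython
```
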
    Microsoft's "managed C++" has the same problem.  They explicitly addressed
"re-animation" and consider the possibility that a destructor can be called
twice.  To see the true horror of this approach, read

http://www.codeproject.com/managedcpp/cppclidtors.asp

Microsoft Managed C++ ended up having destructors, finalizers,
explicit destruction, scope-based destruction of locals, re-animation,
and nondeterministic garbage collection, all in one language.
(One might suspect that this was intended to drive people to C#.)

    In Python, if you have reference loops involving objects
with destructors, the objects don't get reclaimed at all.  You don't
want to call destructors from the garbage collector.  That creates
major problems, like introducing unexpected concurrency and weird
destructor ordering issues.
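
    A small sketch of the loop problem, assuming CPython.  With the
collector switched off, two objects that point at each other are never
finalized by reference counting alone.  On the interpreters current when
this was written, such a loop was parked in gc.garbage rather than freed;
CPython 3.4 and later instead run the destructors from the collector,
which is exactly the behavior argued against here.  Node and finalized
are names invented for the example.

```python
import gc

finalized = []  # records which destructors have run


class Node:
    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)


gc.disable()                 # leave only reference counting active
a, b = Node("a"), Node("b")
a.other, b.other = b, a      # reference loop
del a, b

# Neither count can reach zero, so no destructor has run yet.
after_del = list(finalized)

gc.collect()                 # the cycle collector has to step in
# CPython 3.4+: both destructors run here, in an order the program
# does not control.  Older interpreters left the loop in gc.garbage.
after_collect = sorted(finalized)
gc.enable()
```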

    Much of the problem is that Python, like Perl and Java, started out
with strong pointers only, and in all three languages weak pointers
were added as an afterthought.  Once you have weak pointers, you can
do it right.  Because weak pointers went in late, there's a legacy
code problem, mostly in GUI libraries.

    One right answer would be a pure reference counted system where
loops are outright errors, and you must use weak pointers for backpointers.
I write Python code in that style, and run with GC in debug mode,
to detect leaks. I modified BeautifulSoup to use weak pointers
where appropriate, and passed those patches back to the author.
When all or part of a tree is detached, it goes away immediately,
rather than hanging around until the next GC cycle.  The general
idea is that pointers toward the leaves of trees should be strong
pointers, and pointers toward the root should be weak pointers.
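
    The strong-down, weak-up rule can be sketched with the standard
weakref module.  This is not the BeautifulSoup patch itself, only a
minimal illustration assuming CPython; TreeNode and its methods are
hypothetical names.  The collector is disabled to show that reference
counting alone reclaims the detached part of the tree.

```python
import gc
import weakref


class TreeNode:
    def __init__(self, name):
        self.name = name
        self.children = []    # strong references, toward the leaves
        self._parent = None   # weak reference, toward the root

    @property
    def parent(self):
        # Returns None once the parent has been destroyed.
        return self._parent() if self._parent is not None else None

    def add_child(self, child):
        child._parent = weakref.ref(self)
        self.children.append(child)
        return child


gc.disable()  # no cycle collector: reference counting must do the work
root = TreeNode("root")
leaf = root.add_child(TreeNode("branch")).add_child(TreeNode("leaf"))

root_probe = weakref.ref(root)
branch_probe = weakref.ref(root.children[0])

del root  # drop the only strong reference to the root

root_gone = root_probe() is None      # root freed immediately
branch_gone = branch_probe() is None  # so is the branch it owned
leaf_parent = leaf.parent             # weak backpointer now reads as None
gc.enable()
```

    The leaf survives only because the variable leaf still holds it; its
backpointer simply goes dead instead of keeping the rest of the tree alive.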

    For a truly sound system, you'd want to detect reference loops
at the moment they're created, and handle them as errors.  This
is quite possible, although inefficient for certain operations.
Reversing a linked list that has depth counts is expensive.  But then,
Python lists aren't implemented as linked lists; they're variable sized arrays
with one reference count for the whole array.  So, in practice,
the cases where maintaining depth counts gets expensive
are rare.
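
    Python has no such check built in.  As a rough application-level
approximation for tree-shaped data only, a link can be refused when it
would make a node its own ancestor.  This is a sketch, not the
language-level mechanism described above; LoopError, Node and attach are
invented names, and the walk up the parent chain costs time proportional
to the depth of the tree.

```python
import weakref


class LoopError(Exception):
    """Raised when a link would create a reference loop (hypothetical)."""


class Node:
    def __init__(self, name):
        self.name = name
        self.children = []    # strong, toward the leaves
        self._parent = None   # weak, toward the root

    @property
    def parent(self):
        return self._parent() if self._parent is not None else None

    def attach(self, child):
        # Walk from this node up to the root.  If the prospective child
        # is this node or one of its ancestors, the link would close a loop.
        node = self
        while node is not None:
            if node is child:
                raise LoopError(f"{child.name} is already above {self.name}")
            node = node.parent
        child._parent = weakref.ref(self)
        self.children.append(child)


a, b, c = Node("a"), Node("b"), Node("c")
a.attach(b)
b.attach(c)

try:
    c.attach(a)   # a is an ancestor of c, so this is treated as an error
    caught = False
except LoopError:
    caught = True
```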

    Then you'd want a way to limit the scope of "self" within a destructor,
so that you can't use it in a context which could result in it
outliving the destruction of the object.  This is a bit tricky,
and might require some extra checking in destructors.
The basic idea is that once the reference count has gone to 0,
anything that increments it is a serious error.  (As mentioned
above, Microsoft Managed C++ allowed "re-animation", and it's
clear from that experience that you don't want to go there.)

    With those approaches, destructors would be sound, order of
destruction would be well defined, and the "here be dragons" notes
about destructors could come out of the documentation.

    With that, we wouldn't need "with".  Or a garbage collector.

    If you like minimalism, this is the way to go.

				John Nagle


