Python 2.0

adjih at technologist.com adjih at technologist.com
Fri Jun 4 12:49:24 EDT 1999


In article <7j6j8d$58f$2 at cronkite.cc.uga.edu>,
  graham at sloth.math.uga.edu (Graham Matthews) wrote:
> adjih at technologist.com wrote:
> :   But there will be cases where full garbage collection will be
> : definitely a nuisance.
> :   First, you must scan all the address space of the C code, which can be
> : a problem if you're just using the Python interpreter in a library (i.e.
> : don't have the ability to control or modify the main program); or if
> : your address space is too big (for instance, I use Python in a program
> : that does an mmap of a 350 MB file).
>
> a) you can turn gc off

Yes, this is a good solution. But then you must make sure that very few
of the publicly released Python modules rely on GC, unless they
are already very specialized.
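
To make "not relying on GC" concrete, here is a minimal sketch of a module that avoids strong cycles, so plain reference counting frees its objects even with the collector off. It assumes CPython's reference counting and the gc and weakref modules of later Python releases (they postdate this thread); the Node class and all names in it are hypothetical.

```python
import gc
import weakref


class Node:
    """Tree node whose back-pointer to its parent is a weak reference."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        # A weak reference does not raise the parent's reference count,
        # so parent <-> child never forms a strong cycle.
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Returns None once the parent has been freed.
        return self._parent() if self._parent is not None else None


gc.disable()  # simulate an application that runs with the collector off
try:
    root = Node("root")
    child = Node("child", parent=root)
    probe = weakref.ref(root)      # lets us observe when root is freed
    parent_name = child.parent.name
    del root                       # drop the last strong reference
    freed_without_gc = probe() is None
    orphan_parent = child.parent
finally:
    gc.enable()
```

Had the child held a strong reference to its parent, the pair would only be reclaimed by a cycle collector, which is exactly the dependency to watch for in third-party modules.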

> b) I routinely use garbage collection on 1 Gig memory spaces with no
> 	 speed problems.

The problem is not the size of the address space as such, but the ratio
[non-paged-out memory] / [total memory accessible by the program]
compared to the rate at which your application generates cyclic
structures. If you have to read 350 MB from disk each time you do
a GC, then you will have a huge pause, but you may still be forced
to GC often if your application creates cycles at a high rate.

Disabling the GC is the solution.
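
As an illustration only, this is roughly what disabling the collector looks like with the gc module that later CPython releases provide (an assumption; no such module existed when this was written). The gc_paused helper, Cell and make_cycles are hypothetical names. Automatic collection is switched off for a block, and one explicit collection is run at a moment the program chooses.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Turn the cyclic collector off for a block, restoring the prior state."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()


class Cell:
    """Hypothetical object that can point at another Cell."""

    def __init__(self):
        self.other = None


def make_cycles(n):
    """Create n two-object cycles and drop every outside reference."""
    for _ in range(n):
        a, b = Cell(), Cell()
        a.other, b.other = b, a


gc.collect()                  # start from a clean slate
with gc_paused():
    make_cycles(1000)         # garbage accumulates; no automatic collection
    reclaimed = gc.collect()  # explicit collection at a moment we choose
```

Here gc.collect() returns the number of unreachable objects it found, so reclaimed covers at least the 2000 Cell instances.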

>
> adjih at technologist.com wrote:
> :   Second you kill the portability: you could use a subset of Python with
> : ANSI-C alone; but you can't make an incremental GC in ANSI-C.
>
> Then don't use an incremental collector! Use a mark sweep collector.
> I use such a thing almost every day -- written entirely in ANSI-C.

  It is true that it shouldn't be hard to make a mark-sweep collector
that works in practice on many platforms.

  Pedantically, I doubt that ANSI-C tells you the current stack top
address, or guarantees that the whole address space between the current
stack pointer and this stack top address is stack space. But I agree
that actual C compilers wouldn't do anything too stupid, so we
shouldn't worry about this :-)
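
For reference, a toy sketch of the mark-sweep idea under discussion, written in Python rather than ANSI-C and not taken from any real collector. The roots are handed over as an explicit list, which sidesteps the stack-scanning question above: a real C collector has to discover its roots itself. Obj, mark, sweep and collect are hypothetical names.

```python
class Obj:
    """Toy heap object: a name, outgoing references, and a mark bit."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark(roots):
    """Mark phase: flag everything reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep marked objects, clear their bits, drop the rest."""
    live = [o for o in heap if o.marked]
    for o in live:
        o.marked = False
    return live


def collect(heap, roots):
    """One full, non-incremental collection; returns the surviving heap."""
    mark(roots)
    return sweep(heap)


a, b, c, d = (Obj(n) for n in "abcd")
a.refs.append(b)   # a -> b, reachable from the root
c.refs.append(d)   # c -> d
d.refs.append(c)   # d -> c: a cycle with no path from any root
heap = [a, b, c, d]
survivors = [o.name for o in collect(heap, roots=[a])]
```

The c/d cycle is dropped even though each object is still referenced by the other, which is the one thing reference counting alone cannot do.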

>
> :   You'd get problems when embedding Python in unusual environments; for
> : instance if you embed Python in the Linux kernel, then the GC should
> : also trace all the pages in the swap, or be rewritten specially for that
> : case.
>
> Why do you need to do any of these things?

Use Python in the Linux kernel or in an unusual environment? Why not?
Mark-and-sweep pages in swap? Well, only if you can have references
from pages in swap (which is the case if you use the generic vmalloc in
2.2+).

> adjih at technologist.com wrote:
> :   The advantages of GC must be traded with its drawbacks:
> : - advantages: reclaims cyclic structures.
> : - drawbacks: non-portable, requires a special handling when linking with
> : C, high complexity for decent GC (multi-thread incremental with
> : finalization and support for C/C++ structures referencing Python
> : structures).
>
> The drawbacks you list are all misleading. Mark sweep collection (with
> refcounts) is highly portable. Non compacting (or even partially
> compacting) collectors require no special code to link with C.
> Complexity of a mark sweep collector is not very high at all (registering
> roots and traversing foreign objects being the main areas of
> complexity).

Registering the roots isn't necessarily straightforward: you need
control over, and information about, all the allocators that could be
used in the program.
You'd have problems if you created a shared library that internally
created an instance of a Python interpreter and offered objects
with internal pointers to Python objects.
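
The hazard can be shown with a toy model (a sketch only; the heap layout and every name in it are made up for illustration). The heap maps object ids to the ids they reference, and the collector frees whatever it cannot reach from the roots that were registered with it. A pointer held inside a library that never registered it does not count.

```python
def reachable(heap, roots):
    """Return the set of object ids reachable from the given roots."""
    seen = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid not in seen:
            seen.add(oid)
            stack.extend(heap[oid])
    return seen


def collect(heap, registered_roots):
    """Drop every object that cannot be reached from the registered roots."""
    live = reachable(heap, registered_roots)
    return {oid: refs for oid, refs in heap.items() if oid in live}


# Toy heap: object id -> ids it references.
heap = {"module": ["config"], "config": [], "callback": ["state"], "state": []}

registered_roots = ["module"]  # roots the interpreter knows about
library_pointer = "callback"   # held inside a shared library, never registered

after = collect(heap, registered_roots)
dangling = library_pointer not in after  # the library's pointer is now stale

# Registering the library's pointer as a root keeps the object alive.
after_fixed = collect(heap, registered_roots + [library_pointer])
```

With reference counting the library would simply hold a counted reference; with a tracing collector it has to find a way to tell the collector about that pointer.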

> I am sure that for *some* collection schemes some of
> the drawbacks you list are true, but not for all.

  Non-incremental mark-sweep collection should be portable, but
then of course you'll have to cope with the GC pauses. This is
another reason to disable GC by default.
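
To put a number on such pauses, one option is to time each collection. The sketch below assumes the gc.callbacks hook of later CPython releases (3.3 and up, so not something available for this thread); record_pause and Cell are hypothetical names, and the timings depend entirely on the machine and on what else is in memory.

```python
import gc
import time

pauses = []     # seconds spent in each completed collection
_started = []   # start timestamps of collections in progress


def record_pause(phase, info):
    """gc callback: time the span between a 'start' and its matching 'stop'."""
    if phase == "start":
        _started.append(time.perf_counter())
    elif phase == "stop" and _started:
        pauses.append(time.perf_counter() - _started.pop())


class Cell:
    """Hypothetical object that can point at another Cell."""

    def __init__(self):
        self.other = None


gc.callbacks.append(record_pause)
try:
    for _ in range(10_000):
        a, b = Cell(), Cell()
        a.other, b.other = b, a   # a two-object cycle, garbage on the next pass
    gc.collect()                  # one explicit full collection
finally:
    gc.callbacks.remove(record_pause)

longest = max(pauses)             # worst pause observed, in seconds
```

An application that cannot tolerate the value of longest is the case where leaving the collector off, and collecting only at quiet moments, makes sense.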

  What I would object to is a mandatory GC (such as in many Lisps),
or a significant proportion of C or Python code
relying on GC.

-- Cedric

