[Python-Dev] Re: RFC: malloc cleanup

Vladimir Marangozov Vladimir.Marangozov@inrialpes.fr
Fri, 18 Feb 2000 22:45:47 +0100 (CET)


[Detailed RFC on a bunch of malloc interfaces]

OK, I'll comment first, as I'm probably one of the few who understand
(what I'm talking about :-) what this mess is all about in its deepest
details while having an overall vision of the issue. So I feel obliged to
offer a friendlier explanation.

(This turned out to be only 2 screens long, so you may leave it for
 tomorrow morning ;-)

I'll make it simple:

What really happens if the RFC gets implemented, and how does it affect you?
----------------------------------------------------------------------------

1) All C extension modules, without exception, would be silently redirected
   to use the Python malloc wrappers. Silently means that not a single line of
   code would need to be modified, and the modules would continue to compile and
   run as before (i.e. all macros and functions used in C code remain the same).

   Consequences:

	a) You *won't* be required to modify anything, to change your programming
           habits or to adopt new mandatory interfaces. You could continue the same
           way as before.

	b) All *user-defined* Python objects (not the "core" ones, like ints, dicts
           or strings) would start using the wrappers automatically, which may result
           in a tiny degradation in their performance.

           This is the price to pay if we want to make Python user-malloc friendly --
           that is, decouple the core from its strong dependency on libc malloc
           and make 3rd party extensions depend only on Python's memory interfaces
           when they manipulate Python objects. Currently, the extensions depend
           on libc malloc (they use "public" memory offered by the system) and there's
           no way to make them use "private Python memory" for Python objects.

           BTW, user-defined objects are used far less frequently than the "core" ones.

2) All "core" objects will run as before at the same speed, but they'll use
   "private Python memory".

3) There would be an opportunity to make the code fancier. This is the whole point
   of the discussion about the NEW/DEL pairs. Fancier == more logical from the
   programmer's point of view. This is not mandatory, but it would be desirable
   to adopt these pairs so that the mallocs don't get messed up again.

   And I'm volunteering to do the renaming in the core and the modules in the
   distribution so that they look fancier and serve as examples for future Python
   development. But all this remains optional.

4) Last, but not least, this will open the way to change (optionally) Python's
   core allocator to specialized and more appropriate mallocs than libc malloc.
   There's only one I know of for the moment (mine <wink>) and it's not "so good".
   But it looks promising, because it already serves Python better than libc malloc.

   The current state, however, prevents people from working in this area and
   experimenting with alternatives, because it's very hard to get things working
   without expert-level knowledge of the internals. I personally find this
   regrettable and am willing to "repair" it, because I think that Python can
   benefit noticeably from a specialized malloc (which may well outweigh the tiny
   degradation caused by the wrappers).
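
To make points 1-4 a bit more concrete, here is a minimal sketch of what a
wrapper layer with NEW/DEL pairs and a replaceable allocator could look like.
It is not the patch and not the RFC's interface: every name in it (mem_hooks,
mem_set_hooks, mem_malloc, mem_free, MEM_NEW, MEM_DEL, counting_hooks) is
hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hook table: the one place that knows which malloc is in use. */
typedef struct {
    void *(*alloc)(size_t nbytes);
    void  (*release)(void *p);
} mem_hooks;

/* Default backend: plain libc malloc/free, i.e. today's behaviour. */
static mem_hooks current_hooks = { malloc, free };

/* Point 4: install a specialized allocator instead of libc malloc. */
void mem_set_hooks(mem_hooks hooks)
{
    assert(hooks.alloc != NULL && hooks.release != NULL);
    current_hooks = hooks;
}

/* Points 1 and 2: every allocation goes through these wrappers.
   The extra indirect call is the "tiny degradation" mentioned above. */
void *mem_malloc(size_t nbytes)
{
    /* Never request 0 bytes, so a NULL result always means failure. */
    return current_hooks.alloc(nbytes ? nbytes : 1);
}

void mem_free(void *p)
{
    if (p != NULL)
        current_hooks.release(p);
}

/* Point 3: a NEW/DEL pair, so an allocation and its release visibly match.
   No overflow check on n * sizeof(type); this is only a sketch. */
#define MEM_NEW(type, n)  ((type *) mem_malloc((size_t)(n) * sizeof(type)))
#define MEM_DEL(p)        mem_free(p)

/* Example alternative backend (hypothetical): counts live blocks and
   then defers to libc, standing in for a real specialized malloc. */
static long live_blocks = 0;

static void *counting_alloc(size_t nbytes)
{
    void *p = malloc(nbytes);
    if (p != NULL)
        live_blocks++;
    return p;
}

static void counting_release(void *p)
{
    live_blocks--;
    free(p);
}

const mem_hooks counting_hooks = { counting_alloc, counting_release };
```

With this layout, code that writes MEM_NEW(int, 4) and later MEM_DEL(p) never
names malloc or free directly, so calling mem_set_hooks(counting_hooks) once
at startup changes the allocator for everything. The swap has to happen before
the first allocation; otherwise a block obtained from one backend could be
released through another.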

------------

Currently I have a patch which does this the "easy way" -- it does not truly decouple
the core from the extensions through the wrappers. However, one can already
experiment by recompiling Python *and* all user extensions with a new malloc. With
the patch, Python doesn't dump core anymore, which may be a small gain in your eyes,
but a huge one in mine <wink>.

FYI, I've corresponded in private with Neil Schemenauer, who's working on GC, and he
tested & validated the present "easy" version of the patch by saying:

> On Sun, Feb 13, 2000 at 04:06:05PM +0100, Vladimir Marangozov wrote:
> > I'm appending another patch suite, so that you could test it directly
> > on your setup with your malloc. Let me know how it goes. Hope we're
> > making progress on this.
> 
> It works for me.  It passes the regression test and also does not
> leak memory when cycles are created. :)
> 
> 
>     Neil
> 

------------

So what I'm looking for here is your approval of, or your objections to, the
principles I've laid out in this message and in the RFC.

I can send right now a whole bunch of patched files affected by the "easy" way,
but it does not solve the real problem. The "easy way" does not redirect the
extensions to the wrappers, i.e. it doesn't have the "private Python memory" concept.
It *is* a step in the right direction, though.
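
The difference between the "easy" way and the full redirection can be sketched
as a question of what the public allocation macros expand to. The following is
an illustration under assumptions, not an excerpt from the patch: EASY_WAY,
EXT_MALLOC, EXT_FREE, private_malloc, private_free and the ext_* functions are
all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "private Python memory" allocator. It forwards to libc
   here only so that the sketch runs; a real one would manage its own heap. */
static long private_live = 0;   /* blocks currently handed out */

void *private_malloc(size_t nbytes)
{
    void *p = malloc(nbytes ? nbytes : 1);
    if (p != NULL)
        private_live++;
    return p;
}

void private_free(void *p)
{
    if (p == NULL)
        return;
    assert(private_live > 0);   /* would catch a block we never handed out */
    private_live--;
    free(p);
}

/* The public macros as extension source code sees them. The extension is
   never edited; only the expansion chosen by the header changes. */
#ifdef EASY_WAY
/* "Easy" way: extensions still reach libc ("public" memory) directly. */
#define EXT_MALLOC(n)  malloc(n)
#define EXT_FREE(p)    free(p)
#else
/* Full redirection: the same names now resolve to the private allocator. */
#define EXT_MALLOC(n)  private_malloc(n)
#define EXT_FREE(p)    private_free(p)
#endif

/* Unmodified "extension" code, identical under both settings. */
char *ext_make_buffer(size_t nbytes)
{
    return (char *) EXT_MALLOC(nbytes);
}

void ext_drop_buffer(char *buf)
{
    EXT_FREE(buf);
}
```

Compiled without EASY_WAY, ext_make_buffer() draws from the private allocator
without a single source change in the extension, which is the "silent"
redirection described in point 1. Compiled with it, the extension keeps using
libc, so the core and the extensions only agree as long as both are built
against the same malloc.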

Final word: I'm okay with deferring this issue to future Python releases, if there's
no time, no resources, no understanding, whatever (lately, I see a big increase in
contributions to Python since the announcement of the tight schedule for 1.6). That's
why I won't push too much and I don't want to burden the atmosphere with still undecided
interfaces -- better to take some more time and get them right. OTOH, the sooner we get
them right, the better.

-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252