Bug report: memory leak in python 1.5.2

Fred L. Drake fdrake at cnri.reston.va.us
Wed May 26 17:12:16 EDT 1999


Michael P. Reilly writes:
 > There is also the situation where some UNIX systems put the environment
 > initially in the u area, and it is difficult to programmatically determine
 > where different runtime segments are (where is the heap vs. where is the
 > u area).
 > 
 > Fred, your solution should work because it takes the problem case: what
 > to do with the string initially, but I think it might be better to copy
 > the values at module initialization time.  I've included an addition to
 > Fred's patch to be called instead of the PyDict_New() function (in the
 > module init function).

  If I understand correctly, your patch avoids leaking the memory that
holds the initial environment (a fixed amount).  Is this correct?
  If so, I'm not sure it's worth the extra code.  My intention was to
avoid the Python-induced leak that would allow a long-running Python
script that occasionally created a subprocess to become a MemoryError
traceback.  ;-)  On systems without a lot of memory available, the
environment should be kept small to begin with (so the additional
data structures created by the startup code become more of a
problem).
  I don't think I've ever checked the size of the "typical" UNIX
environment; "printenv | wc -c" tells me I'm running under 2 KB in a
fresh shell.  Is that enough to worry about, and to justify slowing
down initialization?
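
  For anyone who wants the same number from inside Python, here is a
rough sketch that approximates "printenv | wc -c" by counting one
"KEY=value" line plus a newline per variable.  The function name
environment_size is made up for illustration, and encoding the
strings with os.fsencode() is an assumption about how the platform
stores them, not something I have measured.

```python
import os


def environment_size(env=None):
    """Approximate the byte count that `printenv | wc -c` would report."""
    if env is None:
        env = os.environ
    total = 0
    for key, value in env.items():
        # printenv writes "KEY=value\n": the two extra bytes are '=' and '\n'.
        total += len(os.fsencode(key)) + len(os.fsencode(value)) + 2
    return total


if __name__ == "__main__":
    print(environment_size(), "bytes in the current environment")
```

  Passing a mapping explicitly makes it easy to size a hypothetical
environment without touching the real one.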

 > Also, how should we deal with this in terms of C applications that might
 > change the environment?  (Embedders beware!)

  In this case, we may not clear all the possible garbage, but we only 
leak for keys that are:

     1.  Changed from Python at least once, then
     2.  Changed from C, and
     3.  Never changed from Python again.

  Note that only one copy of the variable gets leaked, not an infinite 
succession.
  In the case of two Python putenv() calls with C putenv() calls
in between, we don't introduce any new leaks; the effect is that the
data from the first Python putenv() isn't collected until the second
Python putenv().  This is acceptable.
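
  The cases above can be played through with a toy model.  This is
only a sketch, not the patch itself: it assumes the bookkeeping is a
per-key table holding the last string installed from Python, and the
names PutenvModel, python_putenv, c_putenv and stale are hypothetical.
Buffers are tagged with a serial number so that two strings with the
same text still count as separate allocations.

```python
class PutenvModel:
    """Toy model of an environment plus per-key bookkeeping of Python's buffers."""

    def __init__(self):
        self.environ = {}   # key -> buffer currently installed in the environment
        self.kept = {}      # key -> last buffer installed from Python
        self.freed = []     # buffers released by the bookkeeping, in order
        self._serial = 0

    def _new_buffer(self, key, value):
        # Each call stands for a fresh allocation of a "KEY=value" string.
        self._serial += 1
        return (self._serial, key + "=" + value)

    def python_putenv(self, key, value):
        buf = self._new_buffer(key, value)
        old = self.kept.get(key)
        self.environ[key] = buf
        self.kept[key] = buf
        if old is not None:
            # The previous Python-owned buffer for this key can now be released.
            self.freed.append(old)

    def c_putenv(self, key, value):
        # C code replaces the entry without telling the bookkeeping.
        self.environ[key] = self._new_buffer(key, value)

    def stale(self):
        """Buffers still held for Python that the environment no longer uses."""
        return [buf for key, buf in self.kept.items()
                if self.environ.get(key) != buf]


if __name__ == "__main__":
    model = PutenvModel()
    model.python_putenv("TZ", "UTC")   # 1. changed from Python
    model.c_putenv("TZ", "EST")        # 2. changed from C
    model.c_putenv("TZ", "CET")        #    more C changes add nothing
    print(len(model.stale()))          # one buffer held, not a succession
    model.python_putenv("TZ", "PST")   # a later Python change releases it
    print(len(model.stale()), len(model.freed))
```

  In the model, a key holds at most one stale buffer however many
times C changes it, and the next Python putenv() for that key lets
it go.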

 > From a programming standpoint, I don't think that it should be "proper"
 > to be changing the environment all that much.  Its purpose is to
 > propagate values to child processes, not to store runtime values.

  I agree.  Processes that run a lot of children, like HTTP servers
running CGI scripts, won't be using their own environment to do this,
but will create the desired environments on the fly.  (Especially if
they're threaded!)
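
  As a rough illustration of building the environment on the fly, the
sketch below hands each child its own dictionary through the env
argument of subprocess.run() and never touches os.environ.  The
helper names build_child_env and run_child are invented for this
example, and REQUEST_METHOD is used only as a familiar CGI-style
variable.

```python
import os
import subprocess
import sys


def build_child_env(overrides, base=None):
    """Return a new environment dict for a child; the parent's is not modified."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def run_child(argv, overrides):
    """Run argv with a per-child environment passed through env=."""
    return subprocess.run(
        argv,
        env=build_child_env(overrides),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    result = run_child(
        [sys.executable, "-c", "import os; print(os.environ['REQUEST_METHOD'])"],
        {"REQUEST_METHOD": "GET"},
    )
    print(result.stdout.strip())
```

  Because nothing is written to the parent's environment, there is
nothing for putenv() to leak, and threads cannot trip over each
other's settings.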


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives



