circular references?

Roy Smith roy at popmail.med.nyu.edu
Sat Dec 18 10:55:53 EST 1999


dj trombley <badzen at yifan.net> wrote:
> Introspection: That all depends on what one views as 'bad'. =)

In this case, bad means I spend two days being in a really grumpy mood because 
my program isn't working and I can't figure out why :-)
 
> What it _does_ actually do is prevent the garbage collector from
> cleaning up the memory, because there will always be a valid reference
> chain.

OK, I understand how the circular pointer chain will cause a memory leak.  I 
suppose this is bad, but it doesn't explain what's going on, since the failures 
occur after just a very small number of object creations, long before memory 
exhaustion could possibly be a problem.

Here's what's going on.  I've got a web server, using a subclass of 
BaseHTTPServer.  It starts out getting called from a CGI script, but it forks to 
detach itself and run in the background listening on a new port.  This child 
process logs into an Oracle database using oracledb.  The overall application is 
a web front end to the database.  It sounds complicated (and it is), but I've 
gotten that part to work.  Or at least I thought I had until I made the most 
recent changes :-)

I have a page class which takes care of creating most of a generic page of HTML, 
and subclasses of that for each specific page type.  Each time my server 
responds to an HTTP request, it creates a new object of one of the page 
subclasses, then calls that object's show() method, which will typically perform 
some database access and produce HTML output.

My latest page subclass utilizes a data object to do some of the lower level 
work.  The page subclass looks essentially like this, where main_content is a 
callback function from an ancestor class's show() method:

class new_domain (netdbpage):
    def main_content (self):
        domain = db_record.domain (self, self.handler.form['f_hostname'][0], 1)
        self.record_set = [domain]
        self.display_record_set ()
        self.add_menu_command ('save')
        self.add_menu_command ('cancel')

Inherited from the netdbpage superclass is:

    def display_record_set (self):
        i = 0
        for record in self.record_set:
            record.display(i, self)
            i = i + 1

and my domain class looks like the following.  The db_record superclass doesn't 
do anything for now, but it's a placeholder; eventually there will be a number 
of different subclasses, with some shared functionality.

class db_record:
    pass

class domain (db_record):
    def __init__ (self, page, name, new = 0):
        name = string.lower (name)
        cur = page.handler.server.db.cursor()
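        if (new):
            # new record: take an id from the sequence and insert the row
            cur.execute ("select id_seq.nextval from dual")
            id = cur.fetchone()[0]
            cur.execute ("insert into domain (id, name) values (:1, :2)",
                         [id, name])
            cur.execute ("select id, name, comments from domain where id = :1", [id])
        else:
            # existing record: look it up by name (an assumption; only the
            # new = 1 path is exercised in this post)
            cur.execute ("select id, name, comments from domain where name = :1", [name])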
        id, name, comments = cur.fetchone()
        cur.close()
        self.data = {}
        self.data["id"] = id
        self.data["name"] = name
        self.data["comments"] = comments

    def display (self, n, page):
        page.put ('<h1>%d</h1>\n' % n)

The put method that's called is in one of netdbpage's ancestor classes:

    def put (self, s):
        self.handler.wfile.write (s)

As shown above, this works.  But it's ugly passing the "page" argument into 
each method of domain that needs it, just so domain can reference page's put() 
method (and a few other things).  The initial plan was to have domain.__init__ 
store a reference to page in self.page, and then things like domain.display 
could take one less argument and call self.page.put() to produce output.  This, 
of course, is the circular reference: the page object contains a link to the 
domain object, and the domain object contains a link back to the page.  That's 
where things start to go bad.  If I do that, I get no output.  I know my put() 
function gets called, because if I change it to:

    def put (self, s):
        self.handler.wfile.write (s)     # wfile connected to network socket
        sys.stderr.write (s)

The strings all show up in stderr (which is connected to a log file) just like 
they should.  If I change display to be any of:

    def display (self, n, page):
        self.page = page                     # circular reference
        self.page.put ('<h1>%d</h1>\n' % n)

    def display (self, n, page):
        self.page = page                     # circular reference
        page.put ('<h1>%d</h1>\n' % n)       # but I use the local copy

    def display (self, n, page):
        self.foo = page                      # member name "page" is not magic
        page.put ('<h1>%d</h1>\n' % n)

I get no output to my browser (even though I know put() gets called because I 
can see the stderr output).  If I make it:

    def display (self, n, page):
        self.page = page
        self.page.put ('<h1>%d</h1>\n' % n)
        self.page = None

then I get the output I should.  Clearly the circular reference is doing 
something worse than simply causing a memory leak.  It's almost as if, somewhere 
deeper in the file code than I can see, a copy is being made of a file 
descriptor or buffer or something like that, and if I have the circular link, my 
output goes to the wrong copy, which isn't connected to anything I can see.  The 
fact that breaking the circle *after* the call to put() fixes things makes me 
suspect output buffering.
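
One way to probe that suspicion, offered as a sketch and not a diagnosis: in 
a reference-counted interpreter, anything that is flushed or closed only when 
an object is destroyed will not be flushed or closed while a cycle keeps that 
object alive.  The Handler, Page and Record classes below are hypothetical 
stand-ins (not the real BaseHTTPServer or netdbpage classes), the code assumes 
CPython, and the gc module is switched off to mimic an interpreter without a 
cycle collector.

```python
import gc

class Handler:
    """Hypothetical stand-in for a request handler; logs its own destruction."""
    def __init__(self, log):
        self.log = log

    def __del__(self):
        # In the real server this is where a socket or file might get closed.
        self.log.append("handler released")

class Page:
    def __init__(self, handler):
        self.handler = handler
        self.record_set = []

class Record:
    pass

def serve(log, make_cycle):
    """Build a page with one record, optionally linking the record back."""
    page = Page(Handler(log))
    record = Record()
    page.record_set.append(record)
    if make_cycle:
        record.page = page          # page -> record -> page
    # page and record go out of scope when this function returns

was_enabled = gc.isenabled()
gc.disable()                        # rely on reference counting alone
try:
    log_plain = []
    serve(log_plain, make_cycle=False)
    released_without_cycle = list(log_plain)

    log_cycle = []
    serve(log_cycle, make_cycle=True)
    released_with_cycle = list(log_cycle)

    gc.collect()                    # explicitly break the cycle
    released_after_collect = list(log_cycle)
finally:
    if was_enabled:
        gc.enable()

print(released_without_cycle, released_with_cycle, released_after_collect)
```

Without the back-link the handler is released as soon as serve() returns; with 
it, nothing is released until the cycle is broken.  Whether that is what 
happens to the socket in this server is an open question, but it would be 
consistent with the "self.page = None" version working.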

Also, even after this fails, I can continue to access my server through a 
browser and use other functions, which create additional objects.  This says to 
me that memory exhaustion is not the issue.  Obviously, I will need to address 
the memory leak, because it *will* become a problem at some point, but I 
haven't hit that wall yet.
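
For the leak itself, one possible approach is to make the back-link weak, so 
that the record can reach the page without keeping it alive.  The sketch below 
assumes a Python version that ships the standard weakref module; Page and 
Record are again hypothetical stand-ins, and put() just collects strings 
instead of writing to a socket.

```python
import weakref

class Page:
    def __init__(self):
        self.record_set = []
        self.output = []

    def put(self, s):
        self.output.append(s)       # stand-in for handler.wfile.write(s)

class Record:
    def __init__(self, page):
        # Weak back-reference: does not add to the page's reference count.
        self._page = weakref.ref(page)

    def display(self, n):
        page = self._page()         # returns None once the page is gone
        if page is None:
            raise RuntimeError("page no longer exists")
        page.put('<h1>%d</h1>\n' % n)

page = Page()
page.record_set.append(Record(page))
for i, record in enumerate(page.record_set):
    record.display(i)
rendered = "".join(page.output)

probe = weakref.ref(page)
del page                            # last strong reference to the page
page_released = probe() is None     # true under CPython reference counting
```

With this arrangement display() no longer needs the page argument, and 
dropping the page is enough to free it, at the cost of a None check whenever 
the record uses the page.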

So, does that fit your definition of "bad" :-)


