[ python-Bugs-1208304 ] urllib2's urlopen() method causes a memory leak

SourceForge.net noreply at sourceforge.net
Thu Jun 2 01:13:58 CEST 2005


Bugs item #1208304, was opened at 2005-05-25 05:20
Message generated for change (Comment added) made by akuchling
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1208304&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Extension Modules
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Petr Toman (manekcz)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib2's urlopen() method causes a memory leak

Initial Comment:
It seems that the urlopen(url) method of the urllib2 module 
leaves some undestroyable objects in memory.

Please try the following code:
==========================
if __name__ == '__main__':
  import urllib2
  a = urllib2.urlopen('http://www.google.com')
  del a # or a = None or del(a)
  
  # check for memory leaks
  import gc
  gc.set_debug(gc.DEBUG_SAVEALL)
  gc.collect()
  for it in gc.garbage:
    print it
==========================

In our code, we call urlopen() many times in a loop, and 
the number of unreachable objects grows without limit :) 
We also tried a.close(), but it didn't help.

You can also try the following:
==========================
def print_unreachable_len():
  # check for memory leaks: count the objects the collector found unreachable
  import gc
  gc.set_debug(gc.DEBUG_SAVEALL)
  gc.collect()
  return len(gc.garbage)
  
if __name__ == '__main__':
  print "at the beginning", print_unreachable_len()

  import urllib2
  print "after import of urllib2", print_unreachable_len()

  a = urllib2.urlopen('http://www.google.com')
  print 'after urllib2.urlopen', print_unreachable_len()

  del a
  print 'after del', print_unreachable_len()
==========================

We're using Windows XP with the latest patches and Python 2.4
(ActivePython 2.4 Build 243 (ActiveState Corp.) based on
Python 2.4 (#60, Nov 30 2004, 09:34:21) [MSC v.1310 
32 bit (Intel)] on win32).

----------------------------------------------------------------------

>Comment By: A.M. Kuchling (akuchling)
Date: 2005-06-01 19:13

Message:
Confirmed.  The objects involved seem to be an HTTPResponse and the 
socket._fileobject wrapper; the assignment 'r.recv=r.read' around line 1013 
of urllib2.py seems to be critical to creating the cycle.
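
The kind of cycle described above can be sketched in isolation, without 
urllib2 or a network connection. This is only an illustration under stated 
assumptions: FakeResponse and make_response are hypothetical names, not 
code from urllib2.py. Storing a bound method such as r.read on the instance 
it belongs to makes the object refer to itself, so reference counting alone 
cannot free it and it lingers until the cycle collector runs.

```python
import gc
import weakref


class FakeResponse(object):
    """Hypothetical stand-in for an HTTPResponse-like object."""

    def read(self, amt=None):
        return b""


def make_response():
    r = FakeResponse()
    # The bound method r.read holds a reference to r, and r's attribute
    # table now holds the bound method: r -> bound method -> r.
    r.recv = r.read
    # Return only a weak reference so the caller does not keep r alive.
    return weakref.ref(r)


gc.disable()  # keep automatic collection out of the way for the demo
try:
    ref = make_response()
    # No names refer to the object any more, yet the cycle keeps it alive.
    alive_before_collect = ref() is not None
    gc.collect()
    # An explicit collection finds the cycle and frees the object.
    alive_after_collect = ref() is not None
finally:
    gc.enable()

print("alive before gc.collect(): %s" % alive_before_collect)
print("alive after gc.collect():  %s" % alive_after_collect)
```

Note that this sketch does not set gc.DEBUG_SAVEALL; with that flag the 
collector keeps every unreachable object it finds in gc.garbage instead of 
freeing it.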


----------------------------------------------------------------------
