[ python-Bugs-1497962 ] Leak in tarfile.py

SourceForge.net noreply at sourceforge.net
Mon Jun 19 04:21:43 CEST 2006


Bugs item #1497962, was opened at 05/30/06 23:42
Message generated for change (Settings changed) made by sf-robot
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1497962&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
Resolution: None
Priority: 5
Submitted By: Jens Jørgen Mortensen (jensj)
Assigned to: Nobody/Anonymous (nobody)
Summary: Leak in tarfile.py

Initial Comment:
There is a leak when using the tarfile module and the
extractfile method.  Here is a simple example:
 
$ echo "grrr" > x.txt
$ tar cf x.tar x.txt
$ python
Python 2.4.2 (#2, Sep 30 2005, 21:19:01)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu
4.0.1-4ubuntu8)] on linux2
Type "help", "copyright", "credits" or "license" for
more information.
>>> import gc
>>> import tarfile
>>> tar = tarfile.open('x.tar', 'r')
>>> f = tar.extractfile('x.txt')
>>> f.read()
'grrr\n'
>>> del f
>>> gc.set_debug(gc.DEBUG_LEAK)
>>> print gc.collect()
gc: collectable <ExFileObject 0xb73d4acc>
gc: collectable <dict 0xb73dcf0c>
gc: collectable <instancemethod 0xb7d2daf4>
3
>>> print gc.garbage
[<tarfile.ExFileObject object at 0xb73d4acc>, {'name':
'x.txt', 'read': <bound method ExFileObject._readnormal
of <tarfile.ExFileObject object at 0xb73d4acc>>, 'pos':
0L, 'fileobj': <open file 'x.tar', mode 'rb' at
0xb73e67b8>, 'mode': 'r', 'closed': False, 'offset':
512L, 'linebuffer': '', 'size': 5L}, <bound method
ExFileObject._readnormal of <tarfile.ExFileObject
object at 0xb73d4acc>>]
>>>


----------------------------------------------------------------------

>Comment By: SourceForge Robot (sf-robot)
Date: 06/18/06 19:21

Message:
Logged In: YES 
user_id=1312539

This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 06/01/06 16:09

Message:
Logged In: YES 
user_id=31435

There's no evidence of a leak here -- quite the contrary. 
As the docs say, DEBUG_LEAK implies DEBUG_SAVEALL, and
DEBUG_SAVEALL results in  _all_ cyclic trash getting
appended to gc.garbage.  If you don't mess with
gc.set_debug(), you'll discover that gc.garbage is empty at
the end.

In addition, note that the DEBUG_LEAK output plainly says:

gc: collectable ...

That's also telling you that it found collectable cyclic
trash (which it would have reclaimed had you not forced it
to get appended to gc.garbage instead).  If gc had found
uncollectable cycles, these msgs would have started with

gc: uncollectable ...

instead.
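
The difference between the two modes can be reproduced without tarfile.
The following is a minimal sketch written for Python 3 (the session above
is Python 2.4); Reader and collect_cycle are hypothetical names that only
mimic the self.read bound-method cycle, they are not tarfile code.  It uses
DEBUG_SAVEALL alone, rather than DEBUG_LEAK, to avoid the stderr output.

```python
import gc
import weakref


class Reader(object):
    """Hypothetical stand-in for ExFileObject (not the real class)."""

    def __init__(self):
        # self.read -> bound method -> self: a reference cycle.
        self.read = self._readnormal

    def _readnormal(self, size=None):
        return b"grrr\n"


def collect_cycle(debug_flags=0):
    """Create one Reader cycle, drop it, and report what gc did.

    Returns (alive_before, found, saved):
      alive_before: the object survived `del` (the cycle kept it alive)
      found:        number of unreachable objects gc.collect() reported
      saved:        number of objects left in gc.garbage afterwards
    """
    was_enabled = gc.isenabled()
    gc.disable()                  # no automatic collection mid-experiment
    gc.collect()                  # start from a clean slate
    del gc.garbage[:]
    gc.set_debug(debug_flags)
    try:
        obj = Reader()
        ref = weakref.ref(obj)
        del obj
        alive_before = ref() is not None
        found = gc.collect()
        saved = len(gc.garbage)
        return alive_before, found, saved
    finally:
        gc.set_debug(0)
        del gc.garbage[:]
        if was_enabled:
            gc.enable()
```

Calling collect_cycle() with no flags should leave gc.garbage empty, while
collect_cycle(gc.DEBUG_SAVEALL) should leave the same objects parked in
gc.garbage, which is the effect seen in the original report.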

Most directly, if I run your tarfile open() and file
extraction in an infinite loop (without messing with
gc.set_debug()), the process memory use does not grow over time.
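
A bounded version of that loop might look like the sketch below.  It is
written for Python 3, whose tarfile implementation differs from the 2.4
one discussed here, and it checks gc.garbage rather than process memory,
so it only illustrates the shape of the test.  make_tar_bytes and
extract_repeatedly are hypothetical helper names, and the archive is built
in memory instead of with the tar command.

```python
import gc
import io
import tarfile


def make_tar_bytes():
    """Build an in-memory archive holding x.txt with the content b'grrr\\n'."""
    data = b"grrr\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="x.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def extract_repeatedly(raw, rounds=200):
    """Open the archive and extract x.txt `rounds` times.

    Returns the length of gc.garbage after a final collection; with no
    debug flags set, uncollectable objects are the only way it can grow.
    """
    for _ in range(rounds):
        with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
            f = tar.extractfile("x.txt")
            if f.read() != b"grrr\n":
                raise ValueError("unexpected member content")
            del f
    gc.collect()
    return len(gc.garbage)
```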

Unless you have other evidence of an actual leak, this
report should be closed without action.  Yes, there are
reference cycles here, but they're of kinds cyclic gc reclaims.

----------------------------------------------------------------------

Comment By: Jens Jørgen Mortensen (jensj)
Date: 06/01/06 13:08

Message:
Logged In: YES 
user_id=716463

The problem is that the ExFileObject has an attribute
(self.read) that is a method bound to itself
(self._readsparse or self._readnormal), which creates a
reference cycle.  One solution is to add "del self.read" to
the close method, but someone might forget to close the
object and still get the leak.  Another solution is to
change the end of __init__ to:

  if tarinfo.issparse():
      self.sparse = tarinfo.sparse
  else:
      self.sparse = None

and add a read method:

  def read(self, size=None):
      if self.sparse is None:
          return self._readnormal(size)
      else:
          return self._readsparse(size)
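
Put together, the second suggestion could look like the sketch below.  It
is a self-contained illustration for Python 3, not a patch against
tarfile.py: DispatchingReader is a hypothetical class, the two _read*
bodies are placeholders, and the constructor takes the sparse value
directly instead of a tarinfo.  Because nothing stored on the instance
refers back to it, CPython's reference counting should free it as soon as
the last reference is dropped, without the cyclic collector.

```python
import weakref


class DispatchingReader(object):
    """Hypothetical reader that picks the read path at call time."""

    def __init__(self, sparse=None):
        # Store plain data only; no bound method is kept on self,
        # so the instance does not take part in a reference cycle.
        self.sparse = sparse

    def read(self, size=None):
        if self.sparse is None:
            return self._readnormal(size)
        return self._readsparse(size)

    def _readnormal(self, size=None):
        return b"normal"      # placeholder body

    def _readsparse(self, size=None):
        return b"sparse"      # placeholder body


def freed_by_refcount():
    """Return True if the object dies on `del`, with no gc.collect() call.

    This relies on CPython's reference counting; other implementations
    may free the object later.
    """
    obj = DispatchingReader()
    ref = weakref.ref(obj)
    del obj
    return ref() is None
```

The cost of this design is one extra attribute test per read() call, in
exchange for not creating a cycle in the first place.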


----------------------------------------------------------------------
