This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Massive memory leak in array module
Type: Stage:
Components: Extension Modules Versions: Python 2.3
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: nnorwitz, rhettinger, skip.montanaro, tim.peters
Priority: normal Keywords:

Created on 2003-08-03 16:15 by tim.peters, last changed 2022-04-10 16:10 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
array.diff rhettinger, 2003-08-04 06:02 Diff to arraymodule.c
Messages (8)
msg17588 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2003-08-03 16:15
Simplified variant of a test program from c.l.py:

.import array
.
.def test():    
.    for x in xrange(1000000):
.        b = array.array('B', range(120, 120+64))
.
.test()

On Win98 under 2.3, it zooms to 200MB in a few 
seconds.  No growth under 2.2.3 (it stays under 2MB 
until it finishes).

Problem goes away if

.        b = range(120, 120+64)

instead.  Problem also goes away if

.        b = array.array('B', range(64))

instead.  I expect (but ave  not confirmed) that in the 
latter cases it's leaking refcounts (to the immortal "little 
integers").
msg17589 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2003-08-03 17:50
Logged In: YES 
user_id=33168

I think the problem may be due to the change on line 1760 of
arraymodule from PyList_GetItem (2.2) to PySequence_GetItem
(2.3) in 2.88 by Raymond.  

Adding a Py_XDECREF(v) at line 1765 "fixes" the problem, but
I don't think that's necessarily the correct solution for
all types.
msg17590 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2003-08-03 23:02
Logged In: YES 
user_id=44345

Added a test case to test_array.py (1.25)
msg17591 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2003-08-03 23:21
Logged In: YES 
user_id=31435

Neal, yup, that's the leak.  PySequence_GetItem() either 
returns NULL or a new reference, and in the latter case the 
result must always be decrefed; PyList_GetItem() returned a 
borrowed reference (or NULL).  The code should also check 
PySequence_GetItem() for an error (NULL) return, although 
the laziness of not checking for that was probably inherited 
from earlier code.

Skip, tests that use a refcount-revealing function should be 
conditionalized on hasattr(sys, 'name_of_function'), to avoid 
creating headaches for Jython testing.  You shouldn't check 
in failing tests independent of fixes:  the intent is that 
someone getting a fresh checkout should never see a test 
failure.

Leaving assigned to Raymond for now.
msg17592 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2003-08-03 23:32
Logged In: YES 
user_id=44345

Sorry - thought I was saving Raymond about 30 seconds of
work.  I'll remember in the future.  Also, I checked in a version
with the required hasattr() check.
msg17593 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-08-04 06:01
Logged In: YES 
user_id=80475

Attaching a patch for a second review.

The refcnt difference between Py_SequenceGetItem and 
Py_ListGetItem bit me in the behind.  Added the DECREF 
where Neal suggested but used a regular DECREF instead of 
an XDECREF because it is preceded by a guard.

The question then arose as to where the decreffing would 
affect the subsequent code, so I took another look at the two 
following sections.  It appears to me (and I really could use a 
second confirmation on this) that the String and UniCode 
sections were unnecessarily recopying the initializer when 
the length was greater than zero.  So, I changed the "if" to 
an "else if" to preclude the double copy and to resolve the 
question about the validity of the decref.

Also, I added the NULL check for the getitem as Tim 
suggested.

With the patch, all test cases (including Skip's test) and the 
OP's code run fine.  
msg17594 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2003-08-04 21:34
Logged In: YES 
user_id=33168

Patch looks good to me.  Do you know if there were any other
changes from Py_ListGetItem to Py_SequenceGetItem?
msg17595 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-08-05 11:25
Logged In: YES 
user_id=80475

Committed as Modules/arraymodule.c 2.91

I don't recall any other changes from Py_ListGetItem to 
PySequence_GetItem but will scan for them this weekend.
History
Date User Action Args
2022-04-10 16:10:27adminsetgithub: 39000
2003-08-03 16:15:18tim.peterscreate