[Cython] Memory leak when using Typed Memory View and np array of objects
Haijie Gu
gu.haijie at gmail.com
Wed Oct 2 18:36:01 CEST 2013
Hi,
I'm new to Cython's typed memoryviews, and I've found some cases where a
function that uses a typed memoryview leaks memory.
The leak happens when I pass a numpy array of objects, where each object
is itself a numpy array. (You can get this 'weird' object by constructing
a pandas Series from a list of numpy arrays.)
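For reference, here is a minimal sketch of the kind of container I mean. It builds the object array with numpy alone instead of going through pandas, which is an assumption made only to keep the illustration small; the name `outer` is hypothetical and is not part of the attached code.

```python
import numpy as np

# An outer 1-D array with dtype=object; every element is itself
# a 1-D float64 numpy array of length 10.
n = 3
outer = np.empty(n, dtype=object)
for i in range(n):
    outer[i] = np.random.randn(10)

print(outer.dtype)      # dtype of the outer container
print(type(outer[0]))   # type of a single element
print(outer[0].dtype)   # dtype of a single element
```

Each element taken out of such a container is an ordinary array of doubles, which is what gets passed to the functions in the tests.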
Please see the following code snippet, or use the attached code, to
reproduce the case.
I appreciate any help and suggestions in advance! (I also posted this on the
cython-users Google group. Apologies for the redundancy.)
# BEGIN CONTENT OF test.pyx
# this does not leak
cpdef int do_nothing(arr):
    return 0

# this does leak
cpdef int do_nothing_typed(double[:] arr):
    return 0

# this does leak
cpdef int do_nothing_but_copy(arr):
    cdef double[:] _arr = arr
    return 0
# END CONTENT OF test.pyx
# BEGIN CONTENT OF runtest.py
# ... omit all the imports here
def gc_obj_hist():
    """
    Returns a sorted map from type to the counts
    of in memory objects with the type
    """
    hst = defaultdict(lambda: 0)
    for v in gc.get_objects():
        hst[type(v)] += 1
    l = sorted(hst.iteritems(), key=operator.itemgetter(1), reverse=True)
    return l

# NOT LEAK
def test1(n=10000):
    s = pd.Series([np.random.randn(10) for i in range(n)])
    for i in range(n):
        do_nothing(s[i])
    print "Top 5 object types after test 1: " + str(gc_obj_hist()[:5])

# LEAK
def test2(n=10000):
    s = pd.Series([np.random.randn(10) for i in range(n)])
    for i in range(n):
        do_nothing_typed(s[i])
    print "Top 5 object types after test 2: " + str(gc_obj_hist()[:5])

# LEAK
def test3(n=10000):
    s = pd.Series([np.random.randn(10) for i in range(n)])
    for i in range(n):
        do_nothing_but_copy(s[i])
    print "Top 5 object types after test 3: " + str(gc_obj_hist()[:5])

# NOT LEAK
def test4(n=10000):
    s = pd.Series([np.random.randn(10) for i in range(n)])
    for i in range(n):
        do_nothing_but_copy(np.array(s[i]))
    print "Top 5 object types after test 4: " + str(gc_obj_hist()[:5])

if __name__ == "__main__":
    n = 100000
    test1(n)
    test2(n)
    test3(n)
    test4(n)
# END CONTENT OF runtest.py
Thanks,
-jay
-------------- next part --------------
A non-text attachment was scrubbed...
Name: leaktest.tar.gz
Type: application/x-gzip
Size: 893 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20131002/de68939d/attachment-0001.bin>