[Python-Dev] Add a new tracemalloc module to trace memory allocations

Victor Stinner victor.stinner at gmail.com
Sun Sep 1 00:24:30 CEST 2013


On 31 August 2013 19:09, "Gregory P. Smith" <greg at krypto.org> wrote:
> First, I really like this.  +1

Current votes: +3 (I also want tracemalloc!). Nobody has opposed the
addition so far.

> We should be consistent with faulthandler's options. Why do you not want
> to support both the env var and enable()/disable() functions?

The reason was technical: I have issues with enabling tracemalloc before
Python is fully initialized. Enabling tracemalloc once Python is almost
initialized and disabling it before Python exits is more reliable.

In the latest implementation, enable() and disable() are back.
PYTHONTRACEMALLOC is still available. I will also add a -X tracemalloc
command line option.
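
To make the two start-up switches concrete, here is a small sketch that
launches child interpreters and asks each one whether tracing is active. It
assumes the tracemalloc module as it later shipped in the standard library
(in particular tracemalloc.is_tracing()), which may differ from the draft
discussed here. The helper name is_tracing_in_child is hypothetical.

```python
import os
import subprocess
import sys

# Code run in the child: report whether tracemalloc is already tracing.
CHECK = "import tracemalloc; print(tracemalloc.is_tracing())"


def is_tracing_in_child(extra_args=(), extra_env=None):
    """Hypothetical helper: run a child Python and return its tracing state."""
    env = dict(os.environ)
    env.pop("PYTHONTRACEMALLOC", None)  # start from a clean environment
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", CHECK],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


via_env = is_tracing_in_child(extra_env={"PYTHONTRACEMALLOC": "1"})
via_flag = is_tracing_in_child(extra_args=["-X", "tracemalloc"])
baseline = is_tracing_in_child()
print(via_env, via_flag, baseline)
```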

> Users are likely to want snapshots captured by enable()/disable() around
> particular pieces of code just as much as whole program information.

enable()/disable() are useful for tests of the tracemalloc module!
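
A rough sketch of tracing around one particular piece of code follows. It
assumes the API of the tracemalloc module as it later shipped in the standard
library, where the functions are named start() and stop() rather than
enable() and disable(). build_table is a hypothetical workload.

```python
import tracemalloc


def build_table(n):
    # Hypothetical workload: allocate many small strings.
    return [str(i) * 10 for i in range(n)]


tracemalloc.start()                       # begin tracing allocations
before = tracemalloc.take_snapshot()
table = build_table(10_000)
after = tracemalloc.take_snapshot()
tracemalloc.stop()                        # stop tracing, drop stored traces

# Differences between the two snapshots, grouped by file and line.
top = after.compare_to(before, "lineno")
for stat in top[:3]:
    print(stat)
```

Comparing two snapshots limits the report to what was allocated between
them, which is the "particular pieces of code" use case from the quote.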

> Taking that further: file and line information is great, but what if you
> extend the concept: could you allow for C API or even Python hooks to
> gather additional information at the time of each allocation or free? for
> example: Gathering the actual C and Python stack traces for correlation to
> figure out what call patterns lead allocations is powerful.

There is no portable function to retrieve the C traceback.

For the Python traceback: it may be possible to get it, but you have to
remember that the hook is on PyMem_Malloc(), a low-level memory allocator.
The GIL is held. It is possible to call Python code in some cases, but there
are corner cases where it would lead to a loop, a deadlock or worse.

I prefer to only read the available Python data without calling any Python
code, and to try to write a reliable debug tool. I still have some issues,
such as reentrant calls to the hook, but I'm working on them.

A nice improvement compared to the implementation on PyPI would be to hook
PyMem_RawMalloc() as well, to also trace the memory used by gzip, bz2, lzma
and other C modules. But taking the GIL in PyMem_RawMalloc() to get the
filename, and locking tracemalloc's internal structures, also causes new
issues (like a funny deadlock related to subinterpreters).

> (Yes, this gets messy fast as hooks should not trigger calls back into
> themselves when they allocate or free, similar to the "fun" involved in
> writing coverage tools)

tracemalloc uses a "reentrant" variable to do nothing on a reentrant call
to the hook.
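
The idea of the "reentrant" variable can be illustrated with a toy model in
pure Python. This is only a sketch of the guard pattern, not the C code of
tracemalloc: the TracingAllocator class, its fake addresses and its counters
are all hypothetical, and a real implementation would also have to consider
threads.

```python
class TracingAllocator:
    """Toy allocator whose tracing hook itself needs to allocate memory."""

    def __init__(self):
        self.reentrant = False   # True while the hook is already running
        self.traces = {}         # fake address -> requested size
        self.skipped = 0         # allocations ignored by the guard
        self._next_address = 1

    def _raw_alloc(self, size):
        # Stand-in for the real allocator: hand out fake addresses.
        address = self._next_address
        self._next_address += 1
        return address

    def malloc(self, size):
        address = self._raw_alloc(size)
        if self.reentrant:
            # Called from inside the hook: do nothing, avoid recursion.
            self.skipped += 1
            return address
        self.reentrant = True
        try:
            # Recording a trace needs memory, so the hook calls malloc().
            self.malloc(32)
            self.traces[address] = size
        finally:
            self.reentrant = False
        return address


alloc = TracingAllocator()
alloc.malloc(100)
alloc.malloc(200)
print(alloc.traces, alloc.skipped)
```

Without the flag, the nested malloc() call would try to record itself and
recurse until the stack overflows.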

> let me know if you think i'm crazy. :)

You are not crazy. It is just hard to implement. I'm not sure that it is
possible.

Victor