[Python-checkins] peps: PEP 454: rationale

victor.stinner python-checkins at python.org
Tue Oct 22 13:58:28 CEST 2013


http://hg.python.org/peps/rev/efe592075b52
changeset:   5204:efe592075b52
user:        Victor Stinner <victor.stinner at gmail.com>
date:        Tue Oct 22 13:57:53 2013 +0200
summary:
  PEP 454: rationale

files:
  pep-0454.txt |  34 ++++++++++++++++++++++++----------
  1 files changed, 24 insertions(+), 10 deletions(-)


diff --git a/pep-0454.txt b/pep-0454.txt
--- a/pep-0454.txt
+++ b/pep-0454.txt
@@ -20,18 +20,23 @@
 Rationale
 =========
 
-Common debug tools tracing memory allocations record the C filename
-and line number where the allocation occurs.  Using such tools to
-analyze Python memory allocations does not help because most memory
-blocks are allocated in the same C function, in ``PyMem_Malloc()`` for
-example.
+Classic generic tools like Valgrind can get the C traceback where a
+memory block was allocated. Using such tools to analyze Python memory
+allocations does not help because most memory blocks are allocated in
+the same C function, in ``PyMem_Malloc()`` for example. Moreover, Python
+has an allocator for small objects called "pymalloc" which keeps free
+blocks for efficiency. This is not well handled by these tools.
 
 There are debug tools dedicated to the Python language like ``Heapy``,
-and ``PySizer``. These tools analyze objects type and/or content.
-They are useful when most memory leaks are instances of the same type
-and this type is only instantiated in a few functions. Problems arise
-when the object type is very common like ``str`` or ``tuple``, and it
-is hard to identify where these objects are instantiated.
+``Pympler`` and ``Meliae`` which list all live objects using the
+garbage collector module (functions like ``gc.get_objects()``,
+``gc.get_referrers()`` and ``gc.get_referents()``), compute their size
+(e.g. using ``sys.getsizeof()``) and group objects by type. These tools
+provide a better estimate of the memory usage of an application.  They
+are useful when most memory leaks are instances of the same type and
+this type is only instantiated in a few functions. Problems arise when
+the object type is very common like ``str`` or ``tuple``, and it is hard
+to identify where these objects are instantiated.
 
 Finding reference cycles is also a difficult problem.  There are
 different tools to draw a diagram of all references.  These tools
@@ -63,6 +68,15 @@
 `documentation of the faulthandler module
 <http://docs.python.org/3/library/faulthandler.html>`_.
 
+The idea of tracing memory allocations is not new. It was first
+implemented in the PySizer project in 2005. PySizer was implemented
+differently: the traceback was stored in frame objects and some Python
+types were linked to the trace with the name of the object type. The
+PySizer patch on CPython adds overhead to performance and memory
+footprint, even when PySizer is not used. tracemalloc attaches a
+traceback to the underlying layer, to memory blocks, and has no overhead
+when the module is disabled.
+
 The tracemalloc module has been written for CPython. Other
 implementations of Python may not be able to provide it.
 

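For readers of this changeset, the two approaches contrasted in the
rationale can be sketched roughly as follows. This is an illustration
only, not part of the PEP text: it assumes the ``tracemalloc`` module as
shipped in the standard library since Python 3.4, whose API may differ
from the draft at this revision, and the helper names ``size_by_type``,
``top_allocation_sites`` and ``make_strings`` are hypothetical.

```python
import gc
import sys
import tracemalloc
from collections import Counter


def size_by_type(limit=5):
    """Group gc-tracked live objects by type name and sum their sizes.

    Note: gc.get_objects() only returns objects tracked by the garbage
    collector (containers), so plain str objects are not listed directly.
    """
    totals = Counter()
    for obj in gc.get_objects():
        # The default of 0 avoids a TypeError for objects without a size.
        totals[type(obj).__name__] += sys.getsizeof(obj, 0)
    return totals.most_common(limit)


def top_allocation_sites(workload, limit=5):
    """Run workload() under tracemalloc; return the largest sites by line."""
    tracemalloc.start()
    try:
        result = workload()  # keep the result alive until the snapshot
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    return snapshot.statistics("lineno")[:limit]


def make_strings():
    # Hypothetical workload: many str objects created on a single line.
    return [str(i) * 10 for i in range(10000)]


if __name__ == "__main__":
    for name, size in size_by_type():
        print(f"{name:20} {size} bytes")
    for stat in top_allocation_sites(make_strings):
        print(stat)
```

The first helper answers "which types use memory", which is of limited
help for common types such as ``str``; the second reports the Python
source line where the blocks were allocated.
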
-- 
Repository URL: http://hg.python.org/peps

