[pypy-commit] pypy default: write docs
fijal
pypy.commits at gmail.com
Thu Feb 15 08:06:23 EST 2018
Author: fijal
Branch:
Changeset: r93824:ed765f557481
Date: 2018-02-15 14:05 +0100
http://bitbucket.org/pypy/pypy/changeset/ed765f557481/
Log: write docs
diff --git a/pypy/doc/gc_info.rst b/pypy/doc/gc_info.rst
--- a/pypy/doc/gc_info.rst
+++ b/pypy/doc/gc_info.rst
@@ -5,8 +5,87 @@
Incminimark
-----------
+PyPy's default garbage collector is called incminimark - it's an incremental,
+generational moving collector. Here we hope to explain a bit how it works
+and how it can be tuned to suit the workload.
+
+Incminimark first allocates objects in the so-called *nursery*, a place for
+young objects where allocation is very cheap, being just a pointer bump. The
+nursery size is a crucial variable: depending on your workload (one or many
+processes) and cache sizes, you might want to experiment with it via the
+*PYPY_GC_NURSERY* environment variable. When the nursery is full, a minor
+collection is performed. Objects that are no longer reachable simply die,
+without any extra work, while surviving objects from the nursery are copied
+to the old generation: either into arenas, which are collections of objects
+of the same size, or, if they are big enough, into memory allocated directly
+with malloc.
+
+Since Incminimark is an incremental GC, the major collection is incremental,
+meaning there should not be any pauses longer than 1ms.
+
+There is a special function in the ``gc`` module called
+``get_stats(memory_pressure=False)``.
+
+``memory_pressure`` controls whether or not to report memory pressure from
+objects allocated outside of the GC, which requires walking the entire heap,
+so it's disabled by default due to its cost. Enable it when debugging
+mysterious memory disappearance.
+
+An example call looks like this::
+
+ >>> gc.get_stats(True)
+ Total memory consumed:
+ GC used: 4.2MB (peak: 4.2MB)
+ in arenas: 763.7kB
+ rawmalloced: 383.1kB
+ nursery: 3.1MB
+ raw assembler used: 0.0kB
+ memory pressure: 0.0kB
+ -----------------------------
+ Total: 4.2MB
+
+ Total memory allocated:
+ GC allocated: 4.5MB (peak: 4.5MB)
+ in arenas: 763.7kB
+ rawmalloced: 383.1kB
+ nursery: 3.1MB
+ raw assembler allocated: 0.0kB
+ memory pressure: 0.0kB
+ -----------------------------
+ Total: 4.5MB
+
+In this particular case, taken just after startup, the GC consumes relatively
+little memory, and even less memory is allocated but unused. When memory is
+highly fragmented, the "allocated" figure can be much higher than "used".
+Generally speaking, "peak" will more closely resemble the actual memory
+consumption as reported by RSS, since returning memory to the OS is a hard
+and unsolved problem.
+
+The details of various fields:
+
+* GC in arenas - small old objects held in arenas. If the amount allocated
+ is much higher than the amount used, there is a large fragmentation issue
+
+* GC rawmalloced - large objects allocated with malloc. If this does not
+ correspond to the amount of RSS very well, consider using jemalloc as opposed
+ to system malloc
+
+* nursery - amount of memory allocated for the nursery, fixed at startup,
+ controlled via an environment variable
+
+* raw assembler allocated - amount of assembler memory that the JIT feels
+ responsible for
+
+* memory pressure, if asked for - amount of memory we think got allocated
+ via external malloc (e.g. loading the cert store in SSL contexts) that is
+ kept alive by GC objects, but not accounted for in the GC
+
+
.. _minimark-environment-variables:
+Environment variables
+---------------------
+
PyPy's default ``incminimark`` garbage collector is configurable through
several environment variables:
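
The used-versus-allocated comparison described in the documentation above can be checked mechanically. The following is a minimal sketch, not part of the changeset: it parses a textual report in the format of the sample output and computes the ratio of "GC allocated" to "GC used". The helper names (``parse_size``, ``gc_totals``, ``allocated_to_used``) are hypothetical, and the 1024-based unit multipliers are an assumption about how the report is formatted.

```python
import re

# Multipliers for the unit suffixes seen in the sample report.
# Assumption: the report uses 1024-based units.
_UNITS = {"kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
_SIZE = re.compile(r"([0-9]+(?:\.[0-9]+)?)(kB|MB|GB)")


def parse_size(text):
    """Return the first size found in `text` as a number of bytes."""
    match = _SIZE.search(text)
    if match is None:
        raise ValueError("no size found in %r" % (text,))
    return float(match.group(1)) * _UNITS[match.group(2)]


def gc_totals(report):
    """Map the 'GC used' and 'GC allocated' lines of a report to bytes."""
    totals = {}
    for line in report.splitlines():
        label, sep, rest = line.strip().partition(":")
        if sep and label in ("GC used", "GC allocated"):
            # The first size on the line is the current value, not the peak.
            totals[label] = parse_size(rest)
    return totals


def allocated_to_used(report):
    """Ratio of allocated to used GC memory; values well above 1 suggest fragmentation."""
    totals = gc_totals(report)
    return totals["GC allocated"] / totals["GC used"]


# Excerpt of the sample output quoted in the documentation.
SAMPLE = """\
Total memory consumed:
 GC used: 4.2MB (peak: 4.2MB)
 in arenas: 763.7kB
 -----------------------------
 Total: 4.2MB

Total memory allocated:
 GC allocated: 4.5MB (peak: 4.5MB)
 in arenas: 763.7kB
 -----------------------------
 Total: 4.5MB
"""

if __name__ == "__main__":
    print("allocated/used ratio: %.2f" % allocated_to_used(SAMPLE))
```

On PyPy, the report text would presumably come from ``repr(gc.get_stats())``, assuming the representation matches the sample shown in the documentation. A ratio close to 1, as in the startup sample, means little memory is allocated but unused.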