Tracemalloc overhead when profiling

Juris __ dev_20192019 at outlook.com
Mon Jan 14 11:19:45 EST 2019


Hi,

I was looking for a way to profile memory usage for a script that 
deals with log message parsing. Looking through Python's stdlib I 
stumbled upon the tracemalloc module, so I tried my hand at profiling 
my script. I noticed a few things that I am not 100% sure I can explain.

Tracemalloc's memory overhead while tracing seems to be somewhere 
around 3x-4x. Is that expected? Here is a simple example that 
demonstrates the behavior:

---8<---
# memprof.py
import tracemalloc

def expensive():
    return [str(x) for x in range(1_000_000)]

if __name__ == '__main__':

    if not tracemalloc.is_tracing():
        tracemalloc.start()

    snapshot1 = tracemalloc.take_snapshot()

    _ = expensive()

    snapshot2 = tracemalloc.take_snapshot()
    tracemalloc.stop()

    for stat in snapshot2.compare_to(snapshot1, key_type="lineno"):
        print(stat)
---8<---


Script output, with naive profiling via the GNU time program:

$ /usr/bin/time python3.7 memprof.py
memprof.py:6: size=60.6 MiB (+60.6 MiB), count=1000001 (+1000001), 
average=64 B
...snip...
1.40user 0.10system 0:01.51elapsed 99%CPU (0avgtext+0avgdata 
280284maxresident)k
0inputs+0outputs (0major+62801minor)pagefaults 0swaps


The same script, but without tracemalloc actually tracing:

$ /usr/bin/time python3.7 memprof.py
0.26user 0.03system 0:00.29elapsed 100%CPU (0avgtext+0avgdata 
72316maxresident)k
0inputs+0outputs (0major+17046minor)pagefaults 0swaps
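
(To rule out /usr/bin/time quirks, I suppose the peak RSS could also 
be checked from inside the process with the stdlib resource module; a 
minimal untested sketch, with rss_check.py being just my name for it:

---8<---
# rss_check.py -- hypothetical cross-check, not part of the script above
import resource
import tracemalloc

def expensive():
    return [str(x) for x in range(1_000_000)]

if __name__ == '__main__':
    tracemalloc.start()
    _ = expensive()
    tracemalloc.stop()
    # On Linux, ru_maxrss is reported in kilobytes.
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print(f"peak RSS: {peak_kib / 1024:.1f} MiB")
---8<---

I would expect this to roughly agree with the maxresident figure.)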


So, when tracemalloc is not tracing, the memory used by the script is 
about 72 MiB (credible, since tracemalloc reports 60.6 MiB allocated 
in the hot spot). But when tracemalloc is tracing, the script uses 
almost 4x as much memory, i.e. about 280 MiB.
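
If I read the docs right, tracemalloc.get_tracemalloc_memory() reports 
the memory tracemalloc itself uses to store the traces, so I guess the 
extra usage could be confirmed with something like this (untested 
sketch) while tracing is active:

---8<---
import tracemalloc

tracemalloc.start()
data = [str(x) for x in range(1_000_000)]
# Ask tracemalloc how much memory it uses itself to
# store the collected traces (in bytes).
overhead = tracemalloc.get_tracemalloc_memory()
print(f"tracemalloc's own memory: {overhead / 2**20:.1f} MiB")
tracemalloc.stop()
---8<---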

Is this expected? Are there any other tools for memory profiling you 
can recommend?

Running Python 3.7.2 on an x86_64 Linux system.

BR,
Juris
