[Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!

Victor Stinner victor.stinner at gmail.com
Thu Oct 24 14:52:52 CEST 2013


2013/10/24 Kristján Valur Jónsson <kristjan at ccpgames.com>:
>> Test 1. With the Python test suite, 467,738 traces limited to 1 frame:
> ...
>> I'm surprised: it's faster than the benchmark I ran some weeks ago.
>> Maybe I optimized something? The most critical operation, taking a snapshot,
>> takes half a second, so it's efficient enough.
>
> Well, to me anything that happens in under a second is fast :)

In a previous version, taking a snapshot and computing the "top 10"
took up to 30 seconds. So it was justified to only store "stats"
(which is much smaller).
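
To give an idea of the operation being timed here, a minimal sketch of
taking a snapshot and computing a "top 10". It assumes the tracemalloc
module as it later shipped in the standard library (Python 3.4+), where
the grouping call is Snapshot.statistics('lineno') rather than the
draft's Snapshot.group_by('line'). The top_allocations() helper and the
sample workload are only illustrative.

```python
import tracemalloc


def top_allocations(snapshot, limit=10):
    """Return the `limit` biggest allocation sites, grouped by file and line.

    Illustrative helper, not part of tracemalloc itself.
    """
    # statistics() returns Statistic objects sorted from biggest to smallest.
    return snapshot.statistics("lineno")[:limit]


tracemalloc.start(1)  # store 1 frame per trace, as in "Test 1"
data = [bytes(1000) for _ in range(1000)]  # sample workload to trace
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

top10 = top_allocations(snapshot)
for stat in top10:
    print(stat)
```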

> (...) To do this, you need a top-down view of the application.  You
> need to break it down from the "main" call down towards the leaves.

We don't have to agree: I may add an option to decide which frames are
stored. I have to check how truncated tracebacks would fit with the
current API, especially with Snapshot.group_by('line').

Which API do you propose to decide which frames are kept (most recent
frames or oldest frames)? It may be a new parameter of
set_traceback_limit() for example.

I have an idea how to implement it in C. But I'm not convinced yet
that there is a need for it :-)
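
The two choices discussed above (keep the most recent frames or keep
the oldest frames) can be illustrated in pure Python. This is only a
sketch: truncate_traceback() and its "keep" parameter are hypothetical
names, not an existing or proposed tracemalloc API, and the frames are
assumed to be ordered from the most recent call to the oldest.

```python
def truncate_traceback(frames, limit, keep="recent"):
    """Truncate a traceback to at most `limit` frames.

    `frames` is a sequence of (filename, lineno) pairs, assumed to be
    ordered from the most recent call to the oldest one.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if keep == "recent":
        # Bottom-up view: keep the frames closest to the allocation.
        return tuple(frames[:limit])
    if keep == "oldest":
        # Top-down view: keep the frames closest to "main".
        return tuple(frames[-limit:])
    raise ValueError("keep must be 'recent' or 'oldest'")


full = (("leaf.py", 30), ("middle.py", 20), ("main.py", 10))
print(truncate_traceback(full, 1))                 # frame of the allocation
print(truncate_traceback(full, 1, keep="oldest"))  # frame of "main"
```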

> Anyway, this is not so important.  I would run this with full traceback myself and truncate
> the tracebacks during the display stage anyway.

FYI tracemalloc currently has an arbitrary limit of 100 frames for a
technical reason: traces are stored on the stack in some C functions.

It should be possible to drop the limitation (without killing
performance). But storing 100 frames is already very slow.
Performance depends directly on the number of frames, because the
whole traceback is compared at each memory allocation to check in a
hash table whether the traceback is new or already known.
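
The lookup described above can be sketched in Python. This is not the C
implementation, only an illustration of the idea under the assumption
that a traceback is a tuple of (filename, lineno) frames; the
TracebackTable class and make_traceback() helper are hypothetical.

```python
class TracebackTable:
    """Keep a single shared copy of each distinct traceback."""

    def __init__(self):
        self._known = {}

    def intern(self, traceback):
        # Hashing the tuple, and comparing it on a hash match, walks
        # every frame: the cost grows with the number of frames stored.
        return self._known.setdefault(traceback, traceback)

    def __len__(self):
        return len(self._known)


def make_traceback(lineno):
    # Build a fresh tuple at run time, as each allocation would.
    return tuple([("app.py", lineno), ("main.py", 3)])


table = TracebackTable()
first = table.intern(make_traceback(12))
second = table.intern(make_traceback(12))  # known: the stored copy is reused
other = table.intern(make_traceback(99))   # new: added to the table
print(len(table), first is second)
```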

>>     @staticmethod
>>     def load(filename, traces=True):
>>         with open(filename, "rb") as fp:
>>             return pickle.load(fp)
>>
>
> What does the "traces" argument do in the load() function then?

A ghost from the past :-) I will remove it.
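
Without the unused argument, loading is just the mirror of dumping. A
minimal sketch of the round trip, using plain functions and a plain
dict as a stand-in for a snapshot; dump_traces(), load_traces() and the
sample data are illustrative names, not the tracemalloc API.

```python
import os
import pickle
import tempfile


def dump_traces(traces, filename):
    """Write the traces to `filename` with pickle."""
    with open(filename, "wb") as fp:
        pickle.dump(traces, fp, pickle.HIGHEST_PROTOCOL)


def load_traces(filename):
    """Read traces back; no extra argument is needed.

    pickle can execute arbitrary code: only load files you trust.
    """
    with open(filename, "rb") as fp:
        return pickle.load(fp)


# Stand-in data: traceback -> allocated size in bytes.
traces = {(("app.py", 12),): 4096, (("lib.py", 7),): 128}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "snapshot.pickle")
    dump_traces(traces, path)
    restored = load_traces(path)

print(restored == traces)
```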

> Often, yes.  But there are big black boxes that remain.  The most numerous
> of those are those big mysterious allocations that can happen as a
> result of
> "import mymodule"

Yes, tracemalloc is not perfect. You may, for example, find a huge
allocation of 768 KB in a random "import module": it's the dictionary
of interned Unicode strings... It's hard to understand why "import
module" allocated 768 KB (and depending on the file modification time,
the allocation may occur somewhere else...).

> Not that we are likely to change PEP 445 at this stage, but this was the use
> case for my suggestion.

In tracemalloc, a traceback is a tuple of (str, int): you can probably
hack the malloc API and tracemalloc to add the C filename and line
number. (But I don't want to do it, because the Python traceback is
enough in my opinion.)

Victor

