[pypy-dev] Pypy garbage collection

Maciej Fijalkowski fijall at gmail.com
Mon Mar 17 17:39:13 CET 2014


not sure how much more we can help without looking into the code

On Mon, Mar 17, 2014 at 6:05 PM, Martin Koch <mak at issuu.com> wrote:
> Thanks :)
>
> /Martin
>
>
>> On 17/03/2014, at 16.41, Maciej Fijalkowski <fijall at gmail.com> wrote:
>>
>> ok.
>>
>> so, as you can probably see, the max is not that big, which means the
>> GC really is incremental. What happens is that every now and then you
>> get tons of garbage that survives minor collection. I don't know
>> exactly why, but you should look at what objects might be surviving
>> for too long.
>>
>>> On Mon, Mar 17, 2014 at 5:37 PM, Martin Koch <mak at issuu.com> wrote:
>>> Ah - it just occurred to me that the first runs may be slow anyway:
>>> since we take the average of the last 100 runs as the benchmark, the
>>> first 100 runs are never classified as slow. Indeed, the first three
>>> runs with many collections are among the first 100 runs.
>>>
>>>
>>>> On Mon, Mar 17, 2014 at 4:35 PM, Martin Koch <mak at issuu.com> wrote:
>>>>
>>>> Here are the total and max times in millions of units; 30000 units is
>>>> approximately 13 seconds. I have extracted the runs where there are many
>>>> gc-collect-steps. These are in execution order, so the first runs with many
>>>> gc-collect-steps aren't slow.
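>>>>
>>>> (That works out to about 2.3 GHz (30000 * 10^6 units / 13 s ~ 2.3 *
>>>> 10^9 units/s, i.e. one unit ~ 0.43 ns), consistent with the ~2.5 GHz
>>>> estimate further down the thread.)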
>>>>
>>>> Totals: gc-minor:418 gc-minor-walkroots:0 gc-collect-step:28797  Max: gc-minor:10 gc-collect-step:247
>>>> Totals: gc-minor:562 gc-minor-walkroots:0 gc-collect-step:30282  Max: gc-minor:10 gc-collect-step:245
>>>> Totals: gc-minor:434 gc-minor-walkroots:0 gc-collect-step:31040  Max: gc-minor:11 gc-collect-step:244
>>>> Totals: gc-minor:417 slow:1 gc-minor-walkroots:0 gc-collect-step:31270  Max: gc-minor:17 gc-collect-step:244
>>>> Totals: gc-minor:435 slow:1 gc-minor-walkroots:0 gc-collect-step:30365  Max: gc-minor:11 gc-collect-step:248
>>>> Totals: gc-minor:389 slow:1 gc-minor-walkroots:0 gc-collect-step:31235  Max: gc-minor:8 gc-collect-step:299
>>>> Totals: gc-minor:434 slow:1 gc-minor-walkroots:0 gc-collect-step:31124  Max: gc-minor:11 gc-collect-step:246
>>>> Totals: gc-minor:386 slow:1 gc-minor-walkroots:0 gc-collect-step:30541  Max: gc-minor:8 gc-collect-step:244
>>>> Totals: gc-minor:410 slow:1 gc-minor-walkroots:0 gc-collect-step:31427  Max: gc-minor:36 gc-collect-step:248
>>>> Totals: gc-minor:390 slow:1 gc-minor-walkroots:0 gc-collect-step:30743  Max: gc-minor:8 gc-collect-step:244
>>>> Totals: gc-minor:380 slow:1 gc-minor-walkroots:0 gc-collect-step:30207  Max: gc-minor:8 gc-collect-step:245
>>>> Totals: gc-minor:387 slow:1 gc-minor-walkroots:0 gc-collect-step:30837  Max: gc-minor:8 gc-collect-step:244
>>>> Totals: gc-minor:412 slow:1 gc-minor-walkroots:0 gc-collect-step:30898  Max: gc-minor:38 gc-collect-step:244
>>>> Totals: gc-minor:415 slow:1 gc-minor-walkroots:0 gc-collect-step:30407  Max: gc-minor:23 gc-collect-step:245
>>>> Totals: gc-minor:380 slow:1 gc-minor-walkroots:0 gc-collect-step:30591  Max: gc-minor:8 gc-collect-step:246
>>>> Totals: gc-minor:387 slow:1 gc-minor-walkroots:0 gc-collect-step:31193  Max: gc-minor:9 gc-collect-step:244
>>>> Totals: gc-minor:379 slow:1 gc-minor-walkroots:0 gc-collect-step:30026  Max: gc-minor:8 gc-collect-step:246
>>>> Totals: gc-minor:388 slow:1 gc-minor-walkroots:0 gc-collect-step:31179  Max: gc-minor:8 gc-collect-step:248
>>>> Totals: gc-minor:378 slow:1 gc-minor-walkroots:0 gc-collect-step:30674  Max: gc-minor:8 gc-collect-step:250
>>>> Totals: gc-minor:385 slow:1 gc-minor-walkroots:0 gc-collect-step:30413  Max: gc-minor:8 gc-collect-step:245
>>>> Totals: gc-minor:915 slow:1 gc-minor-walkroots:0 gc-collect-step:30830  Max: gc-minor:543 gc-collect-step:244
>>>> Totals: gc-minor:405 slow:1 gc-minor-walkroots:0 gc-collect-step:31153  Max: gc-minor:20 gc-collect-step:246
>>>> Totals: gc-minor:408 slow:1 gc-minor-walkroots:0 gc-collect-step:29815  Max: gc-minor:25 gc-collect-step:245
>>>>
>>>> Thanks,
>>>> /Martin
>>>>
>>>>
>>>>> On Mon, Mar 17, 2014 at 3:24 PM, Martin Koch <mak at issuu.com> wrote:
>>>>>
>>>>> Ah. I had misunderstood. I'll get back to you on that :) thanks
>>>>>
>>>>> /Martin
>>>>>
>>>>>
>>>>>> On 17/03/2014, at 15.21, Maciej Fijalkowski <fijall at gmail.com> wrote:
>>>>>>
>>>>>> eh, this is not what I need
>>>>>>
>>>>>> I need the max TIME it took for a gc-minor and the TOTAL time it
>>>>>> took for gc-minor (per query), and ideally the same for
>>>>>> gc-walkroots and gc-collect-step.
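>>>>>>
>>>>>> Something like this untested sketch would do (it assumes the
>>>>>> "[hex-timestamp] {section" / "[hex-timestamp] section}" format
>>>>>> from your log; feed it one query's worth of lines on stdin):
>>>>>>
>>>>>> import re
>>>>>> import sys
>>>>>>
>>>>>> pat = re.compile(r'\[([0-9a-f]+)\] (\{?)([\w-]+?)(\}?)$')
>>>>>>
>>>>>> def section_times(f):
>>>>>>     # total and max duration (in timestamp units) per debug section
>>>>>>     totals, maxima, stack = {}, {}, []
>>>>>>     for line in f:
>>>>>>         m = pat.search(line.strip())
>>>>>>         if m is None:
>>>>>>             continue
>>>>>>         ts, opening, name = int(m.group(1), 16), m.group(2), m.group(3)
>>>>>>         if opening:                           # "{gc-minor": section starts
>>>>>>             stack.append((name, ts))
>>>>>>         elif stack and stack[-1][0] == name:  # "gc-minor}": section ends
>>>>>>             _, start = stack.pop()
>>>>>>             dt = ts - start
>>>>>>             totals[name] = totals.get(name, 0) + dt
>>>>>>             maxima[name] = max(maxima.get(name, 0), dt)
>>>>>>     return totals, maxima
>>>>>>
>>>>>> if __name__ == '__main__':
>>>>>>     totals, maxima = section_times(sys.stdin)
>>>>>>     for name in sorted(totals):
>>>>>>         print "%s total:%d max:%d" % (name, totals[name], maxima[name])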
>>>>>>
>>>>>>> On Mon, Mar 17, 2014 at 4:19 PM, Martin Koch <mak at issuu.com> wrote:
>>>>>>> Here are the collated results of running each query. For each run,
>>>>>>> I count how many of each of the pypy debug lines I get. E.g., there
>>>>>>> were 668 runs that printed 58 log lines containing "{gc-minor",
>>>>>>> each eventually followed by "gc-minor}". I have also counted
>>>>>>> whether the query was slow; interestingly, not all the queries with
>>>>>>> many gc-minors were slow (but all slow queries had many gc-minors).
>>>>>>>
>>>>>>> Please let me know if this is unclear :)
>>>>>>>
>>>>>>>   668 gc-minor:58 gc-minor-walkroots:58
>>>>>>>    10 gc-minor:58 gc-minor-walkroots:58 gc-collect-step:5
>>>>>>>   140 gc-minor:59 gc-minor-walkroots:59
>>>>>>>     1 gc-minor:8441 gc-minor-walkroots:8441 gc-collect-step:8403
>>>>>>>     1 gc-minor:9300 gc-minor-walkroots:9300 gc-collect-step:9249
>>>>>>>     9 gc-minor:9643 slow:1 gc-minor-walkroots:9643 gc-collect-step:9589
>>>>>>>     1 gc-minor:9644 slow:1 gc-minor-walkroots:9644 gc-collect-step:9590
>>>>>>>    10 gc-minor:9647 slow:1 gc-minor-walkroots:9647 gc-collect-step:9609
>>>>>>>     1 gc-minor:9663 gc-minor-walkroots:9663 gc-collect-step:9614
>>>>>>>     1 jit-backend-dump:5 gc-minor:58 gc-minor-walkroots:58
>>>>>>>     1 jit-log-compiling-loop:1 gc-collect-step:8991 jit-backend-dump:78 jit-backend:3 jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:9030 jit-tracing:3 gc-minor-walkroots:9030 jit-optimize:6 jit-log-short-preamble:2 jit-backend-addr:3 jit-log-opt-loop:1 jit-mem-looptoken-alloc:3 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:1 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:84
>>>>>>>     1 jit-log-compiling-loop:1 jit-backend-dump:13 jit-backend:1 jit-log-noopt-loop:2 gc-minor:60 jit-tracing:1 gc-minor-walkroots:60 jit-optimize:2 jit-log-short-preamble:1 jit-backend-addr:1 jit-log-opt-loop:1 jit-mem-looptoken-alloc:1 jit-log-rewritten-loop:1 jit-resume:14
>>>>>>>     1 jit-log-compiling-loop:1 jit-backend-dump:73 jit-backend:3 jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:60 jit-tracing:3 gc-minor-walkroots:60 jit-optimize:6 jit-log-short-preamble:2 jit-backend-addr:3 jit-log-opt-loop:1 jit-mem-looptoken-alloc:3 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:1 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:84
>>>>>>>     2 jit-log-compiling-loop:1 jit-backend-dump:78 jit-backend:3 jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:61 jit-tracing:3 gc-minor-walkroots:61 jit-optimize:6 jit-log-short-preamble:2 jit-backend-addr:3 jit-log-opt-loop:1 jit-mem-looptoken-alloc:3 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:1 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:84
>>>>>>>     1 jit-log-short-preamble:2 jit-log-compiling-loop:2 jit-backend-dump:92 jit-log-noopt-loop:7 jit-log-virtualstate:3 gc-minor:61 jit-tracing:4 gc-minor-walkroots:61 jit-optimize:7 jit-backend:4 jit-backend-addr:4 jit-log-opt-loop:2 jit-mem-looptoken-alloc:4 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:2 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:104
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> /Martin
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Mar 17, 2014 at 2:23 PM, Maciej Fijalkowski <fijall at gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Mon, Mar 17, 2014 at 3:20 PM, Maciej Fijalkowski
>>>>>>>> <fijall at gmail.com>
>>>>>>>> wrote:
>>>>>>>>> are you *sure* it's the walkroots that take that long and not
>>>>>>>>> something else (like gc-minor)? More of those mean that you
>>>>>>>>> allocate a lot more surviving objects. Can you do two things:
>>>>>>>>>
>>>>>>>>> a) take a max of gc-minor (and gc-minor-stackwalk), per request
>>>>>>>>> b) take the sum of those
>>>>>>>>>
>>>>>>>>> and plot them
>>>>>>>>
>>>>>>>> ^^^ or just paste the results actually
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Mon, Mar 17, 2014 at 3:18 PM, Martin Koch <mak at issuu.com> wrote:
>>>>>>>>>> Well, then it works out to around 2.5GHz, which seems
>>>>>>>>>> reasonable. But it doesn't alter the conclusion from the
>>>>>>>>>> previous email: the slow queries all have a duration of around
>>>>>>>>>> 34*10^9 units, while 'normal' queries take about 1*10^9 units,
>>>>>>>>>> or 0.4 seconds at this conversion. Also, the log shows that a
>>>>>>>>>> slow query performs many more gc-minor operations than a
>>>>>>>>>> 'normal' one: about 9600
>>>>>>>>>> gc-collect-step/gc-minor/gc-minor-walkroots operations vs 58.
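>>>>>>>>>>
>>>>>>>>>> (Checking the arithmetic: 34*10^9 cycles / 2.5*10^9 cycles/s ~
>>>>>>>>>> 13.6 s, matching the ~13 second wall-clock time, and 1*10^9 /
>>>>>>>>>> 2.5*10^9 = 0.4 s for a 'normal' query.)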
>>>>>>>>>>
>>>>>>>>>> So the question becomes: why do we get this large spike in
>>>>>>>>>> gc-minor-walkroots, and, in particular, is there any way to
>>>>>>>>>> avoid it? :)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> /Martin
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Mar 17, 2014 at 1:53 PM, Maciej Fijalkowski
>>>>>>>>>> <fijall at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I think it's the cycles of your CPU
>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Mar 17, 2014 at 2:48 PM, Martin Koch <mak at issuu.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> What is the unit? Perhaps I'm being thick here, but I can't
>>>>>>>>>>>> correlate it with seconds (which the program does print out).
>>>>>>>>>>>> Slow runs are around 13 seconds, but span around 34*10^9
>>>>>>>>>>>> (dec), or 0x800000000, timestamp units (e.g. from
>>>>>>>>>>>> 0x2b994c9d31889c to 0x2b9944ab8c4f49).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Mar 17, 2014 at 12:09 PM, Maciej Fijalkowski
>>>>>>>>>>>> <fijall at gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The number of lines is nonsense. This is a timestamp in hex.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Mar 17, 2014 at 12:46 PM, Martin Koch <mak at issuu.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Based on Maciej's suggestion, I tried the following:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> PYPYLOG=- pypy mem.py 10000000 > out
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This generates a logfile which looks something like this:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> start-->
>>>>>>>>>>>>>> [2b99f1981b527e] {gc-minor
>>>>>>>>>>>>>> [2b99f1981ba680] {gc-minor-walkroots
>>>>>>>>>>>>>> [2b99f1981c2e02] gc-minor-walkroots}
>>>>>>>>>>>>>> [2b99f19890d750] gc-minor}
>>>>>>>>>>>>>> [snip]
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> <--stop
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It turns out that the culprit is a lot of MINOR collections.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I base this on the following observations:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - I can't understand the format of the timestamp on each
>>>>>>>>>>>>>> logline (the "[2b99f1981b527e]"). From what I can see in the
>>>>>>>>>>>>>> code, this should be output from time.clock(), but that
>>>>>>>>>>>>>> doesn't return a number like that when I run pypy
>>>>>>>>>>>>>> interactively. Instead, I count the number of debug lines
>>>>>>>>>>>>>> between start--> and the corresponding <--stop.
>>>>>>>>>>>>>> - Most runs have a few hundred lines of output between
>>>>>>>>>>>>>> start/stop.
>>>>>>>>>>>>>> - All slow runs have very close to 57800 lines of output
>>>>>>>>>>>>>> between start/stop.
>>>>>>>>>>>>>> - One such sample does 9609 gc-collect-step operations, 9647
>>>>>>>>>>>>>> gc-minor operations, and 9647 gc-minor-walkroots operations.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> /Martin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Mar 17, 2014 at 8:21 AM, Maciej Fijalkowski
>>>>>>>>>>>>>> <fijall at gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> there is an environment variable PYPYLOG=gc:- (where "-"
>>>>>>>>>>>>>>> means stdout) which will do that for you, btw.
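>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> e.g., to keep the log out of the program's stdout you can
>>>>>>>>>>>>>>> give a filename instead of "-"; the part before the colon
>>>>>>>>>>>>>>> selects which debug sections get logged:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> PYPYLOG=gc:gc.log pypy mem.py 10000000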
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> maybe you can find out what that is using profiling or
>>>>>>>>>>>>>>> valgrind?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Mar 16, 2014 at 11:34 PM, Martin Koch <mak at issuu.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> I have tried getting the pypy source and building my own
>>>>>>>>>>>>>>>> version of pypy. I have modified
>>>>>>>>>>>>>>>> rpython/memory/gc/incminimark.py:major_collection_step()
>>>>>>>>>>>>>>>> to print out when it starts and when it stops. Apparently,
>>>>>>>>>>>>>>>> the slow queries do NOT occur during
>>>>>>>>>>>>>>>> major_collection_step; at least, I have not observed major
>>>>>>>>>>>>>>>> step output during a query execution. So, apparently,
>>>>>>>>>>>>>>>> something else is blocking. This could be another aspect
>>>>>>>>>>>>>>>> of the GC, but it could also be anything else.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Just to be sure, I have tried running the same application
>>>>>>>>>>>>>>>> in python with garbage collection disabled. I don't see the
>>>>>>>>>>>>>>>> problem there, so it is somehow related to either the GC or
>>>>>>>>>>>>>>>> the runtime.
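>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (Presumably via the stdlib gc module, i.e.:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> import gc
>>>>>>>>>>>>>>>> gc.disable()  # turns off CPython's cyclic collector only
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Note that this only disables the cycle detector; objects
>>>>>>>>>>>>>>>> are still freed promptly by reference counting, so
>>>>>>>>>>>>>>>> allocation and deallocation work is not ruled out by this
>>>>>>>>>>>>>>>> test.)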
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> /Martin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Mar 14, 2014 at 4:19 PM, Martin Koch <mak at issuu.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We have hacked up a small sample that seems to exhibit
>>>>>>>>>>>>>>>>> the same issue.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We basically generate a linked list of objects. To
>>>>>>>>>>>>>>>>> increase connectedness, elements in the list hold
>>>>>>>>>>>>>>>>> references (dummy_links) to 10 randomly chosen previous
>>>>>>>>>>>>>>>>> elements in the list.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We then time a function that traverses 50000 elements of
>>>>>>>>>>>>>>>>> the list from a random start point. If the traversal
>>>>>>>>>>>>>>>>> reaches the end of the list, we instead traverse one of
>>>>>>>>>>>>>>>>> the dummy links, so exactly 50K elements are traversed
>>>>>>>>>>>>>>>>> every time. To generate some garbage, we build a list
>>>>>>>>>>>>>>>>> holding the traversed elements and a dummy list of
>>>>>>>>>>>>>>>>> characters.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Timings for the last 100 runs are stored in a circular
>>>>>>>>>>>>>>>>> buffer. If the elapsed time for the last run is more than
>>>>>>>>>>>>>>>>> twice the average time, we print out a line with the
>>>>>>>>>>>>>>>>> elapsed time, the threshold, and the 90th-percentile
>>>>>>>>>>>>>>>>> runtime. (We would like to see that the mean runtime does
>>>>>>>>>>>>>>>>> not increase with the number of elements in the list, but
>>>>>>>>>>>>>>>>> that the max time does increase, linearly with the number
>>>>>>>>>>>>>>>>> of objects, I guess; traversing 50K elements should be
>>>>>>>>>>>>>>>>> independent of the memory size.)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We have tried monitoring memory consumption by external
>>>>>>>>>>>>>>>>> inspection, but cannot consistently verify that memory is
>>>>>>>>>>>>>>>>> deallocated at the same time that we see slow requests.
>>>>>>>>>>>>>>>>> Perhaps the pypy runtime doesn't always return freed pages
>>>>>>>>>>>>>>>>> back to the OS?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Using top, we observe that 10M elements allocate around
>>>>>>>>>>>>>>>>> 17GB after building, 20M elements 26GB, and 30M elements
>>>>>>>>>>>>>>>>> 28GB (growing to 35GB shortly after building).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Here is output from a few runs with different numbers of
>>>>>>>>>>>>>>>>> elements:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> pypy mem.py 10000000
>>>>>>>>>>>>>>>>> start build
>>>>>>>>>>>>>>>>> end build 84.142424
>>>>>>>>>>>>>>>>> that took a long time elapsed: 13.230586  slow_threshold: 1.495401  90th_quantile_runtime: 0.421558
>>>>>>>>>>>>>>>>> that took a long time elapsed: 13.016531  slow_threshold: 1.488160  90th_quantile_runtime: 0.423441
>>>>>>>>>>>>>>>>> that took a long time elapsed: 13.032537  slow_threshold: 1.474563  90th_quantile_runtime: 0.419817
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> pypy mem.py 20000000
>>>>>>>>>>>>>>>>> start build
>>>>>>>>>>>>>>>>> end build 180.823105
>>>>>>>>>>>>>>>>> that took a long time elapsed: 27.346064  slow_threshold: 2.295146  90th_quantile_runtime: 0.434726
>>>>>>>>>>>>>>>>> that took a long time elapsed: 26.028852  slow_threshold: 2.283927  90th_quantile_runtime: 0.374190
>>>>>>>>>>>>>>>>> that took a long time elapsed: 25.432279  slow_threshold: 2.279631  90th_quantile_runtime: 0.371502
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> pypy mem.py 30000000
>>>>>>>>>>>>>>>>> start build
>>>>>>>>>>>>>>>>> end build 276.217811
>>>>>>>>>>>>>>>>> that took a long time elapsed: 40.993855  slow_threshold: 3.188464  90th_quantile_runtime: 0.459891
>>>>>>>>>>>>>>>>> that took a long time elapsed: 41.693553  slow_threshold: 3.183003  90th_quantile_runtime: 0.393654
>>>>>>>>>>>>>>>>> that took a long time elapsed: 39.679769  slow_threshold: 3.190782  90th_quantile_runtime: 0.393677
>>>>>>>>>>>>>>>>> that took a long time elapsed: 43.573411  slow_threshold: 3.239637  90th_quantile_runtime: 0.393654
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Code below
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --------------------------------------------------------------
>>>>>>>>>>>>>>>>> import time
>>>>>>>>>>>>>>>>> from random import randint, choice
>>>>>>>>>>>>>>>>> import sys
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> allElems = {}
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> class Node:
>>>>>>>>>>>>>>>>>     def __init__(self, v_):
>>>>>>>>>>>>>>>>>         self.v = v_
>>>>>>>>>>>>>>>>>         self.next = None
>>>>>>>>>>>>>>>>>         self.dummy_data = [randint(0, 100)
>>>>>>>>>>>>>>>>>                            for _ in xrange(randint(50, 100))]
>>>>>>>>>>>>>>>>>         allElems[self.v] = self
>>>>>>>>>>>>>>>>>         if self.v > 0:
>>>>>>>>>>>>>>>>>             self.dummy_links = [allElems[randint(0, self.v - 1)]
>>>>>>>>>>>>>>>>>                                 for _ in xrange(10)]
>>>>>>>>>>>>>>>>>         else:
>>>>>>>>>>>>>>>>>             self.dummy_links = [self]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     def set_next(self, l):
>>>>>>>>>>>>>>>>>         self.next = l
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> def follow(node):
>>>>>>>>>>>>>>>>>     acc = []
>>>>>>>>>>>>>>>>>     count = 0
>>>>>>>>>>>>>>>>>     cur = node
>>>>>>>>>>>>>>>>>     assert node.v is not None
>>>>>>>>>>>>>>>>>     assert cur is not None
>>>>>>>>>>>>>>>>>     while count < 50000:
>>>>>>>>>>>>>>>>>         # return a value; generate some garbage
>>>>>>>>>>>>>>>>>         acc.append((cur.v, [choice("abcdefghijklmnopqrstuvwxyz")
>>>>>>>>>>>>>>>>>                             for x in xrange(100)]))
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         # if we have reached the end, choose a random link
>>>>>>>>>>>>>>>>>         cur = choice(cur.dummy_links) if cur.next is None else cur.next
>>>>>>>>>>>>>>>>>         count += 1
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     return acc
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> def build(num_elems):
>>>>>>>>>>>>>>>>>     start = time.time()
>>>>>>>>>>>>>>>>>     print "start build"
>>>>>>>>>>>>>>>>>     root = Node(0)
>>>>>>>>>>>>>>>>>     cur = root
>>>>>>>>>>>>>>>>>     for x in xrange(1, num_elems):
>>>>>>>>>>>>>>>>>         e = Node(x)
>>>>>>>>>>>>>>>>>         cur.next = e
>>>>>>>>>>>>>>>>>         cur = e
>>>>>>>>>>>>>>>>>     print "end build %f" % (time.time() - start)
>>>>>>>>>>>>>>>>>     return root
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> num_timings = 100
>>>>>>>>>>>>>>>>> if __name__ == "__main__":
>>>>>>>>>>>>>>>>>     num_elems = int(sys.argv[1])
>>>>>>>>>>>>>>>>>     build(num_elems)
>>>>>>>>>>>>>>>>>     total = 0
>>>>>>>>>>>>>>>>>     # run times for the last num_timings runs
>>>>>>>>>>>>>>>>>     timings = [0.0] * num_timings
>>>>>>>>>>>>>>>>>     i = 0
>>>>>>>>>>>>>>>>>     beginning = time.time()
>>>>>>>>>>>>>>>>>     while time.time() - beginning < 600:
>>>>>>>>>>>>>>>>>         start = time.time()
>>>>>>>>>>>>>>>>>         elem = allElems[randint(0, num_elems - 1)]
>>>>>>>>>>>>>>>>>         assert elem is not None
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         lst = follow(elem)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         total += choice(lst)[0]  # use the return value for something
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         end = time.time()
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         elapsed = end - start
>>>>>>>>>>>>>>>>>         timings[i % num_timings] = elapsed
>>>>>>>>>>>>>>>>>         # note: the first num_timings runs are never classified as slow
>>>>>>>>>>>>>>>>>         if i > num_timings:
>>>>>>>>>>>>>>>>>             # slow defined as > 2*avg run time
>>>>>>>>>>>>>>>>>             slow_time = 2 * sum(timings) / num_timings
>>>>>>>>>>>>>>>>>             if elapsed > slow_time:
>>>>>>>>>>>>>>>>>                 print "that took a long time elapsed: %f  slow_threshold: %f  90th_quantile_runtime: %f" % \
>>>>>>>>>>>>>>>>>                     (elapsed, slow_time, sorted(timings)[int(num_timings * .9)])
>>>>>>>>>>>>>>>>>         i += 1
>>>>>>>>>>>>>>>>>     print total
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Mar 13, 2014 at 7:45 PM, Maciej Fijalkowski
>>>>>>>>>>>>>>>>> <fijall at gmail.com>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Mar 13, 2014 at 1:45 PM, Martin Koch
>>>>>>>>>>>>>>>>>> <mak at issuu.com>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>> Hi Armin, Maciej
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for responding.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm in the process of trying to determine what (if any)
>>>>>>>>>>>>>>>>>>> of the code I'm in a position to share, and I'll get
>>>>>>>>>>>>>>>>>>> back to you.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Allowing hinting to the GC would be good. Even better
>>>>>>>>>>>>>>>>>>> would be a means to allow me to (transparently) allocate
>>>>>>>>>>>>>>>>>>> objects in unmanaged memory, but I would expect that to
>>>>>>>>>>>>>>>>>>> be a tall order :)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> /Martin
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Martin.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Note that in case you want us to do the work of
>>>>>>>>>>>>>>>>>> isolating the problem, we do offer paid support for that
>>>>>>>>>>>>>>>>>> (then we can sign NDAs and stuff). Otherwise we would be
>>>>>>>>>>>>>>>>>> more than happy to fix bugs once you isolate a part you
>>>>>>>>>>>>>>>>>> can share freely :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>
>>>>
>>>

