[pypy-issue] Issue #1984: Continulet switch results in a segfault (pypy/pypy)
Jakub Stasiak
issues-reply at bitbucket.org
Sun Feb 15 00:21:27 CET 2015
New issue 1984: Continulet switch results in a segfault
https://bitbucket.org/pypy/pypy/issue/1984/continulet-switch-results-in-a-segfault
Jakub Stasiak:
This piece of code crashes PyPy 2.4.0, 2.5.0 and a version built from the latest source code (I didn't write it like this; it is a test case I isolated from something larger). It segfaults every time I run it on OS X 10.9.5 (homebrew build and my own builds) and Ubuntu 14.04 (official Linux x64 packages).
```
#!python
from _continuation import continulet
c1 = continulet.__new__(continulet)
c2 = continulet(lambda g: None)
continulet.switch(c1, to=c2)
continulet.switch(c1, to=c2)
```
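One detail of the test case worth spelling out: ``continulet.__new__(continulet)`` allocates ``c1`` without ever running ``__init__``, so ``c1`` has no callable attached when ``switch`` is called on it. Whether that is the direct cause of the crash is not established here. The sketch below only illustrates the general Python behaviour, using a hypothetical ``Task`` class that stands in for (and is not) the real ``continulet``:

```python
class Task(object):  # hypothetical stand-in, NOT the real continulet type
    def __init__(self, func):
        # Normal construction attaches the callable.
        self.func = func


initialized = Task(lambda: None)   # runs __new__ and then __init__
bare = Task.__new__(Task)          # runs __new__ only, __init__ is skipped

print(hasattr(initialized, "func"))
print(hasattr(bare, "func"))
```

The first ``print`` reports that the attribute exists and the second that it does not, which mirrors the difference between ``c2`` and ``c1`` in the test case.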
lldb session (OS X, using pyinteractive, latest source code):
```
* thread #1: tid = 0x782cc, 0x00000001031fd9d8 externmod_0.dylib`stacklet_switch + 24, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x1f)
frame #0: 0x00000001031fd9d8 externmod_0.dylib`stacklet_switch + 24
externmod_0.dylib`stacklet_switch + 24:
-> 0x1031fd9d8: movq 0x20(%r14), %rbx
0x1031fd9dc: leaq (%rsp), %rax
0x1031fd9e0: cmpq %rax, 0x8(%rbx)
0x1031fd9e4: ja 0x1031fd9ef ; stacklet_switch + 47
(lldb) disassemble
externmod_0.dylib`stacklet_switch:
0x1031fd9c0: pushq %r14
0x1031fd9c2: pushq %rbx
0x1031fd9c3: pushq %rax
0x1031fd9c4: movq %rdi, %r14
0x1031fd9c7: leaq 0x38e(%rip), %rdi ; "Got target %p\n\n"
0x1031fd9ce: xorl %eax, %eax
0x1031fd9d0: movq %r14, %rsi
0x1031fd9d3: callq 0x1031fdd18 ; symbol stub for: printf
-> 0x1031fd9d8: movq 0x20(%r14), %rbx
0x1031fd9dc: leaq (%rsp), %rax
0x1031fd9e0: cmpq %rax, 0x8(%rbx)
0x1031fd9e4: ja 0x1031fd9ef ; stacklet_switch + 47
0x1031fd9e6: leaq 0x1(%rsp), %rax
0x1031fd9eb: movq %rax, 0x8(%rbx)
0x1031fd9ef: movq %r14, 0x20(%rbx)
0x1031fd9f3: leaq 0x26(%rip), %rdi ; g_save_state
0x1031fd9fa: leaq 0x11f(%rip), %rsi ; g_restore_state
0x1031fda01: movq %rbx, %rdx
0x1031fda04: callq *0x62e(%rip) ; _stacklet_switchstack
0x1031fda0a: movq 0x18(%rbx), %rax
0x1031fda0e: addq $0x8, %rsp
0x1031fda12: popq %rbx
0x1031fda13: popq %r14
0x1031fda15: retq
0x1031fda16: nopw %cs:(%rax,%rax)
(lldb) register read
General Purpose Registers:
rax = 0x000000000000001f
rbx = 0x00007fff5fbfac10
rcx = 0xd8001e3171326a64
rdx = 0x00007fff71227638 __sFX + 248
rdi = 0x0000040000000503
rsi = 0x0000050000000500
rbp = 0x00007fff5fbfab00
rsp = 0x00007fff5fbfaae0
r8 = 0x0000000000000040
r9 = 0x00007fff5fbfa900
r10 = 0x00000001031fdd0a externmod_0.dylib`symbol stub for: free + 4
r11 = 0x0000000000000246
r12 = 0x00007fff5fbfaa50
r13 = 0x0000000000000001
r14 = 0xffffffffffffffff
r15 = 0x00007fff7122e420 libsystem_c.dylib`__stack_chk_guard
rip = 0x00000001031fd9d8 externmod_0.dylib`stacklet_switch + 24
rflags = 0x0000000000010206
cs = 0x000000000000002b
fs = 0x0000000000000000
gs = 0x0000000000000000
(lldb) thread backtrace
* thread #1: tid = 0x782cc, 0x00000001031fd9d8 externmod_0.dylib`stacklet_switch + 24, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x1f)
* frame #0: 0x00000001031fd9d8 externmod_0.dylib`stacklet_switch + 24
frame #1: 0x000000010310c7ef _ctypes.so`ffi_call_unix64 + 79
frame #2: 0x000000010310d00b _ctypes.so`ffi_call + 815
frame #3: 0x00000001031085a3 _ctypes.so`_ctypes_callproc + 643
frame #4: 0x0000000103102aee _ctypes.so`PyCFuncPtr_call + 1092
frame #5: 0x000000010000f0ea Python`PyObject_Call + 99
frame #6: 0x000000010008b97b Python`PyEval_EvalFrameEx + 10701
frame #7: 0x0000000100088d7a Python`PyEval_EvalCodeEx + 1409
frame #8: 0x000000010008f59d Python`fast_function + 117
frame #9: 0x000000010008c400 Python`PyEval_EvalFrameEx + 13394
frame #10: 0x0000000100088d7a Python`PyEval_EvalCodeEx + 1409
frame #11: 0x000000010002d1ae Python`function_call + 350
frame #12: 0x000000010000f0ea Python`PyObject_Call + 99
frame #13: 0x000000010008b97b Python`PyEval_EvalFrameEx + 10701
frame #14: 0x0000000100088d7a Python`PyEval_EvalCodeEx + 1409
frame #15: 0x000000010002d1ae Python`function_call + 350
frame #16: 0x000000010000f0ea Python`PyObject_Call + 99
frame #17: 0x0000000100019ed7 Python`instancemethod_call + 174
frame #18: 0x000000010000f0ea Python`PyObject_Call + 99
frame #19: 0x000000010005580f Python`slot_tp_call + 61
frame #20: 0x000000010000f0ea Python`PyObject_Call + 99
(...)
```
I'm not familiar with the codebase, but I did some digging. Here's what I found out so far (starting from the newest frame):
* ``stacklet_switch`` is called with ``target == 0xffffffffffffffff`` (https://bitbucket.org/pypy/pypy/src/7db3e6dd2efea611997d981f5738988a84f856cb/rpython/translator/c/src/stacklet/stacklet.c?at=default#cl-299)
* An opaque pointer with that value comes from ``StackletGcRootFinder.switch`` (in my case; I imagine it can vary) (https://bitbucket.org/pypy/pypy/src/7db3e6dd2efea611997d981f5738988a84f856cb/rpython/rlib/_stacklet_n_a.py?at=default#cl-18)
* That's passed there by ``StackletThread.switch`` (https://bitbucket.org/pypy/pypy/src/7db3e6dd2efea611997d981f5738988a84f856cb/rpython/rlib/rstacklet.py?at=default#cl-37)
* Passed there by ``W_Continulet.switch`` (https://bitbucket.org/pypy/pypy/src/7db3e6dd2efea611997d981f5738988a84f856cb/pypy/module/_continuation/interp_continuation.py?at=default#cl-84) and this is in turn because ``w_to`` contains ``<* <C object None (opaque) at 0xffffffffffffffff>>``
For all intents and purposes, the debug session for a release build looks the same. The only notable exception is that with the release version ``switch`` is called with a target pointer equal to 0x0000000000000000 rather than 0xffffffffffffffff.
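The faulting address in the lldb output is consistent with that target value: the crashing instruction is ``movq 0x20(%r14), %rbx`` with ``r14 = 0xffffffffffffffff``, and adding ``0x20`` to that wraps around 64 bits to ``0x1f``, the address reported by ``EXC_BAD_ACCESS``. A minimal sketch of the arithmetic (the helper name ``effective_address`` is made up for illustration, and the release-build line assumes the same ``0x20`` offset, which I have not verified):

```python
MASK64 = (1 << 64) - 1  # x86-64 address arithmetic wraps modulo 2**64


def effective_address(base, displacement):
    """Address computed by an operand such as 0x20(%r14)."""
    return (base + displacement) & MASK64


# Debug build: r14 taken from the register dump above.
debug_fault = effective_address(0xFFFFFFFFFFFFFFFF, 0x20)
print(hex(debug_fault))  # 0x1f, matches address=0x1f in the lldb output

# Release build: target is NULL; ASSUMES the same 0x20 field offset.
release_fault = effective_address(0x0, 0x20)
print(hex(release_fault))
```

In both cases the load goes to an unmapped page near zero, so the segfault is what one would expect once ``stacklet_switch`` receives such a target.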