[Python-Dev] stack check on Unix: any suggestions?

Thomas Wouters thomas@xs4all.net
Sat, 2 Sep 2000 23:36:47 +0200


On Fri, Sep 01, 2000 at 11:09:02AM -0500, Charles G Waldman wrote:
> Skip Montanaro writes:
>  > Makes no difference:

>  >     stack size (kbytes)         unlimited
>  >     % ./python Misc/find_recursionlimit.py
>  >     Limit of 2400 is fine
>  >     repr
>  >     Segmentation fault

> This means that you're not hitting the rlimit at all but getting a
> real segfault!  Time to do setrlimit -c unlimited and break out GDB,
> I'd say.

Yes, which I did (well, my girlfriend was hogging the PC with the 'net
connection, and there was nothing but silly soft-porn on TV, so I spent an
hour or two on my laptop ;) and I did figure out that the problem isn't
stack space (which was already obvious) but *damned* if I know what the
problem is.

Here's an easy way to step through the whole procedure, though. Take a
recursive script, like the one Guido posted:

    i = 0
    class C:
        def __getattr__(self, name):
            global i
            print i
            i += 1
            return self.name # common beginners' mistake

    C().spam # assumed trigger: looking up any missing attribute starts it

Run it once, so you get a ballpark figure on when it'll crash, and then
branch right before it would crash, calling some obscure function
(os.getpid() works nicely, very simple function.) The crash came at about
2926 or so on my laptop (adding the branch changed this number, oddly enough.)

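    import os
    i = 0
    class C:
        def __getattr__(self, name):
            global i
            print i
            i += 1
            if i > 2625:
                os.getpid() # harmless call, only there to break on in GDB
            return self.name # common beginners' mistake

    C().spam # assumed trigger: looking up any missing attribute starts it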

(I also moved the 'print i' to inside the branch, which saved me a bit of
scrolling.) Then start GDB on the python binary, set a breakpoint on
posix_getpid, and "run 'test.py'". You'll end up pretty close to where the
interpreter decides to go belly-up. Setting a breakpoint on ceval.c line 612
(the 'opcode = NEXTOP();' line) or so at that point helps with doing a
per-bytecode check, though this made me miss the actual point of failure,
and I don't fancy doing it again just yet :P

What I did see, however, was that the reason for the crash isn't the pure
recursion. It looks like the recursion *does* get caught properly, and the
interpreter raises an error. It then prints that error over and over again,
probably once for every call to getattr(), and eventually *that* crashes
(but why, I don't know.) In one test I did, it crashed in int_print, the
print function for int objects, which did 'fprintf(fp, "%ld", v->ival);'.
The actual SEGV arrived inside fprintf's internals. v->ival was a valid
integer (though a high one), so the problem was not dereferencing 'v'. 'fp'
was stderr, according to its _fileno member.
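
For comparison, here is a rough sketch of what "the recursion gets caught
and the interpreter raises an error" looks like from the Python side. It
assumes a current Python 3 interpreter, where the error is RecursionError (a
subclass of the RuntimeError that older interpreters raised); the probe()
helper and the 'spam' attribute are made up for illustration and are not
part of the script above.

```python
calls = 0  # number of times __getattr__ has been entered


class C:
    def __getattr__(self, name):
        global calls
        calls += 1
        # Looks up the attribute literally called 'name', which does not
        # exist, so __getattr__ is entered again: runaway recursion.
        return self.name


def probe():
    """Trigger one runaway lookup; return (calls made, exception type name)."""
    global calls
    calls = 0
    try:
        C().spam  # hypothetical attribute; any missing one will do
    except RecursionError as exc:
        # The interpreter noticed the runaway recursion and raised an
        # ordinary exception instead of overflowing the C stack.
        return calls, type(exc).__name__
    return calls, None


if __name__ == "__main__":
    depth, error = probe()
    print("gave up after", depth, "calls with", error)
```

If this prints an exception name rather than dying with a segfault, the
recursion check itself is doing its job, which matches what I saw in GDB:
the crash comes later, while the error is being reported.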

'ltrace' (if you have it) is also a nice tool to let loose on this kind of
script, by the way, though it does make the test take a lot longer, and you
really need enough disk space to store the output ;-P

Back-to-augassign-docs-ly y'rs,

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!