[Python-Dev] Strategies for debugging buildbot failures?

Sun Jan 18 19:07:07 CET 2009

Mark Dickinson wrote:
> This is probably a stupid question, but here goes:
>
> Can anyone suggest good strategies for debugging buildbot
> test failures, for problems that aren't reproducible locally?
>
> There have been various times in the past that I've wanted
> to be able to do this.  Right now, I'm thinking particularly of
> the 'Unknown signal 32' failure that's been occurring on the
> gentoo x86 buildbots for 3.0 and 3.x since pre-3.0 alpha
> days.  I recently noticed an apparent pattern to these
> failures (the failure occurs at the first test that involves
> threads, after test_os has been run), but am unsure how
> to proceed from there.
>
> Is it acceptable to commit a change (to the trunk or py3k, not to
> the release branches) solely for the purpose of getting more
> information about a failure?  I don't see a lot of this kind of
> activity going on in the checkin messages, so I'm not sure
> whether this is okay or not.  If I did this, the commit
> message would clearly indicate that the checkin was
> meant to be temporary, and give an expected time to reversion.
>   

At Resolver Systems we regularly extend the test framework purely to 
provide more diagnostic information in the event of test failures. We do 
a lot of functional testing through the UI, which is particularly prone
to intermittent and hard-to-diagnose failures.

The extra diagnostics can be built in so that they don't affect the test
run unless a test fails, so there is no reason not to make the changes
permanent unless they are particularly intrusive.

Michael Foord
> Alternatively, is it reasonable to create a new branch solely
> for the purpose of tracking down one particular problem?
> Again, I don't see this sort of thing happening, but it seems
> like an attractive strategy, since it allows one to test one
> particular buildbot (via the form for requesting a build)
> without messing up anything else.
>
> What do others do to debug these failures?
>
> Mark
>
> (P.S. After a bit of Googling, I suspect the 'Unknown
> signal 32' failure of being related to the LinuxThreads
> library, and probably not Python's fault.  But it would
> still be good to understand why it occurs with 3.x but
> not 2.x, and whether there's an easy workaround.)

-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog