The reliability of Python threads

Thu Jan 25 19:58:21 EST 2007

skip at pobox.com writes:
> What makes you think Paddy indicated he wouldn't try to solve the problem?
> Here's what he wrote:
> 
>     What I'm proposing is that if, for example, a process stops running
>     three times in a year at roughly three to four month intervals, and it
>     should have stayed up; then restart the server sooner, at a time of
>     your choosing, whilst taking other measures to investigate the error.

Well, OK, that's better than just rebooting every so often and leaving
it at that, like the firmware systems he cited.

> I see nothing wrong with trying to minimize the chances of a problem

I think a measure to minimize the chance of some problem is only valid
if there's some plausible theory that it WILL decrease the chance of
the problem (e.g. if there's reason to think that the problem is
caused by a very slow resource leak, but that hasn't been suggested).
That's the part that I'm missing from this story.

One thing I'd certainly want to do is set up a test server under a
much heavier load than the real server sees, and check whether the
problem occurs faster.
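
As a rough illustration of that kind of load test, here is a minimal
sketch. It assumes the server's work can be driven through a plain
Python callable; the names run_load_test and handler are hypothetical
and not taken from the actual server. It hammers the callable from
several threads and reports how many calls completed before the first
failure, so the same run can be compared at different load levels.

```python
import threading
import time


def run_load_test(handler, n_threads=8, requests_per_thread=1000):
    """Call handler() from several threads and report when it first fails.

    handler is a stand-in (an assumption) for one unit of server work.
    """
    lock = threading.Lock()
    stop = threading.Event()
    stats = {"completed": 0, "first_error": None}

    def worker():
        for _ in range(requests_per_thread):
            if stop.is_set():
                return  # another thread already hit the problem
            try:
                handler()
            except Exception as exc:
                with lock:
                    if stats["first_error"] is None:
                        stats["first_error"] = repr(exc)
                stop.set()
                return
            with lock:
                stats["completed"] += 1

    start = time.monotonic()
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    stats["elapsed"] = time.monotonic() - start
    return stats
```

If the count of completed calls before the first failure stays about
the same as n_threads goes up (so the failure arrives sooner in wall
clock time), that would point toward a load-related cause; if it does
not, the problem is more likely tied to elapsed time or something
external.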