[issue13215] multiprocessing Manager.connect() aggressively retries refused connections

Charles-François Natali report at bugs.python.org
Mon Oct 24 19:16:04 CEST 2011


Charles-François Natali <neologix at free.fr> added the comment:

>  While a 20 second timeout may make sense for *unresponsive* servers,
> ECONNREFUSED probably indicates that the server is not listening on this port, so
> hammering it with 1,999 more connection attempts isn't going to help.

That's funny, I noticed this a couple of days ago, and it also puzzled me...
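
For reference, the behaviour quoted above corresponds roughly to a loop
like the following (a paraphrased sketch, not the exact stdlib source;
the 20 s deadline and 0.01 s sleep are what produce the ~2000 attempts):

    import errno
    import socket
    import time

    CONNECTION_TIMEOUT = 20.0   # overall deadline, as described above
    RETRY_INTERVAL = 0.01       # sleep between attempts (assumed)

    def connect_with_retries(address):
        deadline = time.time() + CONNECTION_TIMEOUT
        while True:
            s = socket.socket(socket.AF_INET)
            try:
                s.connect(address)
            except socket.error as e:
                s.close()
                # Only ECONNREFUSED is retried, and only until the
                # deadline; any other error (or a late ECONNREFUSED)
                # propagates to the caller.
                if e.errno != errno.ECONNREFUSED or time.time() > deadline:
                    raise
                time.sleep(RETRY_INTERVAL)
            else:
                return s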

> I'm not sure, but I think that would be for the case where you are spawning the
> server yourself and the child process takes time to start up.

That's also what I think.
But that's strange, since:
- this applies to every client/server communication (why not do the
same for smtplib, telnetlib, etc.?)
- it goes against the classical connect() semantics
- some code may prefer to fail immediately (instead of "hammering" the
remote host) when the remote server is down or the address is
incorrect: such code can still catch ECONNREFUSED itself if it wants
to retry, with a retry policy of its own choosing (see the sketch
below)
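
For illustration, here is a minimal sketch of what such a caller could
do itself, assuming connect() would propagate ECONNREFUSED (as a
socket.error) once the built-in retry loop is removed; the helper name,
retry count and delay are made up:

    import errno
    import socket
    import time

    from multiprocessing.managers import BaseManager

    def connect_manager(address, authkey, retries=5, delay=0.5):
        # Hypothetical helper (not an existing multiprocessing API):
        # the retry policy is entirely up to the caller.
        manager = BaseManager(address=address, authkey=authkey)
        for attempt in range(retries):
            try:
                manager.connect()
                return manager
            except socket.error as e:
                if e.errno != errno.ECONNREFUSED or attempt == retries - 1:
                    raise
                time.sleep(delay)  # caller-chosen retry interval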
I removed the retry code and ran test_multiprocessing and
test_concurrent_futures in a loop, and didn't see any failure (on
Linux), so I'd say we could probably remove it.
OTOH, I would feel bad if this broke someone's code (even though code
relying on the automatic retries is probably broken).
So I'm +1 on removing the retry logic altogether, unless of course
someone comes up with a good reason to keep it (I dug a little
through the logs to see when this was introduced, but apparently it
was already there in the original import).
If we don't remove it, I agree we should at least reduce the overall
timeout and increase the interval between attempts (an exponential
backoff may be a bit overkill).
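
For concreteness, the two variants could be expressed along these
lines; a shorter deadline with a larger fixed interval versus
exponential backoff (names and numbers are purely illustrative):

    import time

    def retry_intervals(timeout=1.0, interval=0.1, backoff=1.0):
        # Yields the sleep to use before each new connection attempt.
        # backoff=1.0 gives a fixed interval (e.g. ~10 attempts in 1 s,
        # instead of ~2000 in 20 s); backoff > 1 gives exponential
        # backoff.
        deadline = time.time() + timeout
        while time.time() < deadline:
            yield interval
            interval *= backoff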

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13215>
_______________________________________

