[python3-ldap] Python3-ldap: proposal for IP connection fallback

python3ldap python3ldap at gmail.com
Tue Nov 18 23:16:34 CET 2014


Hello everybody,
I've received a request to implement a sort of fallback from IPv6 to
IPv4 in python3-ldap. The rationale is that when you specify a server
name, python3-ldap queries the DNS service to resolve it (via
getaddrinfo) and then tries to connect to the first address returned
(either IPv4 or IPv6). Chances are that you get a valid IPv6 address
but the "path" between the client and the server is not available
(maybe not all IPv6 trunks are functional), so you get a connection
error. The proposal is to implement a fallback to an IPv4 address, if
available.
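
To make the proposal concrete, this is roughly the kind of loop it
implies, sketched with plain sockets (only an illustration, not the
actual python3-ldap connection code; names are arbitrary):

    import socket

    def open_connection(hostname, port=389, timeout=5):
        # getaddrinfo may return both IPv6 and IPv4 addresses for the same
        # name; their order depends on the resolver and on gai.conf policies.
        addresses = socket.getaddrinfo(hostname, port, socket.AF_UNSPEC,
                                       socket.SOCK_STREAM)
        last_error = None
        for family, socktype, proto, _canonname, sockaddr in addresses:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            try:
                # On failure (e.g. an unreachable IPv6 path) fall through
                # to the next address, typically the IPv4 one.
                sock.connect(sockaddr)
                return sock
            except OSError as error:
                last_error = error
                sock.close()
        raise last_error or OSError('could not connect to %r' % hostname)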

I received this message from Brian, a valued contributor to the library:

****
Hello,

Generally speaking, it is standard practice to automatically fall back
to IPv4 if the server doesn't support IPv6. This means the client
doesn't need to know anything about the implementation status of the
server, and will automatically use IPv6 if/when it becomes available.

The standard behaviour, as implemented almost everywhere else, is that
it should automatically fall back to IPv4 if it gets connection
refused.

This IPv4 fallback has nothing to do with SRV records either.

I do not see any reason why a python ldap binding should be a special case.

However being able to override the default behaviour and force IPv6
only or force IPv4 only is also considered a good idea.

***
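
Forcing a single address family, as Brian suggests, could be exposed
simply by restricting the family parameter passed to getaddrinfo. A
minimal sketch (the option names below are placeholders, not an
existing python3-ldap feature):

    import socket

    # Hypothetical option values, not part of the current python3-ldap API.
    IP_SYSTEM_DEFAULT, IP_V4_ONLY, IP_V6_ONLY = range(3)

    def resolve(hostname, port=389, mode=IP_SYSTEM_DEFAULT):
        family = {IP_V4_ONLY: socket.AF_INET,
                  IP_V6_ONLY: socket.AF_INET6}.get(mode, socket.AF_UNSPEC)
        # AF_INET/AF_INET6 restricts the answers to one family;
        # AF_UNSPEC returns whatever the resolver prefers.
        return socket.getaddrinfo(hostname, port, family, socket.SOCK_STREAM)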

I also received this message from Robert, who helped me implement the
IPv6 portion of the connection:


****
Hi,

Ok, I stand corrected: after some research and analysis I think
fallback, even active by default, may be quite a good idea, but only
if implemented following something like the RFC 6555 recommendation
(the Happy Eyeballs algorithm).

The results of the first getaddrinfo call shouldn't be used for the
lifetime of the Server object, but rather cached only for a short time
(like one minute) and refreshed after that as needed. That way a
long-living application won't use outdated information from DNS, but
also won't waste cycles constantly calling the resolver unnecessarily.

If getaddrinfo caching is implemented, the Server object should
probably have an option of separate internal pools for IPv4 and IPv6
and select IPs in the RFC 6555 way. It should be possible to select
the current ROUND_ROBIN, RANDOM or FIRST strategies for each of them
independently. The FIRST strategy (probably the default if there's at
least one IPv6 address) would mean that connections are always tried
in the order returned by the most recent cached getaddrinfo call. It
especially makes sense in the following scenario (tested on Debian):
the server has two IPv6 addresses, a native 2a01::xx/16 one and a 6to4
one, plus one IPv4 address. According to the default contents of
/etc/gai.conf, clients having native IPv6 will prefer the server's
native address as well, then 6to4, then IPv4. Clients having a 6to4
address will prefer 6to4, then IPv4, then native IPv6. If both IPv6
addresses were put into the same pool and one of them were selected in
a round-robin way for the subsequent RFC 6555 procedure, then in the
latter case the connection would be tried to the least preferred
address half of the time. Unfortunately, getaddrinfo doesn't give any
information about how the addresses should be split into groups, so
there's no way of creating such pools automatically. For a hostname
having only multihomed IPv4, ROUND_ROBIN would be the default. I think
this would handle the two most common scenarios: load balancing for
IPv4, and graceful failover for dual-stack IPv4/IPv6. It would not
handle optimally situations where there are, for example, two native
IPv6 and two 6to4 addresses, but IMO such scenarios should be handled
by SRV records anyway.

Due to 1. and 2. I'm not sure if returning a ServerPool in the way
it's implemented now is a good idea. First, it would treat all
addresses the same, not in order of preference. Second, it may be
difficult to implement something like RFC 6555 properly on top of it.
Third, it would be more difficult to track DNS changes. Fourth, I'm
not sure if the Server constructor should sometimes return an object
of another class not even inherited from it. Of course you may have a
completely different implementation in mind than I guessed, so it may
be a moot point.

With RFC 6555 implemented I would drop the idea of
PREFER_IPV4|PREFER_IPV6 options and provide only IPV[46]ONLY. The
default strategy should be enough in most simple cases, and if a
short-term override is needed, the former options don't have any clear
advantage over the latter ones. And for advanced uses, tweaking system
settings gives more flexibility, which using PREFER options would only
limit.

Regarding SRV, I'm still thinking about how it should be implemented.
Probably the current Server class should be closer to the current
ServerPool implementation, maybe with an equivalent of the current
Server kept behind the scenes as a less exposed internal structure.
Btw. quick googling shows that at least OpenLDAP supports LDAP SRV
records and MSDN also mentions them, so they aren't a completely
exotic idea.
***
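
Robert's two main suggestions (a short-lived getaddrinfo cache and
separate per-family pools, each with its own selection strategy) could
look roughly like the sketch below. The class names, strategy names
and the one-minute TTL are only illustrative assumptions, not part of
the current python3-ldap API:

    import random
    import socket
    import time

    FIRST, ROUND_ROBIN, RANDOM = 'FIRST', 'ROUND_ROBIN', 'RANDOM'

    class AddressPool:
        """Addresses of one family, with their own selection strategy."""
        def __init__(self, strategy=FIRST):
            self.strategy = strategy
            self.addresses = []
            self._next = 0

        def choose(self):
            if not self.addresses:
                return None
            if self.strategy == FIRST:
                return self.addresses[0]
            if self.strategy == RANDOM:
                return random.choice(self.addresses)
            address = self.addresses[self._next % len(self.addresses)]
            self._next += 1
            return address

    class CachedResolver:
        """Re-resolves the host name at most every ttl seconds."""
        def __init__(self, hostname, port=389, ttl=60):
            self.hostname, self.port, self.ttl = hostname, port, ttl
            self.ipv6 = AddressPool(FIRST)        # keep getaddrinfo preference
            self.ipv4 = AddressPool(ROUND_ROBIN)  # load balancing for IPv4
            self._expires = 0

        def _refresh(self):
            if time.monotonic() < self._expires:
                return
            self.ipv4.addresses, self.ipv6.addresses = [], []
            infos = socket.getaddrinfo(self.hostname, self.port,
                                       socket.AF_UNSPEC, socket.SOCK_STREAM)
            for family, _type, _proto, _name, sockaddr in infos:
                pool = self.ipv6 if family == socket.AF_INET6 else self.ipv4
                pool.addresses.append(sockaddr)
            self._expires = time.monotonic() + self.ttl

        def candidates(self):
            # IPv6 first, IPv4 as the fallback, in the spirit of RFC 6555.
            self._refresh()
            return [a for a in (self.ipv6.choose(), self.ipv4.choose())
                    if a is not None]

A connection routine would then walk candidates() and move to the next
address on a connection error, while a long-running application would
automatically pick up DNS changes once the cached entries expire.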

The "happy eyeballs" (http://en.wikipedia.org/wiki/Happy_Eyeballs)
algorithm sends parallel SYN packet to both IPv4 and IPv6 addresses
and prefer IPv6 if available. This means that there is a little
overhead in the connection because one of the packet will be
discarded.
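
As an illustration only (the threads, the head-start delay and the
function name are my own assumptions, not necessarily what
python3-ldap will do), a minimal Happy Eyeballs-style connect could
look like:

    import queue
    import socket
    import threading

    def happy_eyeballs_connect(hostname, port=389, head_start=0.3,
                               timeout=10):
        """Prefer IPv6, racing IPv4 after a short head start (RFC 6555)."""
        results = queue.Queue()

        def attempt(family):
            try:
                infos = socket.getaddrinfo(hostname, port, family,
                                           socket.SOCK_STREAM)
                # sockaddr[:2] is the (host, port) pair for both families.
                results.put(socket.create_connection(infos[0][4][:2],
                                                     timeout))
            except OSError:
                results.put(None)  # this family failed

        threading.Thread(target=attempt, args=(socket.AF_INET6,),
                         daemon=True).start()
        pending = 1
        try:
            sock = results.get(timeout=head_start)  # IPv6 gets a head start
            pending -= 1
            if sock is not None:
                return sock
        except queue.Empty:
            pass  # IPv6 is still connecting, start racing IPv4

        threading.Thread(target=attempt, args=(socket.AF_INET,),
                         daemon=True).start()
        pending += 1
        while pending:
            try:
                sock = results.get(timeout=timeout)
            except queue.Empty:
                break
            pending -= 1
            if sock is not None:
                # The first successful connection wins; the losing attempt,
                # if it ever completes, is simply discarded.
                return sock
        raise OSError('could not connect to %r over IPv6 or IPv4' % hostname)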


Do you have any suggestions or ideas about this topic? Is it correct
to address it at the python3-ldap library level, or should it be
resolved at the TCP level?

Bye,
Giovanni

