[issue9377] socket, PEP 383: Mishandling of non-ASCII bytes in host/domain names

David Watson report at bugs.python.org
Fri Oct 15 20:03:44 CEST 2010


David Watson <baikie at users.sourceforge.net> added the comment:

> As a further note: I think socket.gethostname() is a special case, since this is just about a local setting (i.e. not related to DNS).

But the hostname *is* commonly intended to be looked up in the
DNS or whatever name resolution mechanisms are used locally -
socket.getfqdn(), for instance, works by passing the result of
gethostname() to gethostbyaddr() (which, at the C level, calls
getaddrinfo() followed by gethostbyaddr()).  So I don't see the
rationale for treating it differently from the results of
gethostbyaddr(), getnameinfo(), etc.
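
To illustrate the point, here is a rough sketch of the lookup that
socket.getfqdn() performs on the local hostname.  It is not the
stdlib source: fqdn_sketch is a hypothetical name, and the two lookup
functions are passed in as parameters only so the flow can be tried
without touching a real resolver.

```python
import socket


def fqdn_sketch(name="", gethostname=socket.gethostname,
                gethostbyaddr=socket.gethostbyaddr):
    """Approximate how socket.getfqdn() resolves the local hostname.

    Hypothetical helper for illustration; the lookup functions are
    injectable so the flow can be exercised without a resolver.
    """
    name = name.strip()
    if not name or name == "0.0.0.0":
        # The value returned by gethostname() is what gets looked up.
        name = gethostname()
    try:
        hostname, aliases, _addrs = gethostbyaddr(name)
    except OSError:
        # Lookup failed: fall back to the unresolved name.
        return name
    # Prefer the first candidate that looks fully qualified.
    for candidate in [hostname] + list(aliases):
        if "." in candidate:
            return candidate
    return hostname
```

Whatever string gethostname() returns goes straight back into the
resolver here, which is why it needs to survive the trip unchanged.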

POSIX says of the name lookup functions that "in many cases" they
are implemented by the Domain Name System, not that they always
are, so a name intended for lookup need not be ASCII-only either.

> We should then assume that it is encoded in the locale encoding (in particular, that it is encoded in mbcs on Windows).

I can see the point of returning the characters that were
intended, but code that looked up the returned name would then
have to be changed to re-encode it to bytes to avoid the
round-tripping issue when non-ASCII characters are returned.
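
A small sketch of the round-tripping issue, under two assumptions of
mine: that names passed back to the lookup functions are encoded as
ASCII with the PEP 383 "surrogateescape" handler, and that Latin-1
stands in for the locale encoding.  The hostname bytes are made up.

```python
# Made-up hostname bytes containing one non-ASCII byte (0xF6).
raw = b"h\xf6st.example"

# PEP 383 style: undecodable bytes become lone surrogates and are
# restored exactly when the string is encoded the same way.
escaped = raw.decode("ascii", "surrogateescape")
assert escaped.encode("ascii", "surrogateescape") == raw

# Locale style (Latin-1 assumed here): the intended character is
# recovered, but it is a real character, not an escaped byte.
as_locale = raw.decode("latin-1")

try:
    as_locale.encode("ascii", "surrogateescape")
    locale_roundtrips = True
except UnicodeEncodeError:
    # A real non-ASCII character cannot be encoded this way.
    locale_roundtrips = False

# Re-encoding with some other encoding yields different bytes.
reencoded_utf8 = as_locale.encode("utf-8")
```

So a caller holding the locale-decoded name has to re-encode it to
bytes itself, with the right encoding, to get the original name back.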

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9377>
_______________________________________

