[issue9377] socket, PEP 383: Mishandling of non-ASCII bytes in host/domain names

David Watson report at bugs.python.org
Wed Oct 20 21:37:24 CEST 2010


David Watson <baikie at users.sourceforge.net> added the comment:

I was looking at the MSDN pages linked to above, and these two
pages seemed to suggest that Unicode characters appearing in DNS
names represented UTF-8 sequences, and that Windows allowed such
non-ASCII byte sequences in the DNS by default:

http://msdn.microsoft.com/en-us/library/ms724220%28v=VS.85%29.aspx
http://msdn.microsoft.com/en-us/library/ms682032%28v=VS.85%29.aspx

(See the discussion of DNS_ERROR_NON_RFC_NAME in the latter.)
Can anyone confirm whether this is the case?

The BSD-style gethostname() function can't be returning UTF-8,
though, or else the "Nötkötti" example above would have been
decoded successfully, given that Python currently uses
PyUnicode_FromString().
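
To illustrate the reasoning, here is a rough Python-level sketch of a
strict UTF-8 decode, which is what PyUnicode_FromString() performs.
The Latin-1 byte string is only an assumption about what a non-UTF-8
hostname might look like; the actual bytes returned on the reporter's
machine are not known, and decodes_as_utf8() is a hypothetical helper.

```python
# Hypothetical byte strings for the name "Nötkötti" (ö is U+00F6).
# Which encoding Windows actually returned is not known; Latin-1 is
# used here purely as an example of a non-UTF-8 single-byte encoding.
NAME = "N\u00f6tk\u00f6tti"
raw_utf8 = NAME.encode("utf-8")      # b'N\xc3\xb6tk\xc3\xb6tti'
raw_latin1 = NAME.encode("latin-1")  # b'N\xf6tk\xf6tti'


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if raw is valid UTF-8 under strict decoding."""
    try:
        raw.decode("utf-8")  # strict error handling, the default
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(decodes_as_utf8(raw_utf8))
    print(decodes_as_utf8(raw_latin1))
```

If gethostname() had returned the UTF-8 form, the strict decode would
have succeeded; a failure implies some other byte encoding.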

Also, if GetComputerNameEx() only offers a choice of DNS names or
NetBIOS names, and both are byte-oriented underneath (that was my
reading of the "Computer Names" page), then presumably there
shouldn't be a problem with mapping the result to a bytes
equivalent when necessary?
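
As a sketch of what such a mapping could look like at the Python
level, the PEP 383 "surrogateescape" error handler lets arbitrary
bytes round-trip through str. The helper names and the choice of
UTF-8 as the codec are assumptions for illustration, not what the
socket module necessarily does.

```python
# Hypothetical helpers: map a byte-oriented host name to str and back
# without losing information, using the PEP 383 error handler.
def name_to_str(raw: bytes) -> str:
    # Undecodable bytes become lone surrogates U+DC80..U+DCFF.
    return raw.decode("utf-8", "surrogateescape")


def name_to_bytes(name: str) -> bytes:
    # Lone surrogates produced above are turned back into the
    # original bytes.
    return name.encode("utf-8", "surrogateescape")


if __name__ == "__main__":
    raw = b"N\xf6tk\xf6tti"  # example bytes that are not valid UTF-8
    text = name_to_str(raw)
    print(ascii(text))
    print(name_to_bytes(text) == raw)
```

The point is only that a byte-oriented name can always be recovered
from its str form, whatever encoding the bytes were really in.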

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9377>
_______________________________________

