[issue21069] test_fileno of test_urllibnet intermittently fails when using www.example.com

Thu Mar 27 11:48:40 CET 2014

Ned Deily added the comment:

After looking at why the 2.7 version of the test does not fail, the problem became apparent.  In 2.7, test_fileno tests urlopen() of the original deprecated urllib module.  In 3.x, the test was ported over but now uses urlopen() of urllib.request, which is based on the urllib2 module of 2.x.

2.7:
>>> x = urllib.urlopen("http://www.example.com")
[79234 refs]
>>> x
<addinfourl at 3068742324L whose fp = <socket._fileobject object at 0xb6e7eea4>>
[79234 refs]
>>> os.fdopen(x.fileno()).read()
'<!doctype html>\n<html>\n<head>\n    <title>Example Domain</title>\n\n    <meta charset="utf-8" />\n    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />\n    <meta name="viewport" content="width=device-width, initial-scale=1" />\n    <style type="text/css">\n    body {\n        background-color: #f0f0f2;\n        margin: 0;\n        padding: 0;\n        font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;\n        \n    }\n    div {\n        width: 600px;\n        margin: 5em auto;\n        padding: 50px;\n        background-color: #fff;\n        border-radius: 1em;\n    }\n    a:link, a:visited {\n        color: #38488f;\n        text-decoration: none;\n    }\n    @media (max-width: 700px) {\n        body {\n            background-color: #fff;\n        }\n        div {\n            width: auto;\n            margin: 0 auto;\n            border-radius: 0;\n            padding: 1em;\n        }\n    }\n    </style>    \n</head>\n\n<body>\n<div>\n    <h1>Example Domain</h1>\n    <p>This domain is established to be used for illustrative examples in documents. You may use this\n    domain in examples without prior coordination or asking for permission.</p>\n    <p><a href="http://www.iana.org/domains/example">More information...</a></p>\n</div>\n</body>\n</html>\n'
[79234 refs]

3.4 (when the read doesn't fail):
>>> x = urllib.request.urlopen("http://www.example.com")
>>> x
<http.client.HTTPResponse object at 0xb6bc7114>
>>> os.fdopen(x.fileno()).read()
__main__:1: ResourceWarning: unclosed file <_io.TextIOWrapper name=4 mode='r' encoding='UTF-8'>
' without prior coordination or asking for permission.</p>\n    <p><a href="http://www.iana.org/domains/example">More information...</a></p>\n</div>\n</body>\n</html>\n'

In the 3.x case (and the 2.7 urllib2 case), the read from the file descriptor starts mid-response or at the end of the response (returning an empty byte string).  In the past, the test passed because of the amount of data returned by the previous test URL.  Now, with the short response from www.example.com, it's clear that the file descriptor read is not returning the whole response.  I don't know whether the file descriptor read is expected to be meaningful for urllib2/urllib.request.
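
One plausible explanation, stated here as an assumption rather than something confirmed above, is that the response object reads the socket through a buffered file object, so bytes already pulled into that buffer are no longer available on the raw descriptor. The sketch below reproduces that effect without any network access, using an os.pipe() in place of the socket. The function name read_after_buffering, the payload, and the tiny buffer size are all hypothetical and chosen only for illustration.

```python
import io
import os


def read_after_buffering(payload, buffer_size=16):
    """Show that a raw fd read misses bytes a buffered reader already took.

    Hypothetical illustration only: a pipe stands in for the HTTP socket,
    and the first line of the payload stands in for the response headers.
    """
    read_fd, write_fd = os.pipe()
    os.write(write_fd, payload)
    os.close(write_fd)  # the reader will see EOF once the data is drained

    raw = io.FileIO(read_fd, "rb", closefd=False)
    buffered = io.BufferedReader(raw, buffer_size=buffer_size)

    # Reading one line fills the buffer, which may pull in more than that line.
    first_line = buffered.readline()

    # A direct read on the descriptor only sees what the buffer did not take.
    from_fd = os.read(read_fd, 65536)

    # Whatever the buffer had read ahead is only reachable through the buffer.
    from_buffer = buffered.read()

    os.close(read_fd)
    return first_line, from_buffer, from_fd


if __name__ == "__main__":
    data = b"HEAD\n" + b"0123456789" * 5
    first_line, from_buffer, from_fd = read_after_buffering(data)
    print("line read via buffer:", first_line)
    print("read ahead by buffer:", from_buffer)
    print("left on the raw fd:  ", from_fd)
```

With a short payload the buffer can swallow everything, leaving the descriptor at end of file; with a longer one the descriptor read starts partway through. That matches the two behaviours described above, but it does not establish that this is what http.client actually does.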

Senthil, what do you think?

----------
nosy: +orsenthil

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue21069>
_______________________________________