[issue22450] urllib doesn't put Accept: */* in the headers

Raymond Hettinger report at bugs.python.org
Tue Aug 30 17:17:55 EDT 2016


Raymond Hettinger added the comment:

Update:  After more research, I learned that while 'Accept: */*' should not have an effect on the origin web server, it can and does have an effect on proxy servers.

Origin servers are allowed to vary the content-type of responses when given different Accept headers.  When they do so, they should also send "Vary: Accept".   

Proxy servers such as nginx and Varnish respond to the "Vary: Accept" by caching the different responses using a combination of the URL and the Accept header as the cache key.  If the request has 'Accept: */*', then the cache lookup returns the same result as if the 'Accept: */*' had been passed directly to the server.  However, if the Accept header is omitted, the proxy cache can return any of the cached responses (typically the most recent, regardless of content-type).
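
The lookup behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyVaryCache, its methods, and the example URL are hypothetical names invented for illustration, and the "most recent variant" fallback mirrors the description in this message rather than the actual code of any particular proxy.

```python
class ToyVaryCache:
    """Hypothetical toy cache keyed on (URL, Accept), as for 'Vary: Accept'."""

    def __init__(self):
        # url -> {accept header value: cached body}; dicts keep insertion order
        self._variants = {}

    def store(self, url, accept, body):
        variants = self._variants.setdefault(url, {})
        variants.pop(accept, None)   # re-insert so the newest variant is last
        variants[accept] = body

    def lookup(self, url, accept=None):
        variants = self._variants.get(url)
        if not variants:
            return None              # nothing cached for this URL
        if accept is not None:
            # Accept header present: exact match on the cache key
            return variants.get(accept)
        # Accept header omitted (assumed behavior): most recently stored variant
        return list(variants.values())[-1]


cache = ToyVaryCache()
url = "http://example.com/data"      # placeholder URL
cache.store(url, "*/*", "<html>default</html>")
cache.store(url, "application/json", '{"kind": "json"}')

with_accept = cache.lookup(url, "*/*")   # consistent: the '*/*' variant
without_accept = cache.lookup(url)       # whichever variant was stored last
```

In this model a request carrying 'Accept: */*' always gets the variant stored for that header, while a request without the header gets whatever another client happened to cache most recently.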

Accordingly, it is good practice to include 'Accept: */*' in the request so that you get a consistent result (what the server would have returned) rather than the inconsistent and unpredictable content-types you would receive in the absence of the Accept header.  I believe that is why the other tools and book examples use 'Accept: */*' even though the origin wouldn't care.
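
For reference, a caller can add the header explicitly with the existing urllib.request API. This is a minimal sketch, not a proposed patch: build_request is a hypothetical helper name and the URL is a placeholder. The request is only constructed here, not sent.

```python
import urllib.request


def build_request(url):
    """Hypothetical helper: build a Request that carries 'Accept: */*'."""
    return urllib.request.Request(url, headers={"Accept": "*/*"})


req = build_request("http://example.com/")   # placeholder URL
accept_value = req.get_header("Accept")
print(accept_value)
```

The resulting object can then be passed to urllib.request.urlopen(req) to perform the request with the header included.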

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue22450>
_______________________________________

