[docs] Remove “Content-Type: application/x-www-form-urlencoded; charset” advice (issue 25576)

vadmium+py at gmail.com
Sun Nov 8 18:08:29 EST 2015


Reviewers: r.david.murray


http://bugs.python.org/review/25576/diff/15901/Doc/library/urllib.request.rst
File Doc/library/urllib.request.rst (right):

http://bugs.python.org/review/25576/diff/15901/Doc/library/urllib.request.rst#newcode1245
Doc/library/urllib.request.rst:1245: >>> with urllib.request.urlopen("http://requestb.in/xrbl82xr", data) as f:
On 2015/11/08 18:24:56, r.david.murray wrote:
> We lose something from the example no longer showing a use of
> add_header.  Is there any other header we could add that would make
> sense in context?  I don't think it is a big deal, though.

Nothing immediately comes to mind for this particular example. It might
be better to illustrate one concept per example, rather than adding in
unrelated headers at the same time. There is already an example at line
1209 setting Referer with add_header(). Some more options:

* Perhaps there should be a link to
<https://docs.python.org/dev/howto/urllib2.html#headers>, which has
example code for setting User-Agent via the Request constructor
parameter.

* In my patch at Issue 23360 I proposed setting the Content-Type in the
examples where data is not in urlencoded format, although again using
the constructor rather than add_header().

* Perhaps we can resolve Issue 25570 with an example setting User-Agent
via add_header().
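
For concreteness, here is a rough sketch of what the three options above
might look like. It is only an illustration, not text from any of the
patches: the "ExampleClient/1.0" value, the JSON body and the
application/json Content-Type are assumptions chosen for the sketch, and
no request is actually sent.

```python
import json
import urllib.request

URL = "http://www.example.com/"  # placeholder URL; nothing is sent to it

# Option 1: pass headers through the Request constructor.
# "ExampleClient/1.0" is a made-up User-Agent value.
req_ctor = urllib.request.Request(
    URL, headers={"User-Agent": "ExampleClient/1.0"}
)

# Option 2: data that is not urlencoded (JSON is an assumed example),
# so Content-Type is set explicitly, again via the constructor.
body = json.dumps({"spam": 1}).encode("utf-8")
req_json = urllib.request.Request(
    URL, data=body, headers={"Content-Type": "application/json"}
)

# Option 3: add the header after construction with add_header().
req_added = urllib.request.Request(URL)
req_added.add_header("User-Agent", "ExampleClient/1.0")

# Request stores header names using str.capitalize(), so they are
# looked up as "User-agent" and "Content-type".
print(req_ctor.get_header("User-agent"))
print(req_added.get_header("User-agent"))
print(req_json.get_header("Content-type"))
print(req_json.get_method())  # POST, because data was supplied
```

Options 1 and 3 end up with the same stored header, so the choice between
them is mostly about which API the example is meant to illustrate.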



Please review this at http://bugs.python.org/review/25576/

Affected files:
  Doc/library/urllib.parse.rst
  Doc/library/urllib.request.rst
  Lib/urllib/request.py


# HG changeset patch
# Parent  4df1eaecb506507e5d613f8601b48057e7f328a8
Remove advice about setting charset with application/x-www-form-urlencoded

No charset parameter is standardized for this Content-Type value. Also
clarify that urlencode() outputs ASCII.

diff -r 4df1eaecb506 -r 6ff71efb3b5b Doc/library/urllib.parse.rst
--- a/Doc/library/urllib.parse.rst	Sat Nov 07 03:15:32 2015 +0000
+++ b/Doc/library/urllib.parse.rst	Sat Nov 07 08:36:34 2015 +0000
@@ -535,10 +535,11 @@
                         errors=None, quote_via=quote_plus)
 
    Convert a mapping object or a sequence of two-element tuples, which may
-   contain :class:`str` or :class:`bytes` objects, to a "percent-encoded"
-   string.  If the resultant string is to be used as a *data* for POST
-   operation with :func:`~urllib.request.urlopen` function, then it should be
-   properly encoded to bytes, otherwise it would result in a :exc:`TypeError`.
+   contain :class:`str` or :class:`bytes` objects, to a percent-encoded ASCII
+   text string.  If the resultant string is to be used as a *data* for POST
+   operation with the :func:`~urllib.request.urlopen` function, then
+   it should be encoded to bytes, otherwise it would result in a
+   :exc:`TypeError`.
 
    The resulting string is a series of ``key=value`` pairs separated by ``'&'``
    characters, where both *key* and *value* are quoted using the *quote_via*
diff -r 4df1eaecb506 -r 6ff71efb3b5b Doc/library/urllib.request.rst
--- a/Doc/library/urllib.request.rst	Sat Nov 07 03:15:32 2015 +0000
+++ b/Doc/library/urllib.request.rst	Sat Nov 07 08:36:34 2015 +0000
@@ -36,13 +36,8 @@
    *data* should be a buffer in the standard
    :mimetype:`application/x-www-form-urlencoded` format.  The
    :func:`urllib.parse.urlencode` function takes a mapping or sequence of
-   2-tuples and returns a string in this format. It should be encoded to bytes
-   before being used as the *data* parameter. The charset parameter in
-   ``Content-Type`` header may be used to specify the encoding. If charset
-   parameter is not sent with the Content-Type header, the server following the
-   HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
-   encoding. It is advisable to use charset parameter with encoding used in
-   ``Content-Type`` header with the :class:`Request`.
+   2-tuples and returns an ASCII text string in this format. It should
+   be encoded to bytes before being used as the *data* parameter.
 
    urllib.request module uses HTTP/1.1 and includes ``Connection:close`` header
    in its HTTP requests.
@@ -180,16 +175,9 @@
    the only ones that use *data*; the HTTP request will be a POST instead of a
    GET when the *data* parameter is provided.  *data* should be a buffer in the
    standard :mimetype:`application/x-www-form-urlencoded` format.
-
    The :func:`urllib.parse.urlencode` function takes a mapping or sequence of
-   2-tuples and returns a string in this format. It should be encoded to bytes
-   before being used as the *data* parameter. The charset parameter in
-   ``Content-Type`` header may be used to specify the encoding. If charset
-   parameter is not sent with the Content-Type header, the server following the
-   HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
-   encoding. It is advisable to use charset parameter with encoding used in
-   ``Content-Type`` header with the :class:`Request`.
-
+   2-tuples and returns an ASCII string in this format. It should be
+   encoded to bytes before being used as the *data* parameter.
 
    *headers* should be a dictionary, and will be treated as if
    :meth:`add_header` was called with each key and value as arguments.
@@ -202,7 +190,7 @@
    ``"Python-urllib/2.6"`` (on Python 2.6).
 
    An example of using ``Content-Type`` header with *data* argument would be
-   sending a dictionary like ``{"Content-Type":" application/x-www-form-urlencoded;charset=utf-8"}``.
+   sending a dictionary like ``{"Content-Type": "application/x-www-form-urlencoded"}``.
 
    The final two arguments are only of interest for correct handling
    of third-party HTTP cookies:
@@ -1230,7 +1218,7 @@
    opener.open('http://www.example.com/')
 
 Also, remember that a few standard headers (:mailheader:`Content-Length`,
-:mailheader:`Content-Type` without charset parameter and :mailheader:`Host`)
+:mailheader:`Content-Type` and :mailheader:`Host`)
 are added when the :class:`Request` is passed to :func:`urlopen` (or
 :meth:`OpenerDirector.open`).
 
@@ -1253,11 +1241,8 @@
    >>> import urllib.request
    >>> import urllib.parse
    >>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
-   >>> data = data.encode('utf-8')
-   >>> request = urllib.request.Request("http://requestb.in/xrbl82xr")
-   >>> # adding charset parameter to the Content-Type header.
-   >>> request.add_header("Content-Type","application/x-www-form-urlencoded;charset=utf-8")
-   >>> with urllib.request.urlopen(request, data) as f:
+   >>> data = data.encode('ascii')
+   >>> with urllib.request.urlopen("http://requestb.in/xrbl82xr", data) as f:
    ...     print(f.read().decode('utf-8'))
    ...
 
diff -r 4df1eaecb506 -r 6ff71efb3b5b Lib/urllib/request.py
--- a/Lib/urllib/request.py	Sat Nov 07 03:15:32 2015 +0000
+++ b/Lib/urllib/request.py	Sat Nov 07 08:36:34 2015 +0000
@@ -149,13 +149,8 @@
 
     *data* should be a buffer in the standard application/x-www-form-urlencoded
     format. The urllib.parse.urlencode() function takes a mapping or sequence
-    of 2-tuples and returns a string in this format. It should be encoded to
-    bytes before being used as the data parameter. The charset parameter in
-    Content-Type header may be used to specify the encoding. If charset
-    parameter is not sent with the Content-Type header, the server following
-    the HTTP 1.1 recommendation may assume that the data is encoded in
-    ISO-8859-1 encoding. It is advisable to use charset parameter with encoding
-    used in Content-Type header with the Request.
+    of 2-tuples and returns an ASCII text string in this format. It should be
+    encoded to bytes before being used as the data parameter.
 
     urllib.request module uses HTTP/1.1 and includes a "Connection:close"
     header in its HTTP requests.



