[Python-checkins] cpython (2.7): Issue #17844: Refactor a documentation of Python specific encodings.

serhiy.storchaka python-checkins at python.org
Wed May 22 14:36:18 CEST 2013


http://hg.python.org/cpython/rev/85c04fdaa404
changeset:   83887:85c04fdaa404
branch:      2.7
user:        Serhiy Storchaka <storchaka at gmail.com>
date:        Wed May 22 15:28:30 2013 +0300
summary:
  Issue #17844: Refactor a documentation of Python specific encodings.
Add links to encoders and decoders for binary-to-binary codecs.

files:
  Doc/library/codecs.rst |  174 ++++++++++++++++------------
  Misc/NEWS              |    7 +-
  2 files changed, 105 insertions(+), 76 deletions(-)


diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -1098,88 +1098,112 @@
 | utf_8_sig       |                                | all languages                  |
 +-----------------+--------------------------------+--------------------------------+
 
-A number of codecs are specific to Python, so their codec names have no meaning
-outside Python. Some of them don't convert from Unicode strings to byte strings,
-but instead use the property of the Python codecs machinery that any bijective
-function with one argument can be considered as an encoding.
+Python Specific Encodings
+-------------------------
 
-For the codecs listed below, the result in the "encoding" direction is always a
-byte string. The result of the "decoding" direction is listed as operand type in
-the table.
+A number of predefined codecs are specific to Python, so their codec names have
+no meaning outside Python.  These are listed in the tables below based on the
+expected input and output types (note that while text encodings are the most
+common use case for codecs, the underlying codec infrastructure supports
+arbitrary data transforms rather than just text encodings).  For asymmetric
+codecs, the stated purpose describes the encoding direction.
 
-.. tabularcolumns:: |l|p{0.3\linewidth}|l|p{0.3\linewidth}|
+The following codecs provide unicode-to-str encoding [#encoding-note]_ and
+str-to-unicode decoding [#decoding-note]_, similar to the Unicode text
+encodings.
 
-+--------------------+---------------------------+----------------+---------------------------+
-| Codec              | Aliases                   | Operand type   | Purpose                   |
-+====================+===========================+================+===========================+
-| base64_codec       | base64, base-64           | byte string    | Convert operand to MIME   |
-|                    |                           |                | base64 (the result always |
-|                    |                           |                | includes a trailing       |
-|                    |                           |                | ``'\n'``)                 |
-+--------------------+---------------------------+----------------+---------------------------+
-| bz2_codec          | bz2                       | byte string    | Compress the operand      |
-|                    |                           |                | using bz2                 |
-+--------------------+---------------------------+----------------+---------------------------+
-| hex_codec          | hex                       | byte string    | Convert operand to        |
-|                    |                           |                | hexadecimal               |
-|                    |                           |                | representation, with two  |
-|                    |                           |                | digits per byte           |
-+--------------------+---------------------------+----------------+---------------------------+
-| idna               |                           | Unicode string | Implements :rfc:`3490`,   |
-|                    |                           |                | see also                  |
-|                    |                           |                | :mod:`encodings.idna`     |
-+--------------------+---------------------------+----------------+---------------------------+
-| mbcs               | dbcs                      | Unicode string | Windows only: Encode      |
-|                    |                           |                | operand according to the  |
-|                    |                           |                | ANSI codepage (CP_ACP)    |
-+--------------------+---------------------------+----------------+---------------------------+
-| palmos             |                           | Unicode string | Encoding of PalmOS 3.5    |
-+--------------------+---------------------------+----------------+---------------------------+
-| punycode           |                           | Unicode string | Implements :rfc:`3492`    |
-+--------------------+---------------------------+----------------+---------------------------+
-| quopri_codec       | quopri, quoted-printable, | byte string    | Convert operand to MIME   |
-|                    | quotedprintable           |                | quoted printable          |
-+--------------------+---------------------------+----------------+---------------------------+
-| raw_unicode_escape |                           | Unicode string | Produce a string that is  |
-|                    |                           |                | suitable as raw Unicode   |
-|                    |                           |                | literal in Python source  |
-|                    |                           |                | code                      |
-+--------------------+---------------------------+----------------+---------------------------+
-| rot_13             | rot13                     | Unicode string | Returns the Caesar-cypher |
-|                    |                           |                | encryption of the operand |
-+--------------------+---------------------------+----------------+---------------------------+
-| string_escape      |                           | byte string    | Produce a string that is  |
-|                    |                           |                | suitable as string        |
-|                    |                           |                | literal in Python source  |
-|                    |                           |                | code                      |
-+--------------------+---------------------------+----------------+---------------------------+
-| undefined          |                           | any            | Raise an exception for    |
-|                    |                           |                | all conversions. Can be   |
-|                    |                           |                | used as the system        |
-|                    |                           |                | encoding if no automatic  |
-|                    |                           |                | :term:`coercion` between  |
-|                    |                           |                | byte and Unicode strings  |
-|                    |                           |                | is desired.               |
-+--------------------+---------------------------+----------------+---------------------------+
-| unicode_escape     |                           | Unicode string | Produce a string that is  |
-|                    |                           |                | suitable as Unicode       |
-|                    |                           |                | literal in Python source  |
-|                    |                           |                | code                      |
-+--------------------+---------------------------+----------------+---------------------------+
-| unicode_internal   |                           | Unicode string | Return the internal       |
-|                    |                           |                | representation of the     |
-|                    |                           |                | operand                   |
-+--------------------+---------------------------+----------------+---------------------------+
-| uu_codec           | uu                        | byte string    | Convert the operand using |
-|                    |                           |                | uuencode                  |
-+--------------------+---------------------------+----------------+---------------------------+
-| zlib_codec         | zip, zlib                 | byte string    | Compress the operand      |
-|                    |                           |                | using gzip                |
-+--------------------+---------------------------+----------------+---------------------------+
+.. tabularcolumns:: |l|L|L|
+
++--------------------+---------------------------+---------------------------+
+| Codec              | Aliases                   | Purpose                   |
++====================+===========================+===========================+
+| idna               |                           | Implements :rfc:`3490`,   |
+|                    |                           | see also                  |
+|                    |                           | :mod:`encodings.idna`     |
++--------------------+---------------------------+---------------------------+
+| mbcs               | dbcs                      | Windows only: Encode      |
+|                    |                           | operand according to the  |
+|                    |                           | ANSI codepage (CP_ACP)    |
++--------------------+---------------------------+---------------------------+
+| palmos             |                           | Encoding of PalmOS 3.5    |
++--------------------+---------------------------+---------------------------+
+| punycode           |                           | Implements :rfc:`3492`    |
++--------------------+---------------------------+---------------------------+
+| raw_unicode_escape |                           | Produce a string that is  |
+|                    |                           | suitable as raw Unicode   |
+|                    |                           | literal in Python source  |
+|                    |                           | code                      |
++--------------------+---------------------------+---------------------------+
+| rot_13             | rot13                     | Returns the Caesar-cypher |
+|                    |                           | encryption of the operand |
++--------------------+---------------------------+---------------------------+
+| undefined          |                           | Raise an exception for    |
+|                    |                           | all conversions. Can be   |
+|                    |                           | used as the system        |
+|                    |                           | encoding if no automatic  |
+|                    |                           | :term:`coercion` between  |
+|                    |                           | byte and Unicode strings  |
+|                    |                           | is desired.               |
++--------------------+---------------------------+---------------------------+
+| unicode_escape     |                           | Produce a string that is  |
+|                    |                           | suitable as Unicode       |
+|                    |                           | literal in Python source  |
+|                    |                           | code                      |
++--------------------+---------------------------+---------------------------+
+| unicode_internal   |                           | Return the internal       |
+|                    |                           | representation of the     |
+|                    |                           | operand                   |
++--------------------+---------------------------+---------------------------+
 
 .. versionadded:: 2.3
    The ``idna`` and ``punycode`` encodings.
 
+The following codecs provide str-to-str encoding and decoding
+[#decoding-note]_.
+
+.. tabularcolumns:: |l|L|L|L|
+
++--------------------+---------------------------+---------------------------+------------------------------+
+| Codec              | Aliases                   | Purpose                   | Encoder/decoder              |
++====================+===========================+===========================+==============================+
+| base64_codec       | base64, base-64           | Convert operand to MIME   | :meth:`base64.b64encode`,    |
+|                    |                           | base64 (the result always | :meth:`base64.b64decode`     |
+|                    |                           | includes a trailing       |                              |
+|                    |                           | ``'\n'``)                 |                              |
++--------------------+---------------------------+---------------------------+------------------------------+
+| bz2_codec          | bz2                       | Compress the operand      | :meth:`bz2.compress`,        |
+|                    |                           | using bz2                 | :meth:`bz2.decompress`       |
++--------------------+---------------------------+---------------------------+------------------------------+
+| hex_codec          | hex                       | Convert operand to        | :meth:`base64.b16encode`,    |
+|                    |                           | hexadecimal               | :meth:`base64.b16decode`     |
+|                    |                           | representation, with two  |                              |
+|                    |                           | digits per byte           |                              |
++--------------------+---------------------------+---------------------------+------------------------------+
+| quopri_codec       | quopri, quoted-printable, | Convert operand to MIME   | :meth:`quopri.encodestring`, |
+|                    | quotedprintable           | quoted printable          | :meth:`quopri.decodestring`  |
++--------------------+---------------------------+---------------------------+------------------------------+
+| string_escape      |                           | Produce a string that is  |                              |
+|                    |                           | suitable as string        |                              |
+|                    |                           | literal in Python source  |                              |
+|                    |                           | code                      |                              |
++--------------------+---------------------------+---------------------------+------------------------------+
+| uu_codec           | uu                        | Convert the operand using | :meth:`uu.encode`,           |
+|                    |                           | uuencode                  | :meth:`uu.decode`            |
++--------------------+---------------------------+---------------------------+------------------------------+
+| zlib_codec         | zip, zlib                 | Compress the operand      | :meth:`zlib.compress`,       |
+|                    |                           | using gzip                | :meth:`zlib.decompress`      |
++--------------------+---------------------------+---------------------------+------------------------------+
+
+.. [#encoding-note] str objects are also accepted as input in place of unicode
+   objects.  They are implicitly converted to unicode by decoding them using
+   the default encoding.  If this conversion fails, it may lead to encoding
+   operations raising :exc:`UnicodeDecodeError`.
+
+.. [#decoding-note] unicode objects are also accepted as input in place of str
+   objects.  They are implicitly converted to str by encoding them using the
+   default encoding.  If this conversion fails, it may lead to decoding
+   operations raising :exc:`UnicodeEncodeError`.
+
 
 :mod:`encodings.idna` --- Internationalized Domain Names in Applications
 ------------------------------------------------------------------------
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -26,12 +26,17 @@
 
 - Issue #14146: Highlight source line while debugging on Windows.
 
-
 Tests
 -----
 
 - Issue #11995: test_pydoc doesn't import all sys.path modules anymore.
 
+Documentation
+-------------
+
+- Issue #17844: Refactor a documentation of Python specific encodings.
+  Add links to encoders and decoders for binary-to-binary codecs.
+
 
 What's New in Python 2.7.5?
 ===========================
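
As a rough illustration of the tables added in this changeset (not part of the commit itself), the sketch below exercises a few of the listed codecs through ``codecs.encode`` and ``codecs.decode`` and compares them with the functions named in the new "Encoder/decoder" column. It assumes an interpreter where the binary codecs can be looked up by their full names (Python 2.7, or Python 3.2 and later). The sample data and variable names are made up for the example. The linked functions are closely related to the codecs but not always byte-for-byte identical: ``base64_codec`` appends a newline and ``hex_codec`` emits lowercase digits.

```python
import base64
import codecs
import zlib

data = b"hello"  # sample payload, chosen only for this example

# base64_codec: the codec output carries a trailing newline,
# whereas base64.b64encode() does not add one.
b64_via_codec = codecs.encode(data, "base64_codec")
b64_via_module = base64.b64encode(data)
assert b64_via_codec == b"aGVsbG8=\n"
assert b64_via_module == b"aGVsbG8="
assert base64.b64decode(b64_via_codec) == data

# hex_codec: two hex digits per byte, in lowercase;
# base64.b16encode() produces the uppercase form.
hex_via_codec = codecs.encode(b"\x00\xff", "hex_codec")
assert hex_via_codec == b"00ff"
assert base64.b16encode(b"\x00\xff") == b"00FF"
assert codecs.decode(hex_via_codec, "hex_codec") == b"\x00\xff"

# zlib_codec: the codec and the zlib module read each other's output.
payload = data * 100
zipped_via_codec = codecs.encode(payload, "zlib_codec")
assert zlib.decompress(zipped_via_codec) == payload
assert codecs.decode(zlib.compress(payload), "zlib_codec") == payload

# punycode (from the text codec table): Unicode text in, bytes out.
puny = u"b\xfccher".encode("punycode")
assert puny == b"bcher-kva"
assert puny.decode("punycode") == u"b\xfccher"
```

The round-trip checks are the dependable part of this sketch. Comparing compressed bytes for exact equality is deliberately avoided, since that depends on the compression level in use.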

-- 
Repository URL: http://hg.python.org/cpython

