[Python-checkins] cpython (2.7): Issue #17844: Refactor a documentation of Python specific encodings.
serhiy.storchaka
python-checkins at python.org
Wed May 22 14:36:18 CEST 2013
http://hg.python.org/cpython/rev/85c04fdaa404
changeset: 83887:85c04fdaa404
branch: 2.7
user: Serhiy Storchaka <storchaka at gmail.com>
date: Wed May 22 15:28:30 2013 +0300
summary:
Issue #17844: Refactor a documentation of Python specific encodings.
Add links to encoders and decoders for binary-to-binary codecs.
files:
Doc/library/codecs.rst | 174 ++++++++++++++++------------
Misc/NEWS | 7 +-
2 files changed, 105 insertions(+), 76 deletions(-)
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -1098,88 +1098,112 @@
| utf_8_sig       |                                | all languages                  |
+-----------------+--------------------------------+--------------------------------+
-A number of codecs are specific to Python, so their codec names have no meaning
-outside Python. Some of them don't convert from Unicode strings to byte strings,
-but instead use the property of the Python codecs machinery that any bijective
-function with one argument can be considered as an encoding.
+Python Specific Encodings
+-------------------------
-For the codecs listed below, the result in the "encoding" direction is always a
-byte string. The result of the "decoding" direction is listed as operand type in
-the table.
+A number of predefined codecs are specific to Python, so their codec names have
+no meaning outside Python. These are listed in the tables below based on the
+expected input and output types (note that while text encodings are the most
+common use case for codecs, the underlying codec infrastructure supports
+arbitrary data transforms rather than just text encodings). For asymmetric
+codecs, the stated purpose describes the encoding direction.
-.. tabularcolumns:: |l|p{0.3\linewidth}|l|p{0.3\linewidth}|
+The following codecs provide unicode-to-str encoding [#encoding-note]_ and
+str-to-unicode decoding [#decoding-note]_, similar to the Unicode text
+encodings.
-+--------------------+---------------------------+----------------+---------------------------+
-| Codec | Aliases | Operand type | Purpose |
-+====================+===========================+================+===========================+
-| base64_codec | base64, base-64 | byte string | Convert operand to MIME |
-| | | | base64 (the result always |
-| | | | includes a trailing |
-| | | | ``'\n'``) |
-+--------------------+---------------------------+----------------+---------------------------+
-| bz2_codec | bz2 | byte string | Compress the operand |
-| | | | using bz2 |
-+--------------------+---------------------------+----------------+---------------------------+
-| hex_codec | hex | byte string | Convert operand to |
-| | | | hexadecimal |
-| | | | representation, with two |
-| | | | digits per byte |
-+--------------------+---------------------------+----------------+---------------------------+
-| idna | | Unicode string | Implements :rfc:`3490`, |
-| | | | see also |
-| | | | :mod:`encodings.idna` |
-+--------------------+---------------------------+----------------+---------------------------+
-| mbcs | dbcs | Unicode string | Windows only: Encode |
-| | | | operand according to the |
-| | | | ANSI codepage (CP_ACP) |
-+--------------------+---------------------------+----------------+---------------------------+
-| palmos | | Unicode string | Encoding of PalmOS 3.5 |
-+--------------------+---------------------------+----------------+---------------------------+
-| punycode | | Unicode string | Implements :rfc:`3492` |
-+--------------------+---------------------------+----------------+---------------------------+
-| quopri_codec | quopri, quoted-printable, | byte string | Convert operand to MIME |
-| | quotedprintable | | quoted printable |
-+--------------------+---------------------------+----------------+---------------------------+
-| raw_unicode_escape | | Unicode string | Produce a string that is |
-| | | | suitable as raw Unicode |
-| | | | literal in Python source |
-| | | | code |
-+--------------------+---------------------------+----------------+---------------------------+
-| rot_13 | rot13 | Unicode string | Returns the Caesar-cypher |
-| | | | encryption of the operand |
-+--------------------+---------------------------+----------------+---------------------------+
-| string_escape | | byte string | Produce a string that is |
-| | | | suitable as string |
-| | | | literal in Python source |
-| | | | code |
-+--------------------+---------------------------+----------------+---------------------------+
-| undefined | | any | Raise an exception for |
-| | | | all conversions. Can be |
-| | | | used as the system |
-| | | | encoding if no automatic |
-| | | | :term:`coercion` between |
-| | | | byte and Unicode strings |
-| | | | is desired. |
-+--------------------+---------------------------+----------------+---------------------------+
-| unicode_escape | | Unicode string | Produce a string that is |
-| | | | suitable as Unicode |
-| | | | literal in Python source |
-| | | | code |
-+--------------------+---------------------------+----------------+---------------------------+
-| unicode_internal | | Unicode string | Return the internal |
-| | | | representation of the |
-| | | | operand |
-+--------------------+---------------------------+----------------+---------------------------+
-| uu_codec | uu | byte string | Convert the operand using |
-| | | | uuencode |
-+--------------------+---------------------------+----------------+---------------------------+
-| zlib_codec | zip, zlib | byte string | Compress the operand |
-| | | | using gzip |
-+--------------------+---------------------------+----------------+---------------------------+
+.. tabularcolumns:: |l|L|L|
+
++--------------------+---------------------------+---------------------------+
+| Codec | Aliases | Purpose |
++====================+===========================+===========================+
+| idna | | Implements :rfc:`3490`, |
+| | | see also |
+| | | :mod:`encodings.idna` |
++--------------------+---------------------------+---------------------------+
+| mbcs | dbcs | Windows only: Encode |
+| | | operand according to the |
+| | | ANSI codepage (CP_ACP) |
++--------------------+---------------------------+---------------------------+
+| palmos | | Encoding of PalmOS 3.5 |
++--------------------+---------------------------+---------------------------+
+| punycode | | Implements :rfc:`3492` |
++--------------------+---------------------------+---------------------------+
+| raw_unicode_escape | | Produce a string that is |
+| | | suitable as raw Unicode |
+| | | literal in Python source |
+| | | code |
++--------------------+---------------------------+---------------------------+
+| rot_13 | rot13 | Returns the Caesar-cypher |
+| | | encryption of the operand |
++--------------------+---------------------------+---------------------------+
+| undefined | | Raise an exception for |
+| | | all conversions. Can be |
+| | | used as the system |
+| | | encoding if no automatic |
+| | | :term:`coercion` between |
+| | | byte and Unicode strings |
+| | | is desired. |
++--------------------+---------------------------+---------------------------+
+| unicode_escape | | Produce a string that is |
+| | | suitable as Unicode |
+| | | literal in Python source |
+| | | code |
++--------------------+---------------------------+---------------------------+
+| unicode_internal | | Return the internal |
+| | | representation of the |
+| | | operand |
++--------------------+---------------------------+---------------------------+
.. versionadded:: 2.3
The ``idna`` and ``punycode`` encodings.
+The following codecs provide str-to-str encoding and decoding
+[#decoding-note]_.
+
+.. tabularcolumns:: |l|L|L|L|
+
++--------------------+---------------------------+---------------------------+------------------------------+
+| Codec | Aliases | Purpose | Encoder/decoder |
++====================+===========================+===========================+==============================+
+| base64_codec | base64, base-64 | Convert operand to MIME | :meth:`base64.b64encode`, |
+| | | base64 (the result always | :meth:`base64.b64decode` |
+| | | includes a trailing | |
+| | | ``'\n'``) | |
++--------------------+---------------------------+---------------------------+------------------------------+
+| bz2_codec | bz2 | Compress the operand | :meth:`bz2.compress`, |
+| | | using bz2 | :meth:`bz2.decompress` |
++--------------------+---------------------------+---------------------------+------------------------------+
+| hex_codec | hex | Convert operand to | :meth:`base64.b16encode`, |
+| | | hexadecimal | :meth:`base64.b16decode` |
+| | | representation, with two | |
+| | | digits per byte | |
++--------------------+---------------------------+---------------------------+------------------------------+
+| quopri_codec | quopri, quoted-printable, | Convert operand to MIME | :meth:`quopri.encodestring`, |
+| | quotedprintable | quoted printable | :meth:`quopri.decodestring` |
++--------------------+---------------------------+---------------------------+------------------------------+
+| string_escape | | Produce a string that is | |
+| | | suitable as string | |
+| | | literal in Python source | |
+| | | code | |
++--------------------+---------------------------+---------------------------+------------------------------+
+| uu_codec | uu | Convert the operand using | :meth:`uu.encode`, |
+| | | uuencode | :meth:`uu.decode` |
++--------------------+---------------------------+---------------------------+------------------------------+
+| zlib_codec | zip, zlib | Compress the operand | :meth:`zlib.compress`, |
+| | | using gzip | :meth:`zlib.decompress` |
++--------------------+---------------------------+---------------------------+------------------------------+
+
+.. [#encoding-note] str objects are also accepted as input in place of unicode
+ objects. They are implicitly converted to unicode by decoding them using
+ the default encoding. If this conversion fails, it may lead to encoding
+ operations raising :exc:`UnicodeDecodeError`.
+
+.. [#decoding-note] unicode objects are also accepted as input in place of str
+ objects. They are implicitly converted to str by encoding them using the
+ default encoding. If this conversion fails, it may lead to decoding
+ operations raising :exc:`UnicodeEncodeError`.
+
:mod:`encodings.idna` --- Internationalized Domain Names in Applications
------------------------------------------------------------------------
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -26,12 +26,17 @@
- Issue #14146: Highlight source line while debugging on Windows.
-
Tests
-----
- Issue #11995: test_pydoc doesn't import all sys.path modules anymore.
+Documentation
+-------------
+
+- Issue #17844: Refactor a documentation of Python specific encodings.
+ Add links to encoders and decoders for binary-to-binary codecs.
+
What's New in Python 2.7.5?
===========================
--
Repository URL: http://hg.python.org/cpython
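
Illustration (not part of the changeset): a minimal sketch of how the
binary-to-binary codecs documented above relate to the module functions the
new "Encoder/decoder" column links to. It assumes an interpreter where
``codecs.encode`` and ``codecs.decode`` accept byte strings for these codecs
(Python 2.7, or Python 3.4 and later); the sample data and variable names are
made up for the example.

```python
import base64
import codecs
import zlib

# Hypothetical sample input, chosen only for illustration.
data = b"Python specific encodings"

# base64_codec: the result always includes a trailing newline,
# which base64.b64encode() does not add.
b64 = codecs.encode(data, "base64")
same_payload = b64.rstrip(b"\n") == base64.b64encode(data)

# hex_codec: two hexadecimal digits per input byte. The digit case may
# differ from base64.b16encode(), so compare after upper-casing.
hexed = codecs.encode(data, "hex")
same_digits = hexed.upper() == base64.b16encode(data)

# zlib_codec: the encoded form can be read back either through the codec
# or directly with zlib.decompress().
packed = codecs.encode(data, "zlib")
round_trip = zlib.decompress(packed) == codecs.decode(packed, "zlib") == data
```

For a short input like this one, all three comparisons should hold. Longer
inputs encoded with base64_codec are broken into multiple lines, so the
simple ``rstrip`` comparison is only valid for data that fits on one line.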