[Python-checkins] cpython (2.7): Closes #23181: codepoint -> code point
georg.brandl
python-checkins at python.org
Wed Jan 14 08:41:19 CET 2015
https://hg.python.org/cpython/rev/e280a04625cc
changeset: 94137:e280a04625cc
branch: 2.7
parent: 94123:6a19e37ce94d
user: Georg Brandl <georg at python.org>
date: Wed Jan 14 08:26:30 2015 +0100
summary:
Closes #23181: codepoint -> code point
files:
Doc/c-api/unicode.rst | 4 ++--
Doc/library/codecs.rst | 12 ++++++------
Doc/library/htmllib.rst | 4 ++--
Doc/library/json.rst | 2 +-
Doc/tutorial/interpreter.rst | 2 +-
5 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -547,7 +547,7 @@
After completion, *\*byteorder* is set to the current byte order at the end
of input data.
- In a narrow build codepoints outside the BMP will be decoded as surrogate pairs.
+ In a narrow build code points outside the BMP will be decoded as surrogate pairs.
If *byteorder* is *NULL*, the codec starts in native order mode.
@@ -580,7 +580,7 @@
mark (U+FEFF). In the other two modes, no BOM mark is prepended.
If *Py_UNICODE_WIDE* is not defined, surrogate pairs will be output
- as a single codepoint.
+ as a single code point.
Return *NULL* if an exception was raised by the codec.
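The hunk above concerns UTF-16 decoding on narrow (UCS-2) builds. As a side note, the surrogate-pair mechanics it refers to can be observed from pure Python; this is an illustrative Python 3 sketch (not part of the patch, and Python 3.3+ no longer has narrow builds):

```python
# A code point outside the BMP (> U+FFFF) is represented in UTF-16 as a
# surrogate pair: a high surrogate (U+D800-U+DBFF) followed by a low
# surrogate (U+DC00-U+DFFF).
cp = 0x1F600                        # U+1F600, outside the BMP
data = chr(cp).encode("utf-16-be")
print(data.hex())                   # d83dde00 -> the pair U+D83D, U+DE00

# Reconstruct the original code point from the two surrogates:
high = int.from_bytes(data[:2], "big")
low = int.from_bytes(data[2:], "big")
decoded = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
print(hex(decoded))                 # 0x1f600
```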
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -787,7 +787,7 @@
Encodings and Unicode
---------------------
-Unicode strings are stored internally as sequences of codepoints (to be precise
+Unicode strings are stored internally as sequences of code points (to be precise
as :c:type:`Py_UNICODE` arrays). Depending on the way Python is compiled (either
via ``--enable-unicode=ucs2`` or ``--enable-unicode=ucs4``, with the
former being the default) :c:type:`Py_UNICODE` is either a 16-bit or 32-bit data
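The build-width distinction described above is visible at runtime via `sys.maxunicode`. A small Python sketch (note: since Python 3.3 / PEP 393 every build reports the full range, so the narrow value only appears on old 2.x narrow builds):

```python
import sys

# On a 2.7 narrow (UCS-2) build sys.maxunicode is 65535; on a wide
# (UCS-4) build it is 1114111. Python 3.3+ always reports the full
# Unicode range regardless of how the interpreter was compiled.
print(sys.maxunicode)
assert sys.maxunicode in (65535, 1114111)
```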
@@ -796,24 +796,24 @@
unicode object into a sequence of bytes is called encoding and recreating the
unicode object from the sequence of bytes is known as decoding. There are many
different methods for how this transformation can be done (these methods are
-also called encodings). The simplest method is to map the codepoints 0-255 to
+also called encodings). The simplest method is to map the code points 0-255 to
the bytes ``0x0``-``0xff``. This means that a unicode object that contains
-codepoints above ``U+00FF`` can't be encoded with this method (which is called
+code points above ``U+00FF`` can't be encoded with this method (which is called
``'latin-1'`` or ``'iso-8859-1'``). :func:`unicode.encode` will raise a
:exc:`UnicodeEncodeError` that looks like this: ``UnicodeEncodeError: 'latin-1'
codec can't encode character u'\u1234' in position 3: ordinal not in
range(256)``.
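The latin-1 behaviour described in the paragraph above can be demonstrated directly; a Python 3 sketch (Python 3 strings play the role of 2.7's `unicode` objects):

```python
# Code points 0-255 map one-to-one onto the bytes 0x00-0xff:
assert "\xe9".encode("latin-1") == b"\xe9"

# A code point above U+00FF has no latin-1 representation:
try:
    "\u1234".encode("latin-1")
except UnicodeEncodeError as exc:
    print(exc.reason)   # ordinal not in range(256)
```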
There's another group of encodings (the so called charmap encodings) that choose
-a different subset of all unicode code points and how these codepoints are
+a different subset of all unicode code points and how these code points are
mapped to the bytes ``0x0``-``0xff``. To see how this is done simply open
e.g. :file:`encodings/cp1252.py` (which is an encoding that is used primarily on
Windows). There's a string constant with 256 characters that shows you which
character is mapped to which byte value.
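The charmap remapping mentioned above (cp1252 versus latin-1) can be seen with a single byte; an illustrative Python 3 sketch:

```python
# cp1252 repurposes part of the 0x80-0x9f range for printable
# characters; the euro sign (U+20AC) maps to the single byte 0x80:
assert "\u20ac".encode("cp1252") == b"\x80"
assert b"\x80".decode("cp1252") == "\u20ac"

# latin-1, by contrast, cannot encode the euro sign at all:
try:
    "\u20ac".encode("latin-1")
except UnicodeEncodeError:
    print("no euro in latin-1")
```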
-All of these encodings can only encode 256 of the 1114112 codepoints
+All of these encodings can only encode 256 of the 1114112 code points
defined in unicode. A simple and straightforward way that can store each Unicode
-code point, is to store each codepoint as four consecutive bytes. There are two
+code point, is to store each code point as four consecutive bytes. There are two
possibilities: store the bytes in big endian or in little endian order. These
two encodings are called ``UTF-32-BE`` and ``UTF-32-LE`` respectively. Their
disadvantage is that if e.g. you use ``UTF-32-BE`` on a little endian machine you
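The four-bytes-per-code-point scheme and the byte-order difference described above, sketched in Python 3:

```python
cp = "\u20ac"                    # U+20AC, the euro sign
be = cp.encode("utf-32-be")
le = cp.encode("utf-32-le")
print(be.hex())                  # 000020ac
print(le.hex())                  # ac200000

# Each code point occupies exactly four bytes; only their order differs:
assert be == bytes(reversed(le))
```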
diff --git a/Doc/library/htmllib.rst b/Doc/library/htmllib.rst
--- a/Doc/library/htmllib.rst
+++ b/Doc/library/htmllib.rst
@@ -185,14 +185,14 @@
.. data:: name2codepoint
- A dictionary that maps HTML entity names to the Unicode codepoints.
+ A dictionary that maps HTML entity names to the Unicode code points.
.. versionadded:: 2.3
.. data:: codepoint2name
- A dictionary that maps Unicode codepoints to HTML entity names.
+ A dictionary that maps Unicode code points to HTML entity names.
.. versionadded:: 2.3
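The two entity tables patched above can be exercised directly. In 2.7 they live in the `htmlentitydefs` module; this sketch uses the Python 3 location, `html.entities`, which carries the same mappings:

```python
from html.entities import name2codepoint, codepoint2name

# Entity name -> Unicode code point, and the inverse table:
assert name2codepoint["euro"] == 0x20AC
assert codepoint2name[0x20AC] == "euro"
```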
diff --git a/Doc/library/json.rst b/Doc/library/json.rst
--- a/Doc/library/json.rst
+++ b/Doc/library/json.rst
@@ -533,7 +533,7 @@
that don't correspond to valid Unicode characters (e.g. unpaired UTF-16
surrogates), but it does note that they may cause interoperability problems.
By default, this module accepts and outputs (when present in the original
-:class:`str`) codepoints for such sequences.
+:class:`str`) code points for such sequences.
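The lone-surrogate round-tripping described above can be shown in a couple of lines; a Python 3 sketch:

```python
import json

# An unpaired UTF-16 surrogate is not a valid Unicode character, but
# the json module accepts it on input and escapes it on output:
s = json.dumps("\ud800")
print(s)                          # "\ud800"
assert json.loads(s) == "\ud800"
```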
Infinite and NaN Number Values
diff --git a/Doc/tutorial/interpreter.rst b/Doc/tutorial/interpreter.rst
--- a/Doc/tutorial/interpreter.rst
+++ b/Doc/tutorial/interpreter.rst
@@ -140,7 +140,7 @@
For example, to write Unicode literals including the Euro currency symbol, the
ISO-8859-15 encoding can be used, with the Euro symbol having the ordinal value
164. This script, when saved in the ISO-8859-15 encoding, will print the value
-8364 (the Unicode codepoint corresponding to the Euro symbol) and then exit::
+8364 (the Unicode code point corresponding to the Euro symbol) and then exit::
# -*- coding: iso-8859-15 -*-
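The body of that example script is truncated in this hunk. The fact it demonstrates, that byte 164 (0xA4) in ISO-8859-15 is the euro sign, Unicode code point U+20AC = 8364, can be checked independently with this Python 3 sketch (not the script from the tutorial):

```python
# ISO-8859-15 places the euro sign at ordinal 164 (0xA4); its Unicode
# code point is U+20AC, i.e. 8364:
assert b"\xa4".decode("iso-8859-15") == "\u20ac"
print(ord("\u20ac"))   # 8364
```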