[Python-checkins] cpython (merge 3.3 -> default): #16127: merge with 3.3.

Fri Oct 5 02:34:16 CEST 2012

http://hg.python.org/cpython/rev/89ee959a9b54
changeset:   79478:89ee959a9b54
parent:      79476:42c063b3821f
parent:      79477:a1aa13ef00c5
user:        Ezio Melotti <ezio.melotti at gmail.com>
date:        Fri Oct 05 03:34:02 2012 +0300
summary:
  #16127: merge with 3.3.

files:
  Doc/c-api/unicode.rst              |   2 --
  Doc/reference/lexical_analysis.rst |   4 +---
  Include/unicodeobject.h            |   3 +--
  Objects/unicodeobject.c            |  14 ++++----------
  4 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1083,8 +1083,6 @@
    After completion, *\*byteorder* is set to the current byte order at the end
    of input data.
 
-   In a narrow build codepoints outside the BMP will be decoded as surrogate pairs.
-
    If *byteorder* is *NULL*, the codec starts in native order mode.
 
    Return *NULL* if an exception was raised by the codec.
diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst
--- a/Doc/reference/lexical_analysis.rst
+++ b/Doc/reference/lexical_analysis.rst
@@ -538,9 +538,7 @@
    this escape sequence.  Exactly four hex digits are required.
 
 (6)
-   Any Unicode character can be encoded this way, but characters outside the Basic
-   Multilingual Plane (BMP) will be encoded using a surrogate pair if Python is
-   compiled to use 16-bit code units (the default).  Exactly eight hex digits
+   Any Unicode character can be encoded this way.  Exactly eight hex digits
    are required.
 
 
diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h
--- a/Include/unicodeobject.h
+++ b/Include/unicodeobject.h
@@ -1022,8 +1022,7 @@
 
 /* Create a Unicode Object from the given Unicode code point ordinal.
 
-   The ordinal must be in range(0x10000) on narrow Python builds
-   (UCS2), and range(0x110000) on wide builds (UCS4). A ValueError is
+   The ordinal must be in range(0x110000). A ValueError is
    raised in case it is not.
 
 */
diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -5800,18 +5800,12 @@
     void *data;
     Py_ssize_t expandsize = 0;
 
-    /* Initial allocation is based on the longest-possible unichr
+    /* Initial allocation is based on the longest-possible character
        escape.
 
-       In wide (UTF-32) builds '\U00xxxxxx' is 10 chars per source
-       unichr, so in this case it's the longest unichr escape. In
-       narrow (UTF-16) builds this is five chars per source unichr
-       since there are two unichrs in the surrogate pair, so in narrow
-       (UTF-16) builds it's not the longest unichr escape.
-
-       In wide or narrow builds '\uxxxx' is 6 chars per source unichr,
-       so in the narrow (UTF-16) build case it's the longest unichr
-       escape.
+       For UCS1 strings it's '\xxx', 4 bytes per source character.
+       For UCS2 strings it's '\uxxxx', 6 bytes per source character.
+       For UCS4 strings it's '\U00xxxxxx', 10 bytes per source character.
     */
 
     if (!PyUnicode_Check(unicode)) {
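
The per-character escape sizes named in the new comment, and the two documentation changes above, can be checked from Python. This is only a rough sketch, assuming CPython 3.3 or later and assuming the comment belongs to the code behind the unicode_escape codec; the helper name escaped_length is made up for illustration and is not part of the changeset.

```python
# Illustration only: assumes CPython 3.3+ (flexible string representation).
# escaped_length is a hypothetical helper, not something from the changeset.

def escaped_length(ch):
    """Return how many bytes the unicode_escape codec emits for one character."""
    return len(ch.encode("unicode_escape"))

# Longest escape per source character for each storage kind.
latin1_len = escaped_length("\xe9")        # escaped as \xe9
bmp_len = escaped_length("\u20ac")         # escaped as \u20ac
astral_len = escaped_length("\U0001f600")  # escaped as \U0001f600
print(latin1_len, bmp_len, astral_len)

# A character outside the BMP is one code point, not a surrogate pair.
astral_code_points = len("\U0001f600")
print(astral_code_points)

# chr() accepts ordinals in range(0x110000) and raises ValueError beyond it.
try:
    chr(0x110000)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
print(out_of_range_rejected)
```

The three lengths correspond to the UCS1, UCS2 and UCS4 cases in the comment, which is why the initial allocation can be sized from the string's kind alone.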

-- 
Repository URL: http://hg.python.org/cpython