[pypy-commit] cffi default: Document char16_t and char32_t
arigo
pypy.commits at gmail.com
Fri Jun 2 03:19:42 EDT 2017
Author: Armin Rigo <arigo at tunes.org>
Branch:
Changeset: r2963:c60281bf502f
Date: 2017-06-02 09:19 +0200
http://bitbucket.org/cffi/cffi/changeset/c60281bf502f/
Log: Document char16_t and char32_t
diff --git a/doc/source/cdef.rst b/doc/source/cdef.rst
--- a/doc/source/cdef.rst
+++ b/doc/source/cdef.rst
@@ -178,7 +178,8 @@
* intN_t, uintN_t (for N=8,16,32,64), intptr_t, uintptr_t, ptrdiff_t,
size_t, ssize_t
-* wchar_t (if supported by the backend)
+* wchar_t (if supported by the backend). *New in version 1.11:*
+ char16_t and char32_t.
* _Bool and bool (equivalent). If not directly supported by the C
compiler, this is declared with the size of ``unsigned char``.
diff --git a/doc/source/ref.rst b/doc/source/ref.rst
--- a/doc/source/ref.rst
+++ b/doc/source/ref.rst
@@ -104,11 +104,13 @@
returns a ``bytes``, not a ``str``.
- If 'cdata' is a pointer or array of wchar_t, returns a unicode string
- following the same rules.
+ following the same rules. *New in version 1.11:* can also be
+ char16_t or char32_t.
-- If 'cdata' is a single character or byte or a wchar_t, returns it as a
- byte string or unicode string. (Note that in some situation a single
- wchar_t may require a Python unicode string of length 2.)
+- If 'cdata' is a single character or byte or a wchar_t or charN_t,
+ returns it as a byte string or unicode string. (Note that in some
+ situations a single wchar_t or char32_t may require a Python unicode
+ string of length 2.)
- If 'cdata' is an enum, returns the value of the enumerator as a string.
If the value is out of range, it is simply returned as the stringified
@@ -125,7 +127,7 @@
- If 'cdata' is a pointer to 'wchar_t', returns a unicode string.
('length' is measured in number of wchar_t; it is not the size in
- bytes.)
+ bytes.) *New in version 1.11:* can also be char16_t or char32_t.
- If 'cdata' is a pointer to anything else, returns a list, of the
given 'length'. (A slower way to do that is ``[cdata[i] for i in
@@ -626,10 +628,10 @@
| ``char`` | a string of length 1 | a string of | int(), bool(), |
| | or another <cdata char>| length 1 | ``<`` |
+---------------+------------------------+------------------+----------------+
-| ``wchar_t`` | a unicode of length 1 | a unicode of | |
-| | (or maybe 2 if | length 1 | int(), bool(), |
-| | surrogates) or | (or maybe 2 if | ``<`` |
-| | another <cdata wchar_t>| surrogates) | |
+| ``wchar_t``, | a unicode of length 1 | a unicode of | |
+| ``char16_t``, | (or maybe 2 if | length 1 | int(), bool(), |
+| ``char32_t`` | surrogates) or | (or maybe 2 if | ``<`` |
+| | another similar <cdata>| surrogates) | |
+---------------+------------------------+------------------+----------------+
| ``float``, | a float or anything on | a Python float | float(), int(),|
| ``double`` | which float() works | | bool(), ``<`` |
@@ -671,9 +673,9 @@
| ``char[]``, | | | ``-`` |
| ``_Bool[]`` | | | |
+---------------+------------------------+ +----------------+
-| ``wchar_t[]`` | same as arrays, or a | | len(), iter(), |
-| | Python unicode string | | ``[]``, |
-| | | | ``+``, ``-`` |
+|``wchar_t[]``, | same as arrays, or a | | len(), iter(), |
+|``char16_t[]``,| Python unicode string | | ``[]``, |
+|``char32_t[]`` | | | ``+``, ``-`` |
| | | | |
+---------------+------------------------+------------------+----------------+
| structure | a list or tuple or | a <cdata> | read/write |
diff --git a/doc/source/using.rst b/doc/source/using.rst
--- a/doc/source/using.rst
+++ b/doc/source/using.rst
@@ -25,6 +25,11 @@
unicode string to an integer, ``ord(x)`` does not work; use instead
``int(ffi.cast('wchar_t', x))``.
+*New in version 1.11:* in addition to ``wchar_t``, the C types
+``char16_t`` and ``char32_t`` work the same but with a known fixed size.
+In previous versions, this could be achieved using ``uint16_t`` and
+``int32_t`` but without automatic conversion to Python unicodes.
+
Pointers, structures and arrays are more complex: they don't have an
obvious Python equivalent. Thus, they correspond to objects of type
``cdata``, which are printed for example as
@@ -197,9 +202,10 @@
>>> ffi.string(x) # interpret 'x' as a regular null-terminated string
'Hello'
-Similarly, arrays of wchar_t can be initialized from a unicode string,
+Similarly, arrays of wchar_t or char16_t or char32_t can be initialized
+from a unicode string,
and calling ``ffi.string()`` on the cdata object returns the current unicode
-string stored in the wchar_t array (adding surrogates if necessary).
+string stored in the source array (adding surrogates if necessary).
Note that unlike Python lists or tuples, but like C, you *cannot* index in
a C array from the end using negative numbers.
@@ -347,7 +353,8 @@
assert lib.strlen("hello") == 5
-You can also pass unicode strings as ``wchar_t *`` arguments. Note that
+You can also pass unicode strings as ``wchar_t *`` or ``char16_t *`` or
+``char32_t *`` arguments. Note that
the C language makes no difference between argument declarations that
use ``type *`` or ``type[]``. For example, ``int *`` is fully
equivalent to ``int[]`` (or even ``int[5]``; the 5 is ignored). For CFFI,
diff --git a/doc/source/whatsnew.rst b/doc/source/whatsnew.rst
--- a/doc/source/whatsnew.rst
+++ b/doc/source/whatsnew.rst
@@ -6,6 +6,13 @@
v1.11
=====
+* Support the modern standard types ``char16_t`` and ``char32_t``.
+ These work like ``wchar_t``: they represent one unicode character, or
+ when used as ``charN_t *`` or ``charN_t[]`` they represent a unicode
+ string. The difference with ``wchar_t`` is that they have a known,
+ fixed size. They should work at all places that used to work with
+ ``wchar_t`` (please report an issue if I am missing something).
+
* Support the C99 types ``float _Complex`` and ``double _Complex``.
Note that libffi doesn't support them, which means that in the ABI
mode you still cannot call C functions that take complex numbers
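
The "known, fixed size" and "adding surrogates if necessary" remarks in the documentation changes above can be illustrated with a minimal sketch. It is not part of the changeset and does not call cffi at all: it only uses Python's built-in codecs to count how many 16-bit or 32-bit code units a string needs, under the assumption that ``char16_t`` data is UTF-16 and ``char32_t`` data is UTF-32. The helper name ``code_units`` is hypothetical.

```python
# Illustration only: plain Python, no cffi calls.
# Assumption: char16_t arrays hold UTF-16 and char32_t arrays hold UTF-32.

def code_units(text, unit_size):
    """Count the fixed-size code units needed to store `text`
    (the terminating null unit is not counted)."""
    encoding = {2: "utf-16-le", 4: "utf-32-le"}[unit_size]
    return len(text.encode(encoding)) // unit_size

ascii_text = u"Hello"
astral_char = u"\U0001F600"  # a code point above U+FFFF

units16_ascii = code_units(ascii_text, 2)    # one 16-bit unit per character
units16_astral = code_units(astral_char, 2)  # needs a surrogate pair
units32_astral = code_units(astral_char, 4)  # fits in one 32-bit unit
```

A code point above U+FFFF takes two 16-bit units but only one 32-bit unit, which is why lengths measured in ``char16_t`` units can differ from the number of characters.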