[Python-checkins] gh-98712: Clarify "readonly bytes-like object" semantics in C arg-parsing docs (GH-98710)

miss-islington webhook-mailer at python.org
Fri Dec 23 10:09:40 EST 2022


https://github.com/python/cpython/commit/bd472198c6ab7d69657aa3d70df1320584cf98b2
commit: bd472198c6ab7d69657aa3d70df1320584cf98b2
branch: 3.10
author: Miss Islington (bot) <31488909+miss-islington at users.noreply.github.com>
committer: miss-islington <31488909+miss-islington at users.noreply.github.com>
date: 2022-12-23T07:09:34-08:00
summary:

gh-98712: Clarify "readonly bytes-like object" semantics in C arg-parsing docs (GH-98710)

(cherry picked from commit 49f6ff719c4e0beeafd6c42edd696601acf72764)

Co-authored-by: Petr Viktorin <encukou at gmail.com>

files:
M Doc/c-api/arg.rst

diff --git a/Doc/c-api/arg.rst b/Doc/c-api/arg.rst
index 85f9eda17a20..6a53c79bd3be 100644
--- a/Doc/c-api/arg.rst
+++ b/Doc/c-api/arg.rst
@@ -34,24 +34,39 @@ These formats allow accessing an object as a contiguous chunk of memory.
 You don't have to provide raw storage for the returned unicode or bytes
 area.
 
-In general, when a format sets a pointer to a buffer, the buffer is
-managed by the corresponding Python object, and the buffer shares
-the lifetime of this object.  You won't have to release any memory yourself.
-The only exceptions are ``es``, ``es#``, ``et`` and ``et#``.
-
-However, when a :c:type:`Py_buffer` structure gets filled, the underlying
-buffer is locked so that the caller can subsequently use the buffer even
-inside a :c:type:`Py_BEGIN_ALLOW_THREADS` block without the risk of mutable data
-being resized or destroyed.  As a result, **you have to call**
-:c:func:`PyBuffer_Release` after you have finished processing the data (or
-in any early abort case).
-
 Unless otherwise stated, buffers are not NUL-terminated.
 
-Some formats require a read-only :term:`bytes-like object`, and set a
-pointer instead of a buffer structure.  They work by checking that
-the object's :c:member:`PyBufferProcs.bf_releasebuffer` field is ``NULL``,
-which disallows mutable objects such as :class:`bytearray`.
+There are three ways strings and buffers can be converted to C:
+
+*  Formats such as ``y*`` and ``s*`` fill a :c:type:`Py_buffer` structure.
+   This locks the underlying buffer so that the caller can subsequently use
+   the buffer even inside a :c:type:`Py_BEGIN_ALLOW_THREADS`
+   block without the risk of mutable data being resized or destroyed.
+   As a result, **you have to call** :c:func:`PyBuffer_Release` after you have
+   finished processing the data (or in any early abort case).
+
+*  The ``es``, ``es#``, ``et`` and ``et#`` formats allocate the result buffer.
+   **You have to call** :c:func:`PyMem_Free` after you have finished
+   processing the data (or in any early abort case).
+
+*  .. _c-arg-borrowed-buffer:
+
+   Other formats take a :class:`str` or a read-only :term:`bytes-like object`,
+   such as :class:`bytes`, and provide a ``const char *`` pointer to
+   its buffer.
+   In this case the buffer is "borrowed": it is managed by the corresponding
+   Python object, and shares the lifetime of this object.
+   You won't have to release any memory yourself.
+
+   To ensure that the underlying buffer may be safely borrowed, the object's
+   :c:member:`PyBufferProcs.bf_releasebuffer` field must be ``NULL``.
+   This disallows common mutable objects such as :class:`bytearray`,
+   but also some read-only objects such as :class:`memoryview` of
+   :class:`bytes`.
+
+   Besides this ``bf_releasebuffer`` requirement, there is no check to verify
+   whether the input object is immutable (e.g. whether it would honor a request
+   for a writable buffer, or whether another thread can mutate the data).
 
 .. note::
 
@@ -89,7 +104,7 @@ which disallows mutable objects such as :class:`bytearray`.
    Unicode objects are converted to C strings using ``'utf-8'`` encoding.
 
 ``s#`` (:class:`str`, read-only :term:`bytes-like object`) [const char \*, :c:type:`Py_ssize_t`]
-   Like ``s*``, except that it doesn't accept mutable objects.
+   Like ``s*``, except that it provides a :ref:`borrowed buffer <c-arg-borrowed-buffer>`.
    The result is stored into two C variables,
    the first one a pointer to a C string, the second one its length.
    The string may contain embedded null bytes. Unicode objects are converted
@@ -108,8 +123,9 @@ which disallows mutable objects such as :class:`bytearray`.
    pointer is set to ``NULL``.
 
 ``y`` (read-only :term:`bytes-like object`) [const char \*]
-   This format converts a bytes-like object to a C pointer to a character
-   string; it does not accept Unicode objects.  The bytes buffer must not
+   This format converts a bytes-like object to a C pointer to a
+   :ref:`borrowed <c-arg-borrowed-buffer>` character string;
+   it does not accept Unicode objects.  The bytes buffer must not
    contain embedded null bytes; if it does, a :exc:`ValueError`
    exception is raised.
 


