[Python-checkins] r55146 - in python/branches/py3k-struni: Doc/api/concrete.tex Include/unicodeobject.h Objects/unicodeobject.c

Sat May 5 20:16:03 CEST 2007

On 5/5/07, walter.doerwald <python-checkins at python.org> wrote:
> Author: walter.doerwald
> Date: Sat May  5 14:00:46 2007
> New Revision: 55146
>
> Modified:
>    python/branches/py3k-struni/Doc/api/concrete.tex
>    python/branches/py3k-struni/Include/unicodeobject.h
>    python/branches/py3k-struni/Objects/unicodeobject.c
> Log:
> Add PyUnicode_FromString(), which creates a unicode object from a
> const char * (i.e. 0-terminated latin-1 encoded bytes).
>
>
> Modified: python/branches/py3k-struni/Doc/api/concrete.tex
> ==============================================================================
> --- python/branches/py3k-struni/Doc/api/concrete.tex    (original)
> +++ python/branches/py3k-struni/Doc/api/concrete.tex    Sat May  5 14:00:46 2007
> @@ -995,6 +995,17 @@
>    \var{u} is \NULL{}.
>  \end{cfuncdesc}
>
> +\begin{cfuncdesc}{PyObject*}{PyUnicode_FromString}{const char *u}
> +  Create a Unicode Object from the char buffer \var{u} of the.
> +  \var{u} must be 0-terminated, the bytes will be interpreted as
> +  being latin-1 encoded. \var{u} may also be \NULL{} which causes the
> +  contents to be undefined. It is the user's responsibility to fill
> +  in the needed data.  The buffer is copied into the new object.
> +  If the buffer is not \NULL{}, the return value might be a shared object.
> +  Therefore, modification of the resulting Unicode object is only allowed
> +  when \var{u} is \NULL{}.
> +\end{cfuncdesc}
> +
>  \begin{cfuncdesc}{Py_UNICODE*}{PyUnicode_AsUnicode}{PyObject *unicode}
>    Return a read-only pointer to the Unicode object's internal
>    \ctype{Py_UNICODE} buffer, \NULL{} if \var{unicode} is not a Unicode
>
> Modified: python/branches/py3k-struni/Include/unicodeobject.h
> ==============================================================================
> --- python/branches/py3k-struni/Include/unicodeobject.h (original)
> +++ python/branches/py3k-struni/Include/unicodeobject.h Sat May  5 14:00:46 2007
> @@ -172,6 +172,7 @@
>  # define PyUnicode_FromObject PyUnicodeUCS2_FromObject
>  # define PyUnicode_FromOrdinal PyUnicodeUCS2_FromOrdinal
>  # define PyUnicode_FromUnicode PyUnicodeUCS2_FromUnicode
> +# define PyUnicode_FromString PyUnicodeUCS2_FromString
>  # define PyUnicode_FromWideChar PyUnicodeUCS2_FromWideChar
>  # define PyUnicode_GetDefaultEncoding PyUnicodeUCS2_GetDefaultEncoding
>  # define PyUnicode_GetMax PyUnicodeUCS2_GetMax
> @@ -250,6 +251,7 @@
>  # define PyUnicode_FromObject PyUnicodeUCS4_FromObject
>  # define PyUnicode_FromOrdinal PyUnicodeUCS4_FromOrdinal
>  # define PyUnicode_FromUnicode PyUnicodeUCS4_FromUnicode
> +# define PyUnicode_FromString PyUnicodeUCS4_FromString
>  # define PyUnicode_FromWideChar PyUnicodeUCS4_FromWideChar
>  # define PyUnicode_GetDefaultEncoding PyUnicodeUCS4_GetDefaultEncoding
>  # define PyUnicode_GetMax PyUnicodeUCS4_GetMax
> @@ -427,6 +429,12 @@
>      Py_ssize_t size             /* size of buffer */
>      );
>
> +/* Similar to PyUnicode_FromUnicode(), but u points to null-terminated
> +   Latin-1 encoded bytes */
> +PyAPI_FUNC(PyObject*) PyUnicode_FromString(
> +    const char *u        /* string */
> +    );
> +
>  /* Return a read-only pointer to the Unicode object's internal
>     Py_UNICODE buffer. */
>
>
> Modified: python/branches/py3k-struni/Objects/unicodeobject.c
> ==============================================================================
> --- python/branches/py3k-struni/Objects/unicodeobject.c (original)
> +++ python/branches/py3k-struni/Objects/unicodeobject.c Sat May  5 14:00:46 2007
> @@ -393,6 +393,51 @@
>      return (PyObject *)unicode;
>  }
>
> +PyObject *PyUnicode_FromString(const char *u)
> +{
> +    PyUnicodeObject *unicode;
> +    Py_ssize_t size = strlen(u);

This should check for overflow.  strlen() returns a size_t, and it's
possible to have a 2+GB string on a 32-bit platform, which would result
in a negative size once the value is stored in a Py_ssize_t.
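A minimal sketch of the kind of check meant here, written as standalone C so it compiles without Python.h.  It uses ptrdiff_t as a stand-in for Py_ssize_t and PTRDIFF_MAX as a stand-in for PY_SSIZE_T_MAX; the helper names checked_length() and checked_strlen() are hypothetical and not part of the patch.  In the real function one would presumably set an OverflowError and return NULL rather than return -1.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert an unsigned length to a signed size.
   Returns -1 if the length does not fit in the signed type
   (ptrdiff_t stands in for Py_ssize_t here). */
static ptrdiff_t checked_length(size_t len)
{
    if (len > (size_t)PTRDIFF_MAX)
        return -1;              /* would become negative if cast blindly */
    return (ptrdiff_t)len;
}

/* Hypothetical helper: strlen() with the overflow check applied.
   s must be a non-NULL, 0-terminated string. */
static ptrdiff_t checked_strlen(const char *s)
{
    return checked_length(strlen(s));
}
```

A caller would test the result for -1 before allocating the unicode object, instead of passing the raw strlen() result along as the size.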

n