[Python-Dev] Breaking undocumented API

Guido van Rossum guido at python.org
Tue Nov 16 16:48:20 CET 2010


On Tue, Nov 16, 2010 at 7:16 AM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
> What this thread has shown is that there is no consensus on what
> public names are and what rules should be followed when changing names
> that can be imported from a module.  I have opened an issue at
> http://bugs.python.org/issue10434 to address this.  My vote is to
> adopt the definition spelled out in the language reference, copy it to
> the library manual and add some discussion of the deprecation
> policies.

Hm. Apart from the specific semantics assigned by the language to
single and double leading (and trailing) underscores, I still think
this belongs in a style guide, not in the library manual. When reading
the library manual, one should always assume that undocumented
features are subject to change at any time.
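
For concreteness, the language-reference definition mentioned in the
quoted message (the names listed in __all__ if the module defines it,
otherwise every name in the module namespace that does not begin with
an underscore) could be sketched roughly as below. The helper name
public_names is hypothetical and exists only for this illustration; it
is not part of the standard library.

```python
import types


def public_names(module: types.ModuleType) -> list:
    """Return the names a module exposes via 'from module import *'.

    Hypothetical helper: mirrors the rule in the language reference,
    not any documented deprecation policy.
    """
    declared = getattr(module, "__all__", None)
    if declared is not None:
        # An explicit __all__ wins, even if it lists underscore names.
        return sorted(declared)
    # Otherwise, fall back to "no leading underscore".
    return sorted(name for name in vars(module) if not name.startswith("_"))
```

Note that this only says which names are importable with a star import;
whether a name is documented, and therefore safe to rely on, is a
separate question.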

When writing library code, one should of course be much more
conservative, and guidelines for contributors are needed to ensure
that in the future we won't repeat the mistakes of the past (mostly my
own mistakes :-).

> I also have a similar question about the C API.  Here, in the absence of
> __all__, the answer should be clear: all symbols in public header
> files should start with either _Py_ or Py_ and those that start with
> Py_ are public.   The question is what should be done with names that
> start with Py_, but are not documented?  Can we add an underscore to
> those names?  If so, should a (deprecated) alias be made available?
> Should they be documented as deprecated?

Even more care should be taken here, since breakage is harder to fix,
especially in 3rd party code that needs to be compatible with a wide
range of Python versions.

The good news here is that the intended rule is very clear:

- *no* symbols that don't start with Py_ or _Py_ (unless there's a
technical reason why it can't be named that way)
- public == Py_
- private == _Py_
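
As a rough illustration of the rule above, a checker for symbol names
might look like the sketch below. The function name classify_symbol
and the three labels are made up for this example. One assumption to
flag: the rule is written here as Py_ / _Py_, but names such as
PyUnicode_Resize carry no underscore directly after "Py", so the sketch
matches on the shorter prefixes "Py" and "_Py".

```python
def classify_symbol(name: str) -> str:
    """Classify a C symbol name by its prefix (hypothetical helper).

    Assumption: "Py" and "_Py" are treated as the public and private
    prefixes, so that PyUnicode_* style names count as public.
    """
    if name.startswith("_Py"):
        return "private"
    if name.startswith("Py"):
        return "public"
    # Anything else would need a technical reason to exist in a
    # public header.
    return "non-conforming"


for symbol in ("PyUnicode_Resize", "Py_UNICODE_strlen", "_Py_example", "helper"):
    print(symbol, "->", classify_symbol(symbol))
```

A prefix check like this says nothing about whether a public name is
documented, which is the harder question in this thread.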

> I think these questions can only be answered on a case-by-case basis

Right!

> with the choices being:
>
> 1. Document.
> 2. Document as deprecated.
> 3. Document as deprecated, add underscore prefix and retain a deprecated alias.
> 4. Add an underscore prefix.
>
> The specific set of names that I would like to consider is the
> following from unicode.h.  I am marking with (*) the names that I
> think should be documented and with (D) those that should be
> deprecated:
>
> PyUnicode_GetMax
> PyUnicode_Resize (*)
> PyUnicode_InternImmortal
> PyUnicode_FromOrdinal (*)
> PyUnicode_GetDefaultEncoding (D)
> PyUnicode_AsDecodedObject
> PyUnicode_AsDecodedUnicode
> PyUnicode_AsEncodedObject
> PyUnicode_AsEncodedUnicode
> PyUnicode_BuildEncodingMap
> PyUnicode_EncodeDecimal (*)
> PyUnicode_Append (*)
> PyUnicode_AppendAndDel (*)
> PyUnicode_Partition (*)
> PyUnicode_RPartition (*)
> PyUnicode_RSplit (*)
> PyUnicode_IsIdentifier (*)
> Py_UNICODE_strlen
> Py_UNICODE_strcpy
> Py_UNICODE_strcat
> Py_UNICODE_strncpy
> Py_UNICODE_strcmp
> Py_UNICODE_strncmp
> Py_UNICODE_strchr
> Py_UNICODE_strrchr

I'll leave this to others more familiar with the Unicode code; I would
recommend being fairly conservative though since these have been
around for a long time.

-- 
--Guido van Rossum (python.org/~guido)

