It is quite easy to extend Python with a new module written in C. It
should be equally easy to embed the Python interpreter as a library in
an existing (or new) application. This is indeed possible, but there
is a potential problem when the application is large: the Python
interpreter library defines many external names that may clash with
names defined by the application. To a lesser extent, the macros,
structures and typedefs defined in Python's header files may also
clash with application-defined names (or with names defined by other
libraries, when the application needs to use several libraries in one
program).
This article proposes a change to the naming of all external
functions, macros, structures, typedefs etc. that are in any way
visible to the application, with the intent of making clashes avoidable.
(This excludes the names of static functions and variables, local
variables, structure members, and macros, structures and typedefs
defined in .c files.) At the same time the set of names is somewhat
rationalized to make it easier to guess where a name is defined.
The purpose of this proposal is to get feedback. This is your chance
to make your voice heard! Please evaluate every aspect of the
proposal and let me know whether you like it or not. No effort has
yet begun to implement the proposal, so there are no big costs
involved in changes to the proposal. Your opinion is especially
important if you have written (or are maintaining) code that extends
the Python interpreter, since you will have to change your code and
learn to use the new conventions.
Note that these changes affect the C code that implements the Python
interpreter (and its extensions) only; the Python language as seen by
Python programmers will remain unchanged!
The mechanism to avoid clashes is to prefix all relevant names with
the string Py. This gives an application a very simple way to avoid
clashes: simply don't define names starting with Py in the
application. A similar mechanism is successfully used by most
widespread libraries, e.g. X.
The full naming scheme gives functions a name as follows:
Py<Module>_<Function>, where <Module> is the (sometimes abbreviated)
name of the module to which the function belongs (this may also be a
basic type), and <Function> is a descriptive name for the function.
The <Module> and <Function> parts use a mixed-case convention: each
is spelled with an initial capital letter, and if they consist of
multiple words each word has an initial capital. Embedded underscores
are not used except between <Module> and <Function>. Some examples:
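To make the Py<Module>_<Function> pattern concrete, here is a minimal
sketch in C. The type PyToyList and both functions are hypothetical
names invented for this illustration; they are not taken from the
proposal's own list of examples or from the interpreter sources.

```c
#include <stddef.h>

/* Hypothetical toy type standing in for an interpreter-owned type. */
typedef struct _PyToyList {
    int items[8];
    size_t size;
} PyToyList;

/* Py<Module>_<Function>: module part "ToyList", function part "Append". */
int PyToyList_Append(PyToyList *list, int value)
{
    if (list->size >= 8)
        return -1;                      /* list is full */
    list->items[list->size++] = value;
    return 0;
}

/* Multi-word function part: each word capitalized, no extra underscore. */
int PyToyList_GetItem(const PyToyList *list, size_t index, int *out)
{
    if (index >= list->size)
        return -1;                      /* index out of range */
    *out = list->items[index];
    return 0;
}

/* An application function with an unprefixed name cannot clash. */
int append(int a, int b)
{
    return a * 10 + b;
}
```

The application's own append() coexists with PyToyList_Append()
because only the library side uses the Py prefix.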
Functions applying to any kind of objects, and some other oft-used
utility functions, have a prefix of just Py_, e.g.
Global variables are named using similar conventions, e.g.
Functions and variables that have to be global for some reason but are
not part of the official interface have an initial underscore, e.g.
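The three rules above (a bare Py_ prefix for general utilities, global
variables named like functions, and a leading underscore for
non-official globals) might look as follows. Every name in this sketch
is hypothetical and chosen only to show the shape of the convention.

```c
/* Hypothetical generic object with a reference count. */
typedef struct _PyToyObject {
    int refcount;
} PyToyObject;

/* Official global variable: named like a function, Py<Module>_<Name>. */
int PyToy_DebugLevel = 0;

/* Global for linkage reasons only, not part of the official interface. */
int _PyToy_LiveObjects = 0;

/* Utility that applies to any object: the prefix is just Py_. */
void Py_ToyRetain(PyToyObject *op)
{
    if (op->refcount == 0)
        _PyToy_LiveObjects++;           /* object becomes live */
    op->refcount++;
}

void Py_ToyRelease(PyToyObject *op)
{
    op->refcount--;
    if (op->refcount == 0)
        _PyToy_LiveObjects--;           /* last reference dropped */
}
```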
Macros with arguments are named like functions, e.g.:
Macros without arguments (symbolic constants) have a second part that
is entirely in capitals and uses underscores to separate words. So
do enums. E.g.:
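As a rough illustration of the two macro rules and the enum rule, the
following sketch uses made-up names (PyToy_Max, PyToy_NAME_LIMIT,
PyToyMode); none of them is claimed to exist in the interpreter.

```c
/* Macro with arguments: named like a function. */
#define PyToy_Max(a, b) ((a) > (b) ? (a) : (b))

/* Symbolic constant: second part in capitals, words split by underscores. */
#define PyToy_NAME_LIMIT 32

/* Enum values follow the same rule as symbolic constants. */
typedef enum {
    PyToy_MODE_READ,
    PyToy_MODE_WRITE
} PyToyMode;

/* Clamp a length to the range 0..PyToy_NAME_LIMIT. */
int PyToy_ClampLength(int length)
{
    int nonneg = PyToy_Max(length, 0);
    return nonneg > PyToy_NAME_LIMIT ? PyToy_NAME_LIMIT : nonneg;
}
```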
Typedefs contain no underscore; they consist of Py followed by one or
more words with initial capitals, e.g.:
Most of these typedef names correspond to structures; the structure
tag will be the typedef name prefixed with an underscore.
Note that the current system has a number of structures that are not
covered by typedefs. These will be given typedef names.
Also note that pointer types will continue to be written with the "*"
notation, e.g. "PyObject *". Pointers in C have sufficiently special
semantics to make it important in practice to know whether a
particular variable is a pointer; e.g. Python frequently uses casts
between various object pointers.
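A small sketch of the typedef rules, assuming a made-up type
PyToyPoint: the typedef name has no underscore, the structure tag is
the same name with a leading underscore, and pointers are always
spelled out with "*" rather than hidden behind a pointer typedef.

```c
/* Structure tag is the typedef name prefixed with an underscore. */
typedef struct _PyToyPoint {
    int x;
    int y;
} PyToyPoint;

/* The pointer stays visible in the signature: "PyToyPoint *". */
int PyToyPoint_Sum(const PyToyPoint *p)
{
    return p->x + p->y;
}
```

Because the tag and the typedef name the same structure, a variable
declared as "struct _PyToyPoint" can be passed wherever a
"PyToyPoint *" is expected.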
Header files will follow the conventions for typedef names, e.g.
A header named "Python.h" will collect almost all useful headers.
During a transition period, which will last at least two releases that
each live at least 3 months (giving a total of at least 6 months), it
will be possible to use the old names and the new names together.
The transition mechanism uses #define to identify the new and old names.
In the first transition release, the (majority of) source code will
still use the old names, including the old header file names, but
macros will be defined to support the use of the new names, e.g.:
#define PyObject object
In the second transition release, the source code will use the new
names, also for header file names, and macros will be defined to
support code that still uses the old names, e.g.:
#define object PyObject
Headers with the alternative names will be provided in both cases.
In both transition releases, the compatibility macros will be gathered
in a single header file (a different one each time!).
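The two directions of the compatibility macros can be sketched in one
file. The switch TOY_SECOND_TRANSITION_RELEASE, the one-field
structure and both functions are assumptions made for this example;
only the two #define lines come from the text above.

```c
#ifdef TOY_SECOND_TRANSITION_RELEASE
/* Second release: sources use the new name, old name is a macro. */
typedef struct _PyObject { int refcount; } PyObject;
#define object PyObject
#else
/* First release: sources still use the old name, new name is a macro. */
typedef struct _object { int refcount; } object;
#define PyObject object
#endif

/* Code written against the new name compiles in both releases. */
int new_style_refcount(PyObject *op)
{
    return op->refcount;
}

/* Code written against the old name compiles in both releases too. */
int old_style_refcount(object *op)
{
    return op->refcount;
}
```

Since a macro such as "#define object PyObject" rewrites every
occurrence of the word object, including local variable names, it is
plausible that such macros are best kept in one dedicated header that
can be dropped once the transition is over.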
A Python script will be provided which translates C code from the old
to the new naming conventions with 99% accuracy. It remains to be
seen whether occurrences in comments should be replaced; when comments
refer to functions and typedefs etc., they should, but some names
(e.g. object!) can also be used as a noun, so global substitutions may
cause undesired effects. Also, there are some situations where names
occurring in strings must be substituted!
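The translation script itself is to be written in Python; the C
sketch below only illustrates the kind of whole-identifier matching
such a tool would need, and why comments are a problem. The function
toy_rename is hypothetical and is not part of the proposed script.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True for characters that may appear inside a C identifier. */
static int is_ident_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/*
 * Copy src to dst, replacing old_name with new_name wherever old_name
 * appears as a whole identifier.  Returns 0 on success, or -1 if dst
 * (of capacity cap) is too small or old_name is empty.
 */
int toy_rename(const char *src, const char *old_name, const char *new_name,
               char *dst, size_t cap)
{
    size_t old_len = strlen(old_name);
    size_t new_len = strlen(new_name);
    size_t out = 0;
    const char *p = src;

    if (old_len == 0)
        return -1;
    while (*p != '\0') {
        int at_start = (p == src) || !is_ident_char(p[-1]);
        if (at_start && strncmp(p, old_name, old_len) == 0
                && !is_ident_char(p[old_len])) {
            if (out + new_len >= cap)
                return -1;
            memcpy(dst + out, new_name, new_len);
            out += new_len;
            p += old_len;
        } else {
            if (out + 1 >= cap)
                return -1;
            dst[out++] = *p++;
        }
    }
    if (out >= cap)
        return -1;
    dst[out] = '\0';
    return 0;
}
```

A scan like this leaves longer identifiers such as newobject alone,
but it cannot tell the type name object from the English noun: the
comment "free the object" would become "free the PyObject".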
It may be possible to introduce other changes as well, e.g. changes to
the source tree: I might separate the .h files from the .c files,
separate optional modules from the required part of the interpreter,
and move the parser generator to its own directory. (Anything else?)
A much simpler alternative would be to prefix all names with py_
(sometimes PY_) and leave everything else unchanged: py_object,
py_getattr, py_is_listobject, etc. This will be easier for those
(like myself!) whose eyes and fingers have been trained to use the
current names, but it will leave the existing inconsistencies in
place. (The Python language itself uses lowercase for most
situations, but its library is not too consistent, so it cannot really
serve as a guideline here.)
--Guido van Rossum, CWI, Amsterdam <Guido.van.Rossum@cwi.nl>