[Python-Dev] Python for small platforms

Jeff Collins Jeffery.D.Collins@aero.org
Tue, 11 Apr 2000 13:03:47 -0700


I've just had the chance to examine the unicode implementation and was
surprised by the size of the code introduced - not just by the size of
the database extension module (which I understand Christian Tismer is
optimizing and which I assume can be configured away), but in
particular by the size of the additional objects (unicodeobject.c,
unicodetype.c).  These additional objects alone contribute
approximately 100K to the resulting executable.  On desktop systems,
this is not of much concern and suggestions have been made previously
to reduce this if necessary (shared extension modules and possibly a
shared VM - libpython.so).  However, on small embedded systems (e.g.,
the PalmIII), this additional code is a tremendous cost.  The current size of the
python-1.5.2-pre-unicode VM (after removal of float and complex
objects with more reductions to come) on the PalmIII is 240K (already
huge by Palm standards).  (For reference, the size of python-1.5.1 on
the PalmIII is 160K, after removal of the compiler, parser,
float/long/complex objects.)  With the unicode additions, this value
jumps to 340K.

The upshot of this is that for small platforms on which I am working,
unicode support will have to be removed.  My immediate concern is
that unicode is getting so embedded in python that it will be
difficult to extract.

The approach I've taken for removing "features" (like float objects):
1)  removes the feature with WITHOUT_XXX #ifdef/#endif decorations, 
	where XXX denotes the removable feature (configurable in config.h)
2)  preserves the python API:  builtin functions, C API, PyArg_Parse,
	print format specifiers, etc., remain in place and raise
	MissingFeatureError if attempts are made to use the removed
	feature through them.  Of course, the API specific to the
	removed feature itself is no longer present.
3)  protects the reduced VM: all reads (via marshal, compile, etc.)
	involving source/compiled python code will fail with
	a MissingFeatureError if the reduced VM doesn't support it.
4)  does not yet support a MissingFeatureError in the tokenizer
	if, say, 2.2 (for removed floats) is entered on the python
	command line.  This instead results in a SyntaxError
	indicating a problem with the decimal point.  It appears that
	another error token would have to be added to support
	this error.
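
As a rough illustration of points 1-3, here is a minimal standalone
sketch in C.  It is not taken from the actual patches and does not use
the real Python headers: WITHOUT_FLOAT, last_error, set_missing_feature,
parse_float_arg and check_type_code are all hypothetical names, and the
error is modelled as a plain string rather than a real exception object.
The type codes 'i' and 'f' are used only as examples of per-type tags
in serialized data.

```c
#include <stdio.h>

/* Hypothetical switch; in the scheme above it would be set in config.h. */
#define WITHOUT_FLOAT 1

/* Stand-in for a pending exception: holds the last error message. */
char last_error[64];

void set_missing_feature(const char *feature)
{
    snprintf(last_error, sizeof last_error,
             "MissingFeatureError: %s", feature);
}

/* Point 2: the entry point stays in the API whether or not floats are
   compiled in.  Returns 0 on success, -1 with last_error set otherwise. */
int parse_float_arg(const char *text, double *out)
{
#ifdef WITHOUT_FLOAT
    (void)text;
    (void)out;
    set_missing_feature("float");
    return -1;
#else
    if (sscanf(text, "%lf", out) == 1)
        return 0;
    snprintf(last_error, sizeof last_error, "bad float literal");
    return -1;
#endif
}

/* Point 3: a reader of serialized code refuses data that needs a
   feature the reduced VM lacks, instead of misinterpreting it. */
int check_type_code(char code)
{
    switch (code) {
    case 'i':                       /* integer: always available */
        return 0;
    case 'f':                       /* float: only if compiled in */
#ifdef WITHOUT_FLOAT
        set_missing_feature("float");
        return -1;
#else
        return 0;
#endif
    default:
        snprintf(last_error, sizeof last_error, "bad type code");
        return -1;
    }
}
```

With WITHOUT_FLOAT defined, parse_float_arg("2.2", &v) returns -1 and
leaves a MissingFeatureError message in last_error; removing the
#define restores the normal behaviour without touching any callers.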

Of course, I may have missed something, but if the above appears to be
a reasonable approach, I can supply patches (at least for floats and
complexes) for further discussion.  In the longer term, it would be
helpful if developers would follow this (or a similar agreed-upon
approach) when adding new features.  This would reduce the burden of
maintaining python for small embedded platforms.

Thanks,

Jeff