[issue1943] improved allocation of PyUnicode objects

Mon May 25 10:17:56 CEST 2009

Marc-Andre Lemburg <mal at egenix.com> added the comment:

Antoine, I have explained the reasons for rejecting the patch. In short,
it violates a design principle behind the Unicode implementation.

If you want to change such a basic aspect of the Unicode implementation,
then write a PEP which demonstrates the usefulness on a larger set of
more general tests and comes up with significant results (10% speedup in
some micro benchmarks is not significant; memory tests need to be run
without pymalloc and require extra care to work around OS malloc
optimization strategies).

As I said, the current design of the Unicode object implementation
would benefit more from advances in pymalloc tuning than from a change
that makes it next to impossible to extend the Unicode objects to e.g.

 * reuse existing memory blocks for allocation,
 * point straight into memory-mapped files,
 * provide highly efficient ways to tokenize Unicode data,
 * share data between Unicode objects,
 etc.

I chose this design to make the above easy to implement. Using a
PyObject rather than a PyVarObject (which is what the string object
uses) was a conscious decision, since I knew that the Unicode object
was eventually going to replace the string object.
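
To illustrate the layout difference being discussed, here is a minimal
sketch. It does not use the real CPython structs: ptr_ustr, inline_ustr,
ptr_ustr_view, inline_ustr_new and sketch_demo are hypothetical names
made up for this example, and reference counting is reduced to a plain
counter. It only shows why a header that holds a pointer to its data can
reference existing or shared memory without copying, while a header with
inline data has to copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for illustration; these are NOT the CPython structs. */

/* Pointer-based layout: fixed-size header plus a pointer to the data. */
typedef struct {
    size_t refcnt;
    size_t length;
    const wchar_t *str;   /* may point at memory owned by someone else */
} ptr_ustr;

/* Inline layout: header and character data live in one allocation. */
typedef struct {
    size_t refcnt;
    size_t length;
    wchar_t data[];       /* C99 flexible array member */
} inline_ustr;

/* Create a view onto existing memory; no character data is copied. */
static ptr_ustr *ptr_ustr_view(const wchar_t *buf, size_t len)
{
    ptr_ustr *u = malloc(sizeof *u);
    if (u == NULL)
        return NULL;
    u->refcnt = 1;
    u->length = len;
    u->str = buf;
    return u;
}

/* Create an inline object; the data must be copied into the new block. */
static inline_ustr *inline_ustr_new(const wchar_t *src, size_t len)
{
    inline_ustr *u = malloc(sizeof *u + len * sizeof(wchar_t));
    if (u == NULL)
        return NULL;
    u->refcnt = 1;
    u->length = len;
    memcpy(u->data, src, len * sizeof(wchar_t));
    return u;
}

/* Returns 1 if the two views point into the same backing buffer while
   the inline object holds its own copy of the characters. */
static int sketch_demo(void)
{
    static const wchar_t text[] = L"hello world";
    ptr_ustr *a = ptr_ustr_view(text, 5);        /* "hello", shared */
    ptr_ustr *b = ptr_ustr_view(text + 6, 5);    /* "world", shared */
    inline_ustr *c = inline_ustr_new(text, 5);   /* "hello", copied */
    int ok = 0;

    if (a != NULL && b != NULL && c != NULL) {
        ok = a->str == text
             && b->str == text + 6
             && c->data != text
             && wmemcmp(c->data, a->str, 5) == 0;
    }
    free(a);
    free(b);
    free(c);
    return ok;
}
```

In this sketch the two views cost one small header each, regardless of
how long the text is, and the same idea would apply to a buffer obtained
from a memory-mapped file. The inline variant saves one allocation per
object but ties the data to the object itself.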

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue1943>
_______________________________________