[issue16475] Support object instancing and recursion in marshal

Kristján Valur Jónsson report at bugs.python.org
Mon Nov 19 12:35:43 CET 2012


Kristján Valur Jónsson added the comment:

If you have string sharing, support for general object sharing falls out automatically, with no extra effort.  In other words, there is no reason _not_ to support it.
Marshal may be primarily used for .pyc files but it is not the only usage.  It is a very fast and powerful serializer for data that is not subject to the overhead or safety concerns of the general pickle protocol.  This is illustrated by the following code (2.7):

    case TYPE_CODE:
        if (PyEval_GetRestricted()) {
            PyErr_SetString(PyExc_RuntimeError,
                "cannot unmarshal code objects in "
                "restricted execution mode");

Obviously, this shows that marshal is still expected to work and be useful even when it cannot be used for code objects.

It is good to know that you care about the size of the .pyc files, Martin.  But we should bear in mind that this size difference is directly reflected in the memory use of the loaded data.  A 25% reduction in .pyc size corresponds to roughly a 25% reduction in the memory used by the loaded code objects.

I haven't produced data about the savings of general object reuse because it relies on my "recode" code optimizer module, which is still a work in progress.  However, I will do some tests and let you know.  Suffice it to say that it is enormously frustrating to re-generate code objects with an optimization tool, sharing common or identical sub-objects and so on, and then find that the marshal module undoes all of that.
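
As a rough illustration of the loss of sharing described above (this sketch is not from the original message), the following compares a round trip through marshal with and without object references. It assumes a Python 3.4 or later interpreter, where marshal format version 3 records references to already-written objects and version 2 does not. The names `shared`, `data` and `roundtrip` are hypothetical and exist only for this example.

```python
import marshal

# Hypothetical data: one list object referenced twice from a container.
shared = list(range(100))
data = [shared, shared]


def roundtrip(value, version):
    """Serialize and reload `value` with the given marshal format version.

    Returns the reloaded object and the size of the serialized payload.
    """
    payload = marshal.dumps(value, version)
    return marshal.loads(payload), len(payload)


# Format version 2 has no object references, so the list is written twice
# and comes back as two distinct objects.
copy_v2, size_v2 = roundtrip(data, 2)

# Format version 3 writes the list once and refers back to it afterwards,
# so identity is preserved on load and the payload is smaller.
copy_v3, size_v3 = roundtrip(data, 3)

print("version 2: shared identity =", copy_v2[0] is copy_v2[1], "size =", size_v2)
print("version 3: shared identity =", copy_v3[0] is copy_v3[1], "size =", size_v3)
```

Both round trips produce equal values. They differ only in whether the two slots refer to one object after loading, which is the property an optimizer that deliberately shares sub-objects would want to keep.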

I'll report back with additional figures.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue16475>
_______________________________________

