Caching objects in a C extension

casevh casevh at gmail.com
Fri Jan 8 12:50:05 EST 2010


On Jan 8, 9:19 am, "Diez B. Roggisch" <de... at nospam.web.de> wrote:
> casevh schrieb:
>
> > On Jan 8, 2:59 am, "Diez B. Roggisch" <de... at nospam.web.de> wrote:
> >> casevh schrieb:
>
> >>> I'm working with a C extension that needs to rapidly create and delete
> >>> objects. I came up with an approach to cache objects that are being
> >>> deleted and resurrect them instead of creating new objects. It appears
> >>> to work well but I'm afraid I may be missing something (besides
> >>> heeding the warning in the documentation that _Py_NewReference is for
> >>> internal interpreter use only).
> >>> Below is a simplified version of the approach I'm using:
> >>> static void
> >>> MyType_dealloc(MyTypeObject *self)
> >>> {
> >>>     if(I_want_to_save_MyType(self)) {
> >>>         // Save the object pointer in a cache
> >>>         save_it(self);
> >>>     } else {
> >>>         PyObject_Del(self);
> >>>     }
> >>> }
> >>> static MyTypeObject *
> >>> MyType_new(void)
> >>> {
> >>>     MyTypeObject *self;
> >>>     if(there_is_an_object_in_the_cache) {
> >>>         self = get_object_from_cache();
> >>>         // Resurrect the cached object; its reference count becomes 1
> >>>         _Py_NewReference((PyObject*)self);
> >>>     } else {
> >>>         if(!(self = PyObject_New(MyTypeObject, &MyType)))
> >>>             return NULL;
> >>>         initialize_the_new_object(self);
> >>>     }
> >>>     return self;
> >>> }
> >>> The objects referenced in the cache have a reference count of 0 and I
> >>> don't increment the reference count until I need to resurrect the
> >>> object. Could these objects be clobbered by the garbage collector?
> >>> Would it be safer to create the new reference before stuffing the
> >>> object into the cache (even though it will look like there is a memory
> >>> leak when running under a debug build)?
> >> My gut feeling is that keeping a reference and using your own
> >> LRU scheme would be the safest, without resorting to dark magic.
>
> >> Diez
>
> > Thanks for the reply. I realized that I missed one detail. The objects
> > are created by the extension but are deleted by Python. I don't know
> > that an object is no longer needed until its tp_dealloc is called. At
> > that point, its reference count is 0.
>
> I don't fully understand. Whoever creates these objects, you get a
> reference to them at some point. Then you increment the ref-count
> (through the designated macros).
>
> All objects in your pool with refcount 1 are candidates for removal. All
> you need to do is keep a kind of timestamp with each of them, recording
> when it was released. If that's too old, fully release the object.
>
> Diez

These are numeric objects created by gmpy. I'm trying to minimize the
overhead for using mpz with small numbers. Objects are created and
deleted very often by the interpreter as expressions are evaluated. I
don't keep ownership of the objects.
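
The bookkeeping behind this kind of cache can be modeled outside the
interpreter. What follows is only a minimal, stand-alone sketch in plain C;
it is not gmpy's code and it does not use the CPython API. Obj, obj_new,
obj_dealloc, obj_decref and CACHE_SIZE are all hypothetical names, refcnt
stands in for ob_refcnt, and resetting refcnt to 1 plays the role that
_Py_NewReference plays in the snippet quoted above. An object whose count
drops to zero is parked in a bounded cache instead of being freed, and the
next allocation reuses it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SIZE 4            /* hypothetical cap on parked objects */

typedef struct {
    long refcnt;                /* stand-in for ob_refcnt */
    long value;                 /* stand-in for a small numeric payload */
} Obj;

static Obj *cache[CACHE_SIZE];  /* objects parked with refcnt == 0 */
static size_t in_cache = 0;     /* number of parked objects */
static size_t malloc_calls = 0; /* counts real allocations, for illustration */

/* Create an object, reusing a parked one when the cache is not empty. */
static Obj *obj_new(long value)
{
    Obj *self;

    if (in_cache > 0) {
        self = cache[--in_cache];       /* resurrect a parked object */
    } else {
        self = malloc(sizeof *self);
        if (self == NULL)
            return NULL;
        malloc_calls++;
    }
    self->refcnt = 1;                   /* the caller owns the only reference */
    self->value = value;
    return self;
}

/* Analogue of tp_dealloc: runs only once the count has reached zero. */
static void obj_dealloc(Obj *self)
{
    assert(self->refcnt == 0);
    if (in_cache < CACHE_SIZE)
        cache[in_cache++] = self;       /* park it for later reuse */
    else
        free(self);                     /* cache full: really release it */
}

/* Drop one reference and deallocate when none remain. */
static void obj_decref(Obj *self)
{
    if (--self->refcnt == 0)
        obj_dealloc(self);
}
```

In this model, creating an object, dropping its last reference, and
creating another one hands back the same block of memory after a single
call to malloc. The cap keeps the cache from holding on to an unbounded
amount of memory after a burst of temporaries.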

casevh