reference counting and PyTuple_SetItem

Thu Jun 10 21:09:49 EDT 2004

[Anne Wilson]
...
> Now I'm testing this:
>
>
> while(<some condition>) {
>      pyFiveMinArgs = PyTuple_New(PY_CALL_ARG_CNT);
>      PyTuple_SetItem(pyFiveMinArgs, 0, PyInt_FromLong(300L));
>      PyTuple_SetItem(pyFiveMinArgs, 1, PyInt_FromLong(1L));
>      PyTuple_SetItem(pyFiveMinArgs, 2, PyString_FromString("5min"));
>      PyTuple_SetItem(pyFiveMinArgs, 3, PyString_FromString(statsDir));
>      PyTuple_SetItem(pyFiveMinArgs, 4, PyString_FromString(rHost));
>
>      pyOneHourArgs = PyTuple_New(PY_CALL_ARG_CNT);
>      PyTuple_SetItem(pyOneHourArgs, 0, PyInt_FromLong(3600L));
>      PyTuple_SetItem(pyOneHourArgs, 1, PyInt_FromLong(15L));
>      PyTuple_SetItem(pyOneHourArgs, 2, PyString_FromString("1hr"));
>      PyTuple_SetItem(pyOneHourArgs, 3, PyString_FromString(statsDir));
>      PyTuple_SetItem(pyOneHourArgs, 4, PyString_FromString(rHost));
>      ...
>
>      pyValue = PyObject_CallObject(pyFunc, pyFiveMinArgs);
>      Py_DECREF(pyValue);
>      pyValue = PyObject_CallObject(pyFunc, pyOneHourArgs);
>      Py_DECREF(pyValue);
>
>      ...
>      Py_DECREF(pyFiveMinArgs);
>      Py_DECREF(pyOneHourArgs);
> }
>
>
> But, OW!  It pains me to be so inefficient, creating the same damn
> PyObjects over and over and over and over and over again.

OTOH, it's so clear as to be darned-near obvious now -- even though it's
still wrong (<wink> -- but virtually all C API calls can fail, and you really
do need to check each one for an error return; in particular, all the calls
here can at least run out of memory).

> To me this is way beyond "micro" optimization.  In my own code I'm
> actually doing this for three more cases beyond the fiveMin and oneHour
> stuff shown above.

Does it matter?  That is, have you profiled the code and determined that
this part is a bottleneck?  If not, optimization will introduce bugs and
waste *your* time (not to mention mine <wink>).

> Is there a more efficient way to do this embedded call?

Ignoring error-checking, most people would float the argument construction
outside the loop (faster), and use a higher-level API function to do the
calls (slower); e.g.,

i1 = PyInt_FromLong(1);
i15 = PyInt_FromLong(15);
i300 = PyInt_FromLong(300);
i3600 = PyInt_FromLong(3600);
statsdir = PyString_FromString(statsDir);
srhost = PyString_FromString(rHost);
s5min = PyString_FromString("5min");
s1hr = PyString_FromString("1hr");

while(<some condition>) {
     pyValue = PyObject_CallFunction(pyFunc, "OOOOO",
                    i300, i1, s5min, statsdir, srhost);
     Py_DECREF(pyValue);
     pyValue = PyObject_CallFunction(pyFunc, "OOOOO",
                    i3600, i15, s1hr, statsdir, srhost);
     Py_DECREF(pyValue);
}
Py_DECREF(i1);
Py_DECREF(i15);
[etc]

> (... other than rewriting the Python code in C...)  Can I use a list
> instead of tuple?

Not unless pyFunc takes a list as an argument.  You would have exactly the
same refcount woes:  it's not tuples that cause those, it's trying to
optimize low-level operations with flawed understanding of how they work.

"Do the simplest thing that could possibly work" is good advice here.  Write
tests (preferably before coding) to ensure that things continue to work.  If
timing shows the code is truly too slow, and profiling shows that this part
is truly the bottleneck, then it *may* be good to trade off maintainability
for speed.  But what you're doing in the C loop here is almost certainly
insignificantly expensive compared to the overhead of calling back into
Python at all.  Indeed, I'd seriously question why this part is coded in C
at all -- it's buying you bugs, but I'm not sure it's buying you anything
worth having.