[Cython] Cython syntax to pre-allocate lists for performance

Yury V. Zaytsev yury at shurup.com
Thu Mar 7 19:26:26 CET 2013


Hi Stefan,

On Thu, 2013-03-07 at 12:21 +0100, Stefan Behnel wrote:

> Note that Python has an algorithm for shrinking a list on appending, so
> this might not be sufficient for your use case. What do you need it for?

W00t! I didn't know about that.

I'm wrapping a C++ code that should transmit large lists of objects to
Python, while these objects are stored into something vector-like, which
shouldn't get exposed directly. In the past, they did something like

    obj = PyList_New(a.size());
    for (a.begin(); a.end(); ++a) PyList_SetItem(obj, ...)

I figured I can translate it into a while loop like

    obj = []
    while (it != a.end()):
        obj.append(it)
        inc(it)

but then I'm not using the information about the size of a that I
already have, and for huge lists this tends to be quite slow.

I think this must be quite a common use case for bindings...

> And why won't [None]*N help you out? It should be pretty cheap.

It probably will, at least a bit. It just stroke me that if I'm going to
do something along the lines of

    idx = 0
    obj = [None] * a.size()
    while (it != a.end()):
        obj[idx] = it
        idx += 1
        inc(it)

I could also squeeze the last bits of performance by avoiding the
creation of Nones and subsequently populating the list with them.

If you say I have to use PyList_New directly, oh well... It's just that
now since I'm rewriting the bindings in Cython anyways, I'm also trying
to avoid using Python C API directly as much as possible.

> Won't list comprehensions work for you? They could potentially be adapted
> to presize the list.

I guess not.

-- 
Sincerely yours,
Yury V. Zaytsev




More information about the cython-devel mailing list