[Cython] Cython syntax to pre-allocate lists for performance

Robert Bradshaw robertwb at gmail.com
Thu Mar 7 19:07:53 CET 2013


On Thu, Mar 7, 2013 at 6:48 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Zaur Shibzukhov, 07.03.2013 15:39:
>> 2013/3/7 Zaur Shibzukhov
>>
>>> I guess the problem is to construct new (even empty) list with
>>> pre-allocated memory exactly for N elements.
>>>
>>> N*[NULL] - changes semantics because there can't be list with N elements
>>> and filled by NULL.
>>> N*[None] - more expansive for further assignments because of Py_DECREFs.
>>>
>>> I suppose that N*[] could do the trick.
>
> That looks wrong to me.
>
>
>>> It could be optimized so that N*[]
>>> is equal to an empty list but with preallocated memory exactly for N
>>> elements. Could it be?
>>
>> Cython optimize already PyList_Append very well. Theofore scenario when
>> first one create empty list with exactly prealocated memory for N elements
>>  and second eval elements of the list and add them using plain list.append
>> could optimized in Cython very well too. As result constructed list will
>> contain memory only for N elements. This allows don't vast memory when one
>> need to build many many lists with relative big size.
>
> I prefer not adding any new syntax as long as we are not sure we can't fix
> it by making list comprehensions smarter. I tried this a while ago and have
> some initial code lying around somewhere in my patch queue. Didn't have the
> time to make it any usable, though, also because Cython didn't have its own
> append() for list comprehensions at the time. It does now, as you noted.

There are several cases where we can get the size of the result list
upfront, we can certainly do better here now. I'm also -1 to adding
special syntax for populating a list with NULL values, if you really
want to do this (and I doubt it really matters in most cases) calling
PyList_New is the "syntax" to use.

- Robert


More information about the cython-devel mailing list