Strange array.array performance

Fri Feb 20 07:23:09 EST 2009

On Fri, Feb 20, 2009 at 2:42 AM, Scott David Daniels
<Scott.Daniels at acm.org> wrote:
> Maxim Khitrov wrote:
>>
>> ... Here's the function that I'll be using from now on. It gives me
>> exactly the behavior I need, with an int initializer being treated as
>> array size. Still not as efficient as it could be if supported
>> natively by array (one malloc instead of two + memmove + extra
>> function call), but very good performance nevertheless:
>>
>> from array import array as _array
>> array_null = dict((tc, '\0' * _array(tc).itemsize) for tc in
>> 'cbBuhHiIlLfd')
>
> How about:
>  array_null = dict((tc, _array(tc, (0,)).tostring()) for tc in
> 'cbBuhHiIlLfd')
> ...
> (some ancient floating points did not use all-0 bytes for 0.0).

Didn't know that, thanks. I actually got rid of the dict, since
benchmarks showed access time to itemsize in the function itself is
not any slower than dict access. After going through all the different
speed tests yesterday the function now looks like this:

from array import array as _array

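def array(typecode, init):
	if isinstance(init, int):
		# An int initializer is treated as the array size (zero-filled).
		if init <= 0:
			# (init - 1) * string would be empty here, leaving one item.
			return _array(typecode)

		a = _array(typecode, (0,))

		if a.itemsize * init > 1048576:
			# Over 1MB: repeat the array, no temporary string needed.
			return a * init
		else:
			# Up to 1MB: append init - 1 more zeros from a string.
			a.fromstring((init - 1) * a.tostring())
			return a

	return _array(typecode, init)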

This uses the fast fromstring operation when creating an array that is
at most 1MB in size. Over that, array multiplication is used, which
is slower, but doesn't require building a temporary string nearly as
large as the array itself.

- Max