[Numpy-discussion] recarray field names

Fernando Perez Fernando.Perez at colorado.edu
Wed Mar 15 16:30:04 EST 2006


Erin Sheldon wrote:

> Yes, I see, but I think you meant
> 
>     if name in t.dtype.fields.keys():

No, he really meant:

if name in t.dtype.fields:

Dictionaries support membership tests and iteration directly, so you don't 
need to construct the list of keys separately.  Doing so is just a redundant 
waste of time and memory in most cases, unless you intend to modify the dict 
in your loop, in which case the iterator approach won't work and you /do/ 
need the explicit keys() call.
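
As a rough illustration of that caveat, here is a minimal sketch using a plain 
dict as a hypothetical stand-in for t.dtype.fields (the function names and the 
'_tmp' suffix are made up for the example).  Note that in Python 3 keys() 
returns a view rather than a list, so the snapshot has to be taken with 
list(...):

```python
def drop_temporary_unsafe(fields):
    """Delete keys while iterating the dict itself; this fails."""
    for name in fields:
        if name.endswith("_tmp"):
            del fields[name]      # changes the dict's size mid-iteration
    return fields


def drop_temporary(fields):
    """Delete keys while iterating over a snapshot of the keys."""
    for name in list(fields):     # list(...) copies the keys up front
        if name.endswith("_tmp"):
            del fields[name]
    return fields


# Hypothetical field table, not taken from the original thread.
try:
    drop_temporary_unsafe({"a_tmp": 1, "x": 2, "y": 3})
    unsafe_failed = False
except RuntimeError:
    unsafe_failed = True

cleaned = drop_temporary({"a_tmp": 1, "x": 2, "y": 3})
print(unsafe_failed, sorted(cleaned))
```

The plain membership test ("x" in fields) needs no copy at all; the snapshot 
is only worth its cost when the loop body adds or removes keys.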

In addition

if name in t.dtype.fields

is faster than:

if name in t.dtype.fields.keys()

The two do not have the same complexity.  The first is O(1) on average: it 
requires a single call to the hash function on 'name' and then a C-level 
lookup in the dict's internal hash table.  The second is O(N): it builds a 
list of the keys and then walks through it, testing each element for 
equality.

In [15]: nkeys = 1000000

In [16]: dct = dict(zip(keys,[None]*len(keys)))

In [17]: time bool(-1 in keys)
CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
Wall time: 0.01
Out[17]: False

In [18]: time bool(-1 in dct)
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.00
Out[18]: False
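
For anyone who wants to repeat the comparison outside IPython, the following 
is a minimal sketch with the standard timeit module.  The key count and the 
number of repetitions are arbitrary choices for the example, and the absolute 
timings will depend on the machine:

```python
import timeit

nkeys = 100_000                   # arbitrary size, smaller than in the session above
keys = list(range(nkeys))         # an explicit list of keys
dct = dict.fromkeys(keys)         # same keys, all mapped to None

# -1 is absent, so the list search has to scan every element.
t_list = timeit.timeit(lambda: -1 in keys, number=20)
t_dict = timeit.timeit(lambda: -1 in dct, number=20)

print("list: %.6f s   dict: %.6f s" % (t_list, t_dict))
```

Searching for a missing key is the worst case for the list, which is why it 
makes the difference easy to see.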


In realistic cases for your original question you are not likely to see the 
difference, but it's always a good idea to be aware of the performance 
characteristics of various approaches.  For a different problem, there may 
well be a real difference.

Cheers,

f



