[Numpy-discussion] recarray field names
Fernando Perez
Fernando.Perez at colorado.edu
Wed Mar 15 16:30:04 EST 2006
Erin Sheldon wrote:
> Yes, I see, but I think you meant
>
> if name in t.dtype.fields.keys():
No, he really meant:
if name in t.dtype.fields:
Dictionaries are iterable and support membership tests directly, so you don't
need to construct the list of keys separately. Doing so is just a redundant
waste of time and memory in most cases, unless you intend to modify the dict in
your loop, in which case iterating over the dict itself won't work and you /do/
need the explicit keys() call.
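As a minimal sketch of that exception, assuming a current Python 3 (where keys() returns a view rather than a list, so the snapshot is made with list(...)), and using a made-up dict called fields in place of t.dtype.fields:

```python
# Hypothetical dict standing in for any mapping you want to prune.
fields = {"x": 1, "y": 2, "tmp_a": 3, "tmp_b": 4}

# Plain membership test: no keys() call needed.
has_x = "x" in fields

# Deleting entries while iterating over the dict itself raises
# RuntimeError, so iterate over a snapshot of the keys instead.
for name in list(fields.keys()):
    if name.startswith("tmp_"):
        del fields[name]

remaining = sorted(fields)
print(has_x, remaining)
```

The snapshot costs one extra list, which is exactly the overhead you avoid in the read-only case.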
In addition
if name in t.dtype.fields
is faster than:
if name in t.dtype.fields.keys()
The first is O(1) on average: it requires a single call to the hash function
on 'name' and then a C lookup in the dict's internal hash table. The second is
O(N): it builds a list of the keys and then walks through that list directly,
testing each element for equality.
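One way to see the difference is to count the calls. The sketch below uses a hypothetical key class, CountingKey, that is not part of numpy or of this thread; it only records how often it is hashed and compared when probing a list and a dict for a missing key:

```python
class CountingKey:
    """Hypothetical key type that counts hash and equality calls."""

    hash_calls = 0
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        CountingKey.eq_calls += 1
        return self.value == other.value


n = 1000
keys = [CountingKey(i) for i in range(n)]
dct = dict.fromkeys(keys)      # hashes every key once while building
probe = CountingKey(-1)        # a key that is not present

# Probe the list: reset the counters, then test membership.
CountingKey.hash_calls = CountingKey.eq_calls = 0
in_list = probe in keys
list_counts = (CountingKey.hash_calls, CountingKey.eq_calls)

# Probe the dict the same way.
CountingKey.hash_calls = CountingKey.eq_calls = 0
in_dict = probe in dct
dict_counts = (CountingKey.hash_calls, CountingKey.eq_calls)

print("list (hash, eq) calls:", list_counts)
print("dict (hash, eq) calls:", dict_counts)
```

The list probe never hashes and compares against every element; the dict probe hashes once and only falls back to an equality test when a stored key has the same hash.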
In [15]: nkeys = 1000000; keys = list(range(nkeys))
In [16]: dct = dict(zip(keys,[None]*len(keys)))
In [17]: time bool(-1 in keys)
CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
Wall time: 0.01
Out[17]: False
In [18]: time bool(-1 in dct)
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.00
Out[18]: False
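For a comparison that can be rerun outside IPython, here is a rough sketch using the standard timeit module. The key count and repeat count are arbitrary choices of mine, smaller than in the session above so that it finishes quickly, and the absolute numbers will vary by machine:

```python
import timeit

nkeys = 100_000                    # arbitrary size, kept small for speed
keys = list(range(nkeys))          # a real list, so `in` has to scan it
dct = dict.fromkeys(keys)          # same keys, every value is None

# Worst case for the list: -1 is absent, so every element is compared.
list_time = timeit.timeit(lambda: -1 in keys, number=100)
dict_time = timeit.timeit(lambda: -1 in dct, number=100)

print(f"list: {list_time:.6f} s   dict: {dict_time:.6f} s   (100 lookups each)")
```

Repeating each lookup 100 times keeps the dict timing from disappearing into timer resolution, which is what the 0.00 s reading above amounts to.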
In realistic cases for your original question you are not likely to see the
difference, but it's always a good idea to be aware of the performance
characteristics of various approaches. For a different problem, there may
well be a real difference.
Cheers,
f