grab dict keys/values without iterating ?!

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Dec 11 08:44:56 EST 2013


On Wed, 11 Dec 2013 12:07:08 +0200, Tamer Higazi wrote:

> Hi Dave!
> 
> You were absolutely right.
> I don't want to iterate the entire dict to get me the key/values
> 
> Let us say this dict would have 20.000 entries, but I want only those
> with "Aa" to be grabbed.
> Those starting with these 2 letters would be only 5 or 6 then it would
> take a lot of time.

What do you mean by "a lot of time"?

Here is a small test. I set up a dict with 456976 keys, and then iterate 
over them in just over a quarter of a second on my (old, slow) computer. 
Here is the code I use:



data = {}
letters = "abcdefghijklmnopqrstuvwxyz"
for a in letters.upper():
    for b in letters:
        for c in letters:
            for d in letters:
                key = a + b + c + d
                data[key] = None

print(len(data))

count = 0
with Timer():
    for key in data:
        if key.startswith("Aa"):
            count += 1

print("Found %d keys starting with 'Aa'" % count)



Timer is a context manager, not part of the Python standard library, but
you can find it here:

http://code.activestate.com/recipes/577896
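
If you would rather not download the recipe, a minimal stand-in is easy to
write. The sketch below is an assumption about what such a timer does, not
the recipe's actual code: the attribute name "elapsed" and the message
format are made up for illustration.

```python
import time


class Timer:
    """Hypothetical stand-in for the recipe's Timer (not the recipe itself)."""

    def __enter__(self):
        # Record the starting time when the with-block is entered.
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Store and report the elapsed time when the with-block exits.
        self.elapsed = time.perf_counter() - self.start
        print("time taken: %.3f seconds" % self.elapsed)
        return False  # do not suppress exceptions raised inside the block


with Timer() as t:
    total = sum(range(100000))
```

Used this way, the time is printed as soon as the block finishes, and is
also kept in t.elapsed in case you want it afterwards.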


Are you sure that just using a normal dict will be too slow?


> In which way would you prefer to store the data, and which functions or
> methods would you use effectively to accomplish this task ?

I would use a dict, and iterate over the keys, until such time that I knew 
that iterating was the bottleneck causing my code to be too slow. Until 
I knew that absolutely for sure, I would not optimize.
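
As a rough sketch of that plain-dict approach at the size mentioned in the
question, here is one way to collect the matching key/value pairs instead
of just counting them. The data is made up for illustration (20,000
four-letter keys, all with the value None); only the dict comprehension
at the end matters.

```python
import itertools
import string

# Hypothetical data: the first 20,000 keys of the form Xyyy
# (one capital letter followed by three lowercase letters).
all_keys = (
    "".join(parts)
    for parts in itertools.product(
        string.ascii_uppercase,
        string.ascii_lowercase,
        string.ascii_lowercase,
        string.ascii_lowercase,
    )
)
data = {key: None for key in itertools.islice(all_keys, 20000)}

# One pass over the whole dict, keeping only the items whose key
# starts with the wanted prefix.
matches = {k: v for k, v in data.items() if k.startswith("Aa")}

print("Found %d of %d keys" % (len(matches), len(data)))
```

This still visits every key once, which is the point: measure it first, and
only look for something cleverer if it turns out to be too slow.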

If necessary, I would consider having 26 dicts, one for each initial 
letter:

data = {}
for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
    data[c] = {}

then store each key in the dict for its initial letter. That way, if I 
wanted keys starting with Aa, I would only search the A dict, not the B 
dict, C dict, etc.

key = "Aardvark"
data[key[0]][key] = "some value"
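
Putting the pieces together, a lookup against the 26 dicts might look
something like this. The helper names store() and keys_with_prefix() are
invented for this sketch, and it assumes every key and every prefix is
non-empty and that keys start with a capital A-Z.

```python
import string

# One inner dict per initial capital letter, as described above.
data = {c: {} for c in string.ascii_uppercase}


def store(key, value):
    # File the key under the dict for its first letter.
    # Raises KeyError if the key does not start with A-Z.
    data[key[0]][key] = value


def keys_with_prefix(prefix):
    # Only the dict for the prefix's first letter is searched;
    # the other 25 dicts are never touched.
    bucket = data.get(prefix[0], {})
    return [k for k in bucket if k.startswith(prefix)]


store("Aardvark", "some value")
store("Aaron", 1)
store("Abacus", 2)
store("Badger", 3)

matches = sorted(keys_with_prefix("Aa"))
print(matches)
```

Note that this still iterates, just over a smaller dict. If the keys are
spread evenly over the alphabet, each search looks at roughly 1/26 of them.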



-- 
Steven


