searching a value of a dict (each value is a list)

Jeremy C B Nicoll jeremy at omba.demon.co.uk
Sun Dec 9 21:04:04 EST 2007


Seongsu Lee <senux at senux.com> wrote:

> The reason I use the dict for my data is to speed up the search by key.

Ok, I understand that once the overhead of creating the dict has been done,
getting access to values within it is quick.  And taking the time to create
a set of reverse keys speeds up the reverse access.

Rather than scanning the whole dict and creating reverse keys for everything
in it first, there might be an advantage in making search logic test if
there is a reverse key for the negative integer to be searched for, and if
so use it, otherwise scan the dict creating and examining reverse keys until
the integer is found.  That way if the integer you're looking for is early
in the dict you'd only create reverse keys for the integers from the start
of the dict until the required one.  


We don't know what external(?) process created the data that you've stored
in the dict, nor what you use it for - or more to the point - how often.  If
you're going to make just one search of that data then there's little point
in having a fast search after a slow dict creation.  On the other hand if
you have many many searches to do the initial overhead might be acceptable.


(I don't know how slow creating the dict would be for a typical example of a
million keys each keying lists of 1-1000 integers.) 
 
> The code could create also a reverse index (a reverse dict) at the
> time. But as you said, it waste the space and I wanted to ask someone
> who may know some way to reduce the waste of space while searching
> fast.

Is the dict used by anything else?  If the data in it was held in some other
form would that cause your program (or other programs) lots of problems?  If
the range of values of the integers being stored is suitable, you might
sensibly use several or many smaller dicts to store all the data (and thus
save time reverse-keying much less of it).

-- 
Jeremy C B Nicoll - my opinions are my own.



More information about the Python-list mailing list