How to test if one dict is subset of another?
Diez B. Roggisch
deets at nospam.web.de
Mon Feb 19 03:17:38 EST 2007
Jay Tee wrote:
> Hi,
>
> I have some code that does, essentially, the following:
>
> - gather information on tens of thousands of items (in this case, jobs
>   running on a compute cluster)
> - store the information as a list (one per job) of Job items
>   (essentially wrapped dictionaries mapping attribute names to values)
>
> and then does some computations on the data. One of the things the
> code needs to do, very often, is troll through the list and find jobs
> of a certain class:
>
> for j in jobs:
>     if (j.get('user') == 'jeff' and j.get('state')=='running') :
>         do_something()
>
> This operation is ultimately the limiting factor in the performance.
> What I would like to try, if it is possible, is instead do something
> like this:
>
> if j.subset_attr({'user' : 'jeff', 'state' : 'running'}) :
>     do_something()
>
>
> where subset_attr would see if the dict passed in was a subset of the
> underlying attribute dict of j:
This would still have to run over every item in jobs, so there is no gain.
>
> j1's dict : { 'user' : 'jeff', 'start' : 43, 'queue' : 'qlong',
> 'state' : 'running' }
> j2's dict : { 'user' : 'jeff', 'start' : 57, 'queue' : 'qlong',
> 'state' : 'queued' }
>
> so in the second snippet, if j was j1 then subset_attr would return
> true, for j2 the answer would be false (because of the 'state' value
> not being the same).
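For reference, the subset test itself could be written roughly as below. This is only a minimal sketch, assuming each job can be treated as a plain dict; is_subset is a hypothetical helper name, not something from the original code. It still has to visit every job, so it does not address the performance problem.

```python
def is_subset(query, attrs):
    """Return True if every key/value pair in query also appears in attrs."""
    missing = object()  # sentinel, so a stored None is not mistaken for "absent"
    return all(attrs.get(k, missing) == v for k, v in query.items())

# The two example jobs from the question, as plain dicts (an assumption).
j1 = {'user': 'jeff', 'start': 43, 'queue': 'qlong', 'state': 'running'}
j2 = {'user': 'jeff', 'start': 57, 'queue': 'qlong', 'state': 'queued'}
query = {'user': 'jeff', 'state': 'running'}

# Still a full scan: one is_subset call per job.
matches = [j for j in (j1, j2) if is_subset(query, j)]
```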
If your jobs dictionary is immutable regarding the key-set (not by
its implementation, but by its usage), the thing you can do to
enhance performance is to create an index. Take a predicate like
def p(j):
    return j.get('user') == 'jeff'
and build a list
jeffs_jobs = [j for j in jobs if p(j)]
Then you can test only over these. Alternatively, if you have quite a
few of such predicate/action pairs, try to loop once over all jobs,
applying the predicates and actions accordingly.
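Both ideas can be sketched roughly as follows. This assumes the jobs behave like plain dicts and that the indexed attributes do not change once the index is built; build_index, dispatch, by_user_state and started are hypothetical names used only for illustration.

```python
from collections import defaultdict

# Sample data taken from the question; plain dicts stand in for Job objects.
jobs = [
    {'user': 'jeff', 'start': 43, 'queue': 'qlong', 'state': 'running'},
    {'user': 'jeff', 'start': 57, 'queue': 'qlong', 'state': 'queued'},
]

def build_index(jobs, keys):
    """Group jobs by the tuple of their values for the given attribute names."""
    index = defaultdict(list)
    for j in jobs:
        index[tuple(j.get(k) for k in keys)].append(j)
    return index

# One pass to build; afterwards each lookup touches only the matching jobs.
# The index goes stale if a job's 'user' or 'state' changes after this point.
by_user_state = build_index(jobs, ('user', 'state'))
jeffs_running = by_user_state.get(('jeff', 'running'), [])

def dispatch(jobs, rules):
    """Single loop over jobs, applying every (predicate, action) pair."""
    for j in jobs:
        for predicate, action in rules:
            if predicate(j):
                action(j)

started = []  # collects start values; stands in for a real do_something()
rules = [
    (lambda j: j.get('user') == 'jeff' and j.get('state') == 'running',
     lambda j: started.append(j['start'])),
]
dispatch(jobs, rules)
```

The index pays off when the same kind of query is repeated many times against an unchanging job list; the single-pass dispatch is the better fit when there are many different predicates and each is needed only once.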
Diez