Comparison

Steven D'Aprano steve+comp.lang.python at pearwood.info
Tue Sep 23 07:13:16 EDT 2014


LJ wrote:

> I have a network in which the nodes are defined as dictionaries using the
> NetworkX package. Inside each node (each dictionary) I defined a
> dictionary of dictionaries holding attributes corresponding to different
> ways in which the node can be reached (these dictionaries I refer to as
> labels). At some point in my algorithm I loop through some subset of
> nodes and through the labels in each node, and I perform some "joining"
> checks with the labels of each node in another subset of nodes. To
> clarify, I check a feasibility condition in a pairwise manner between
> every label in one node and every label of another. This nested loop is
> taking time. I wonder whether defining the labels using numpy instead of
> dictionaries would make this loop faster.

I doubt it very much. Python dicts are extremely fast, and it doesn't sound
like your problem is that each dict lookup is slow. It sounds like your
problem is that you are doing a lot of them.
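If I've understood your description, each node's attribute dict holds
something roughly like this (all the names and fields below are guesses
on my part, just to make the discussion concrete):

import networkx as nx

G = nx.Graph()

# One inner dict per way ("label") of reaching the node; the
# attribute names are invented, I don't know your real ones.
G.add_node("a", labels={
    "via_x": {"cost": 3, "time": 7},
    "via_y": {"cost": 5, "time": 2},
})
G.add_node("b", labels={
    "via_z": {"cost": 1, "time": 4},
})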

If you have 100 labels in one node, and 100 labels in the other, and you are
doing:

for label_a in A:
    for label_b in B:
        is_feasible(label_a, label_b)

the total number of feasibility checks is 100*100 = 10000, not the 200 that
you may be expecting. The bottleneck here is probably your is_feasible
check, called many times, not the dict look-ups.
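If you want to see exactly how big that number gets in your real code,
a throwaway counting wrapper is enough. This is only a sketch: the
is_feasible body and the label data below are stand-ins, not your
actual check.

import functools

def counted(func):
    # Wrap a function so we can see how many times it is called.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@counted
def is_feasible(label_a, label_b):
    # Stand-in for your real pairwise check.
    return label_a["cost"] + label_b["cost"] < 10

A = [{"cost": i % 7} for i in range(100)]  # stand-ins for the two
B = [{"cost": i % 5} for i in range(100)]  # nodes' label collections

for label_a in A:
    for label_b in B:
        is_feasible(label_a, label_b)

print(is_feasible.calls)  # 10000, not 200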

But of course we can only guess without seeing your code, and nobody can
be certain until you run the profiler and see exactly where your
bottlenecks are.
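
The quickest way to do that is something like the following. The main
function here is purely illustrative; substitute whatever actually
drives your algorithm.

import cProfile

def main():
    pass  # replace with the code that runs your algorithm

cProfile.run("main()", sort="cumulative")

Or, from the shell, without touching the code at all:

python -m cProfile -s cumulative yourscript.py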


-- 
Steven
