Graph library for Python

Wed Dec 9 19:41:11 EST 2009

>> Generally, we've tried to discourage people from instantiating
>> nodes and edges directly, in favor of having them controlled
>> through the graph. Maybe something along the lines of:
>>
>> g = Graph(nodes=['a', 'b', 'c'], edges=[('a', 'b'), ('a', 'c'), ('b', 'c')])
>>
>> ?
>
> That would work as well, but you then miss out on the extra
> parameters you can pass to Edge() and Node().

Just pushed a change that will allow that.

> In another email you wrote that Edge() and Node() constructors
> should not be used directly, since they have to know the Graph
> to which they belong.
>
> Wouldn't it be possible to implement this kind of parent-
> referencing in a lazy way, ie. by postponing all the init-
> processing until the Graph instance is known ?

While possible, I'm wary of this. We tried it in an initial
prototype (all nodes and edges descended naturally from
a Universe graph until otherwise stated) and that way lie
madness. Perhaps at some point someone would like to
give another shot at it, though.

>> Beware, that doesn't just match nodes.
>>
>> g = Graph()
>> g.add_node('a')
>> g.add_node('b')
>> g.add_edge('a', 'b', 'ab')
>> 'ab' in g # returns true
>
> I'd avoid such an ambiguity. It could easily hide programming
> errors (testing for edges instead of nodes).
>
> OTOH, you could also regard the graph as a set of nodes
> and edges (as you apparently already do). In that case,
> you'd define
>
> for x in g: print x
>
> as iteration of both nodes and edges in some arbitrary
> order and then use the more specific:
>
> for n in g.nodes: print x
> for e in g.edges: print x
>
> for iteration over just the nodes or edges.

Yup, that's exactly what we do.

>>>>>> g.get_edge_attributes('weight', 'color')
>>> {"ab": {"weight": 3, "color": "blue"},
>>>  "ac": {"weight": 2, "color": "green"},
>>>  "bc": {"weight": 1, "color": "red"}}
>>>
>>> Again, the idea is to reduce call overhead and later
>>> on be able to move these lookups to C.
>>
>> Entirely doable, but I'm not sure I see the use case.
>> Would you mind providing a more realistic example?
>
> The idea is that you use the Graph representation of the
> data to implement some algorithm e.g. optimizing weights
> for example.
>
> The algorithm could be implemented in C to be fast enough
> for large data sets.
>
> The above two methods would then be used to quickly push the
> original data into the graph and extract it again after
> the algorithm has run.

I can see this, but I think the cleaner way is just to iterate
over nodes and edges. Having said that, if somebody wants
to come up with some timing data and it seems to provide a
big advantage, I think robbie for one would make the change.

>> Hmm. Sounds like a plausible use case to me, although I'm
>> not sure its one that should be encouraged. The bigger
>> question in my mind is whether all attribute lookups should
>> have to pay the extra lookup cost to support a somewhat
>> narrow need. I'll definitely have to talk with the other devs
>> about this one, and run a little bit of timing besides.
>
> This would only be an optional second way of accessing
> the .data dictionary.
>
> node.data['pass'] = 1
> node.data['from'] = 'Las Vegas'
> node.data['to'] = 'New York'
>
> would work without such modifications.

Yes, but the change is not reflected on the node. For
example:

>>> g = Graph(nodes={'a':{'spotted':True}})
>>> g['a'].data['spotted'] = False
>>> g['a'].data['spotted']
True

I can see how this would represent a gotcha, though.
Is there a general opinion on whether this is a plus or
a minus?

Geremy Condra