Standard graph API?

Sat Sep 11 06:09:27 EDT 2004

Andrew Dalke <adalke at mindspring.com> writes:

> Magnus Lie Hetland wrote:
>> Hm. How would the algorithms work without a standard API?
>
> There are certain things the different graphs have in
> common.  For example,
>  1. "test if two nodes/edges are the same"
>  2. "test if two nodes/edges are identically colored"
>  3. "list of all nodes adjacent to a node"
>  4. "list of all edges from a node"  (either directed or
>      undirected)
>  5. "get the appropriate weight"
>
> Different graphs have different ways to do this.

[...]

> The ways to get the properties differ but the things
> you do with them do not change.
>
> I can concieve then some way to generate code
> based on a template, like this
>
>  dfs =  make_code(dfs_template,
>                   args = "node, handler",
>                   bond_neighbors = "node.xatoms",
>                   on_enter = "handler.enter(node)")
>
>   .. make the graph ...
>  class Handler:
>     def enter(self, node):
>       print "Hi", node
>
>  dfs(graph, Handler())

Yuk.

This is *exactly* the type of thing that adaptation (PEP 246,
PyProtocols) is designed to support:

    # pseudo-code here, I don't know the exact PEP 246 code form off
    # the top of my head.
    class IGraph(Protocol):
        def nodes_same(n1, n2):
            "No implementation here - this is just a protocol"
        # etc

    class IDFSVisitor(Protocol):
        # mode protocol defns...

    def dfs(g, h):
        g = adapt(g, IGraph)
        h = adapt(h, IDFSVisitor)
        # code, using standard method names

Now, for your molecular graphs, you just write an adapter to make the
graph conform to the IGraph protocol. In theory, the adaptation
should be maximally efficient (so that you don't have to worry about
overheads of unnecessary wrappers), in practice I don't know how
close PyProtocols comes (although I believe it's good).

>> To clarify, by API I here mean a protocol, i.e. a definition of how a
>> graph is supposed to behave -- *not* a standard implementation.
>
> I think we're talking about the same thing -- the sorts
> of things you can do with nodes and edges, and not a
> statement that a node or edge has a given property or method.
>
>> I've been thinking a bit about the use of object adaptation here; I
>> think it would be quite perfect. One possibility would be to use the
>> functionality of PyProtocols, but that's hardly standard... But it
>> would allow you to do stuff like
>>   graph = AdjacencyMap(someStrangeMolecularGraph)
>>   # or graph = adapt(someStrange..., AdjacencyMap)
>>   graphAlgorithm(graph)

Yes, this (to me) would be ideal. The issue of not being standard is
a bit circular - it's not standard because there aren't enough
examples of where it is needed, but people don't use it because it's
not standard.

I believe Guido also has some concerns over how interfaces "should"
work - but he's unlikely to do anything about that unless someone
forces the issue by championing PEP 246. It might be worth asking
Philip Eby if he would be happy to see PyProtocols added to the
standard library.

> The problem is all the conversion from/to my graph
> form to the standard graph form.  Either the adapters
> have to be live (so the modification to the standard
> form get propogated back to my graph) or there needs
> to be conversions between the two.  Both sound slow.

My understanding is that a well-written adapter can be very
efficient. But I don't know in practice how that is achieved. And
obviously, a particular operation will never be efficient if the
underlying representation can't support it efficiently.

>> Wouldn't it be *much* better to use the established (but not standard)
>> mechanism of object adaptation, as championed in PEP 246 and as
>> implemented and extended upon in PyProtocols?
>
> Not really.  Consider the events that could occur in a
> DFS.  There's
>   - on enter
>   - on exit
>   - on backtrack
> and probably a few more that could be used in a general
> purpose DFS.  But I might need only one of them.  With
> a PE 246 adapter I can adapt my graph to fit the algorithm,
> but if I don't need all of those events there's no way
> to adapt the algorithm to fit my needs.

You could adapt a visitor class to fit a standard DFS-visitor
protocol.

> (Yes, even though I'm using Python I'm still worried
> about efficiency.)

I worry about efficiency, as well. But my experiments showed call
overhead as the killer - adding an "on backtrack" callback to the
algorithm, and calling it with a visitor which had a null
implementation of the callback still added a chunk of overhead. Caling
a "do-nothing" callback for each node in a 10,000 node graph isn't
free. Of course, this argues for the template "build code on the fly"
approach, which I don't relish. Anyone know another good way of
avoiding this overhead?

>> (If only adapt() could become a standard library function... <sigh> ;)
>
> Perhaps someday, when people get more experience with it.
> I've not yet used it.

It needs a champion. Someone to do the work to get it into the
library. We have a PEP, and an implementation (PyProtocols) so I
suspect that it's a job that wouldn't require huge technical skills.

> I found that 'is' testing for my graphs is much better.
> At the very least, it's a lot faster (no method call overhead).

You're never going to avoid a method call in any standard - there's
going to be *someone* for whom "is" is inappropriate. So the standard
will have to cater for that.

Paul.
-- 
Home computers are being called upon to perform many new functions,
including the consumption of homework formerly eaten by the dog --
Doug Larson