TopSort in Python?

Dinu C. Gherman gherman at my-deja.com
Tue Jul 6 06:40:48 EDT 1999


Thanks for the two versions! They seem both appropriate,
although MCF's has the strange feature of adding (1,) here
or there. TP's has a great Exception class, it seems...

Ok, last week-end I typed in the algorithm sitting on my
own shelf at home (just to add one more). I haven't made any
timings, but I like that this one below is pretty short and
easy to understand. The top-level function parameters are
a bit different, but I wrote a wrapper function for using
the "more regular" one, too...

Anyway,-not-really-that-important-now'ly,

Dinu



"""A simple and short topological sorting algorithm in Python.

Input

The input is a set of unique elements (vertices), V, and a di-
rected graph of edges between them, E, where V is a list of
any Python objects and E a list of (v, w) tuples, with v and
w representing indices in V.

Output

The output is L = TS(V, E), a sorted list of all elements in
V, given E. Elements x in V without any (v, w) in E for with
v == x or w == x are prepended to L. Moreover, for all v, w
in V, v != w: (v, w) in E => index(L, v) < index(L, w).

Algorithm

In the directed graph we progressively pick one leaf node,
remove all adges to it and add it at the head of the (ini-
tially empty) result list. If there is no more such leaf
node, we have either a cycle (if there are still edges/
nodes left) or we're done. Isolated nodes (with neither in-
nor outcoming edges) are inserted at the head of the result
list.

Notes

Much of the algorithm can be changed to adapt to different
needs (while still calling it topsort), like the order of
isolated nodes, throwing a real exception when a cycle is
detected, a more efficient implementation of allEdgesFrom(),
or a randomized order of elements where multiple solutions
do exist, the omission of the vertices list parameter in the
top-level function (meaning using edges (v, v) in E), the
use of a class to remove some of the parameter passing, etc.

Dinu Gherman,
July 1999
"""


import sys


def allEdgesFrom(v0, E):
    "Return a list of all edges (v0, w) in E starting at v0."

    resEdges = []
    for v, w in E:
        if v0 == v:
            resEdges.append((v, w))
    return resEdges


def _topSort(v, E, visited, sorted, sortedIndices):
    "Recursive topsort function."

    # print "trying", v, visited, sorted, sortedIndices
    visited[v] = 1
    for v, w in allEdgesFrom(v, E):
        # print "edge found", (v, w)
        if not visited[w]:
            _topSort(w, E, visited, sorted, sortedIndices)
        else:
            if not sorted[w]:
                print "No sorting possible!"
                sys.exit(0)
    sorted[v] = 1
    sortedIndices.insert(0, v)


def topSort(V, E):
    "Top-level sorting function."

    n = len(V)
    visited = [0] * n   # Flags for (un-)visited elements.
    sorted = [0] * n    # Flags for (un-)sorted elements.
    sortedIndices = []  # The list of sorted element indices.

    for v in range(n):
        if not visited[v]:
            _topSort(v, E, visited, sorted, sortedIndices)

    # Build and return a list of elements from the sort indices.
    sortedElements = []
    for i in sortedIndices:
        sortedElements.append(V[i])
    return sortedElements


def wrap(pairs):
    """Convert an element pairs list into (verticesList, edgeIndexList).

    This might be useful for those who prefer topsort(edgeList)
    instead of topsort(verticesList, edgeIndexList)...
    E.g. wrap( [('a','b'), ('b','c'), ('c','a')] )
         -> (['a','b','c'], [(0,1), (1,2), (2,0)])
    """

    e = []
    v = []

    # Make a list of unique vertices.
    for x, y in pairs:
        if x not in v:
            v.append(x)
        if y not in v:
            v.append(y)

    # Convert original element pairs into index pairs.
    for x, y in pairs:
        e.append((v.index(x), v.index(y)))

    return v, e


def testCase(V, E):
    print V
    print E
    print topSort(V, E)
    print


def test():
    # Test wrapper.
    # E = [('a','b'), ('b','c'), ('c','a')]
    # print E
    # print wrap(E)
    # print

    # No error to be expected.
    V = ['a', 'b', 'c', 'd', 'e', 'f']
    E = [(0,1), (1,2), (1,3), (3,4), (4,5), (3,2)]
    testCase(V, E)

    # Still no error, but an isolated element 'g' should appear.
    V = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
    E = [(0,1), (1,2), (1,3), (3,4), (4,5), (3,2)]
    testCase(V, E)

    # Error because of cycle in graph.
    V = ['a', 'b', 'c']
    E = [(0,1), (1,2), (2,0)]
    testCase(V, E)


if __name__ == '__main__':
    test()


Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.




More information about the Python-list mailing list