[SciPy-Dev] Proposed addition to scipy.sparse.csgraph.maximum_flow

mhwende mhwende at protonmail.com
Thu Jan 2 09:25:35 EST 2020


Hi,

Recently, I made use of the maximum flow algorithm for CSR matrices,
i.e., the scipy.sparse.csgraph.maximum_flow function.

My test case was a non-sparse, bipartite network with many connections
between the two node sets. The value of the maximum flow was small
compared to the total number of nodes and edges in the graph.
I wanted to use the scipy.sparse.csgraph module, but with a depth-first
search (DFS) instead of the breadth-first search (BFS) that you have
implemented. Therefore, I added an option to use DFS instead of BFS in
your implementation. When measuring the execution time for a random
graph, the DFS procedure showed a significant speedup over the BFS
variant.
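
To illustrate the idea (this is only a pure-Python sketch, not the
Cython code in scipy.sparse.csgraph): BFS always finds a shortest
augmenting path, while DFS simply returns the first path it reaches,
which on a shallow, dense graph with a small maximum flow value can
touch far fewer edges. Here, residual is a hypothetical dict-of-dicts
residual graph mapping each node to its neighbors and their remaining
capacities:

def find_augmenting_path_dfs(residual, source, sink):
    # Iterative DFS for an augmenting path in a residual graph.
    # Returns a list of nodes from source to sink, or None if the
    # sink is unreachable. A BFS variant would use a FIFO queue
    # (collections.deque) instead of this LIFO stack.
    stack = [source]
    predecessor = {source: None}
    while stack:
        u = stack.pop()
        if u == sink:
            path = []
            while u is not None:  # Walk predecessors back to source.
                path.append(u)
                u = predecessor[u]
            return path[::-1]
        for v, capacity in residual[u].items():
            if capacity > 0 and v not in predecessor:
                predecessor[v] = u
                stack.append(v)
    return None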

During these tests, I also refactored the two other functions that are
called within the main csgraph.maximum_flow function, i.e.,
_add_reverse_edges and _make_edge_pointers. Depending on the input
graph, I found that the time spent in these functions can be larger
than the execution time of the actual Edmonds-Karp algorithm.
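
For context, _add_reverse_edges has to make sure that for every edge
(u, v) the reverse edge (v, u) is stored as an explicit (possibly
zero-capacity) entry, so that residual capacities can later be updated
in place. A rough sketch of that effect (not of the actual in-place
CSR routine) is:

import numpy as np
import scipy.sparse as sp

def add_reverse_edges_sketch(a):
    # Pair every edge (u, v, c) with a zero-capacity reverse edge
    # (v, u, 0); building a CSR matrix sums duplicate entries, so
    # existing reverse edges keep their capacity while missing ones
    # become explicit zeros.
    coo = a.tocoo()
    rows = np.concatenate([coo.row, coo.col])
    cols = np.concatenate([coo.col, coo.row])
    data = np.concatenate([coo.data, np.zeros_like(coo.data)])
    return sp.csr_matrix((data, (rows, cols)), shape=a.shape)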

Below, you can find a table showing the measured execution times in seconds.
Within the columns of this table, the execution time is broken down into parts
as follows:

a) Adding reverse edges (function _add_reverse_edges).
b) Making edge pointers (function _make_edge_pointers).
c) Actual max flow computation (function _edmonds_karp).
d) Total execution time (function maximum_flow).

Rows of the table refer to the changes made to the implementation;
each measurement was performed with either BFS or DFS for the
construction of augmenting paths:

1) Original implementation.
2) After the reimplementation of _make_edge_pointers.
3) After the reimplementation of _add_reverse_edges.

Here are the execution times:

        |     a)    |     b)    |     c)    |     d)
--------+-----------+-----------+-----------+-----------
1. BFS  |   1.29 s  |   3.81 s  |   8.93 s  |  14.08 s
1. DFS  |   1.29 s  |   3.82 s  |   0.50 s  |   5.66 s
--------+-----------+-----------+-----------+-----------
2. BFS  |   1.54 s  |   0.22 s  |   8.94 s  |  10.76 s
2. DFS  |   1.33 s  |   0.22 s  |   0.50 s  |   2.10 s
--------+-----------+-----------+-----------+-----------
3. BFS  |   0.16 s  |   0.27 s  |   9.01 s  |   9.50 s
3. DFS  |   0.16 s  |   0.23 s  |   0.52 s  |   0.96 s

Comparing the total execution times for BFS in 1d) and for DFS in 3d)
(14.08 s vs. 0.96 s), my code changes have led to a speedup of about 14
for the network example mentioned above.
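
For reference, _make_edge_pointers computes, for each edge (u, v) in
the CSR arrays, the index of its reverse edge (v, u), so that the pair
of residual capacities can be updated in constant time whenever flow is
pushed along an edge. A NumPy sketch of the quantity it computes (not
of either implementation) could look like this, assuming every edge
already has a reverse edge as guaranteed by _add_reverse_edges:

import numpy as np

def make_edge_pointers_sketch(indices, indptr):
    # Recover each edge's row from indptr, then order the edges by
    # (column, row): since the sparsity pattern is symmetric, the
    # k-th edge in that order is exactly the reverse of the k-th
    # edge in CSR (row, column) order.
    n = len(indptr) - 1
    rows = np.repeat(np.arange(n), np.diff(indptr))
    return np.lexsort((rows, indices))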

Please let me know if you would like to incorporate some or all of my
changes into your code base.
My changes (a single commit) can be found here:
https://github.com/mhwende/scipy/commit/e7be2dc29364fe95f6fae20a4c0775716cdba5e5

Finally, here is the script which I have used for testing.
The above timings were produced with the command:

python scipymaxflowtest.py 2000 2000 --dfs=0

(and with --dfs=1 for the DFS rows). This creates a graph with
4002 nodes (a source, a sink, and two layers of 2000 nodes) and
4,004,000 edges (2000 source edges, 2000 * 2000 layer edges, and
2000 sink edges), with randomly chosen zero or unit capacities.
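
For a quick sanity check, small layer sizes make the script print the
dense adjacency matrix (it does so for graphs with at most 12 nodes),
e.g.:

python scipymaxflowtest.py 2 2 --dfs=1

which builds a graph with 6 nodes and 8 edges.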

# File scipymaxflowtest.py

import time
import argparse
import numpy as np
import scipy.sparse as sp


def scipymaxflowtest(layer_sizes, dfs):
  layer_sizes = np.asarray(layer_sizes, dtype=int)
  num_layers = len(layer_sizes)
  print("Layer sizes:", *layer_sizes)
  print("Number of layers:", num_layers)
  # Two extra nodes for the source (index 0) and the sink (index 1).
  num_nodes = 2 + np.sum(layer_sizes)
  print("Number of nodes:", num_nodes)
  # Edges: source -> first layer, all-to-all between consecutive
  # layers, and last layer -> sink.
  num_edges = layer_sizes[0] + layer_sizes[-1]
  for i in range(layer_sizes.size - 1):
    num_edges += layer_sizes[i] * layer_sizes[i + 1]
  print("Number of edges:", num_edges)
  # Random zero or unit capacities (use np.ones for all-unit capacities).
  data = np.random.randint(0, 2, num_edges)
  indices = np.zeros(num_edges, dtype=int)
  indptr = np.zeros(num_nodes + 1, dtype=int)
  ptr = 0
  # Edges from the source (node 0) to every node in the first layer.
  for j in range(layer_sizes[0]):
    indices[ptr] = 2 + j
    ptr += 1
  indptr[1] = ptr  # End of the source row.
  indptr[2] = ptr  # The sink (node 1) has no outgoing edges.
  layer_sum = 0
  # All-to-all edges from each layer to the next one.
  for k in range(num_layers - 1):
    next_layer_sum = layer_sum + layer_sizes[k]
    for i in range(layer_sizes[k]):
      for j in range(layer_sizes[k + 1]):
        indices[ptr] = 2 + next_layer_sum + j
        ptr += 1
      indptr[2 + layer_sum + i + 1] = ptr
    layer_sum = next_layer_sum
  # Edges from every node in the last layer to the sink (node 1).
  for i in range(layer_sizes[-1]):
    indices[ptr] = 1
    ptr += 1
    indptr[2 + layer_sum + i + 1] = ptr
  indptr[num_nodes] = ptr

  # Check that we set as many entries in the index array as there are edges.
  assert ptr == num_edges

  # Create CSR matrix from data and index arrays.
  a = sp.csr_matrix((data, indices, indptr), shape=(num_nodes, num_nodes))
  if num_nodes <= 12:
    print("A =\n{}".format(a.toarray()))

  t0 = time.time()
  # The dfs argument only exists in my patched version of SciPy.
  flow_result = sp.csgraph.maximum_flow(a, 0, 1, dfs)
  t1 = time.time()
  flow_value = flow_result.flow_value
  print("Maximum flow: {} (took {:.6f} s)".format(flow_value, t1 - t0))
  return flow_value


if __name__ == "__main__":
  parser = argparse.ArgumentParser()
  parser.add_argument(
      "layer_sizes", type=int, nargs="+",
      help="Number of nodes on the individual graph layers.")
  parser.add_argument(
      "--dfs", type=int, default=0,
      help="Whether to use depth-first search instead of breadth-first.")
  args = parser.parse_args()
  np.random.seed(4891243)
  scipymaxflowtest(args.layer_sizes, args.dfs)


