2.7. Network Flow


What is network flow?

The network flow problem can be phrased in several ways, the canonical one being: Given a computer network with point to point links of certain capacities, what is the greatest rate at one given machine can send data to another given machine? The problem is more complex that it first sounds because data flows can take several different paths and merge and split.

The problem can be represented in graph theory, with computers corresponding to nodes and the network links to edges. Every edge has a capacity (from the question) and a directed flow (which is part of the answer). The flows may not exceed the corresponding capacities and the inflow and outflow must be equal for every node other than the two nodes in question.

The nodes in question are known as the source and the sink. An edge whose flow equals its capacity is said to be saturated. An important point about netflow flow that is initially not easy to grasp is that because flow is directed, an edge may be saturated when considered in one direction but not the other.


Solution to network flow problem

The standard method for solving the network flow problem is known as the Ford-Fulkerson method, which is basically the following:

  1. Start with no flow along any edge
  2. Find a path from source to sink that doesn't contain any saturated edges in the direction from source to sink
  3. Increase the flow of every edge in this path by the same amount. The amount is the largest amount that will not cause one of the capacities to be exceeded.
  4. Go back to step 2 and repeat. When no such path exists, the flow will be maximised. The proof of this is beyond the scope of this page.


Which path?

You may have noticed that the Ford-Fulkerson method is not really an algorithm, because it does not specify how to find the path. Research has found two methods that run quickly in most networks and reasonably fast even in extreme cases. The first is to take the shortest path and the second is to take the path along which the flow could be increased the most. It might seem tempting to use the longest path (since this would seem to "fill up" the network quickly) but in fact there are some networks where this can take arbitrarily long, apart from being difficult to find. Note that the choice of path finding algorithm only affects the speed of the algorithm, not the amount of the final flow.


Directed or undirected?

You might be wondering whether the edges are themselves directed. The answer is: they can be either directed or undirected. It only affects the algorithm in terms of the idea of a saturated edge: an edge is saturated in a particular direction if the flow cannot be increased in that direction, so a directed edge with no flow is actually saturated in the reverse direction.


Node capacities

So far we have only considered capacities in the edges; what about limiting the amount of data that can flow through a particular node? The Ford-Fulkerson method can't be applied directly to this problem, but fortunately it is possible to think of the problem in a different way that allows it to be applied.

Firstly, all undirected edges need to be split into a pair of directed edges, one in each direction. At the end you may finish with flow in both directions, but this can be safely cancelled out.

Using the computer analogy, imagine that all the input data comes in at one point, the output data leaves at another point, and that it must travel across an internal bus between them. The capacity of the node is the capacity of this bus, so just introduce it into the network as another link. The node representing the computer is split into two nodes, one representing the input port and the other the output port, and the bus is the edge between them.

Once the other edges have been fixed up to point to the correct nodes based on their direction, we have an equivalent problem but with node capacities eliminated, which means that we can apply Ford-Fulkerson as usual.


Applications

There are many problems that can be solved with network flow that do not look like network flow. One example comes from a USA Computer Olympiad from several years ago: summarised, what is the smallest number of nodes that must fail in a network to possibly prevent two given computers to be able to communicate (the given nodes may not fail)?

The idea is that if there are multiple independent paths between the two, then one node from every path must fail. But network flow can help us find a maximal set of independent paths (maximal in that there are as many as possible): give every node and edge a capacity of 1 and solve the network flow problem. The capacities guarantee that paths cannot merge at any point.

Now how do we pick the nodes to eliminate? One way is just to try picking nodes until we find one whose failure would reduce the maximum number of independent paths in the network. Unfortunately the proof that such a node exists is beyond the scope of this page. A useful point about Ford-Fulkerson is that it will also work if the initial network is valid but not necessarily empty. Thus one can remove a vertex and reset the flow on the path through that vertex and start from there, rather than starting from an empty network and rebuilding all the paths you originally had.

A similar question would be to find how many links need to fail. This can be solved in the same way, but without the limitation on node capacities. Again, the proof that the number of edges that need to be deleted is equal to the maximum flow is beyond the scope of this page.


[Prev] [Next] [Up]
Last updated Sat Apr 14 12:22:50 2001