# Thursday, September 6, 2012 Introduction to the course

Lecture notes for the first COMP 360 class, taught by Hamed Hatami. These lecture notes are student-generated and any errors or omissions should be assumed to be the fault of the notetaker and not of the lecturer. To correct an error, you have to be registered and logged-in; alternatively, you can contact @dellsystem directly.

## 1Overview of the course¶

The required textbook is Algorithm Design by Jon Kleinberg and Éva Tardos (likely the same as the one used in COMP 251).

In COMP 250 and COMP 251, we mainly covered basic algorithms: sorting, searching, graph traversal, and the like. In this course, we will cover more advanced algorithms, particularly optimisation algorithms. An approximate overview of the topics that will be covered is:

• Network flow
• NP-completeness
• Linear programming
• Approximation algorithms
• Randomisation algorithms

## 2Network flow¶

### 2.1The max-flow problem¶

The first topic to be covered is network flow, specifically, the max-flow problem. We define a flow network as a directed graph, $G = (V, E)$ (i.e. a set of vertices and edges), such that:

1. Every edge $e$ has a capacity $c_e$.
2. There is a source $s \in V$ from which the flow starts.
3. There is a sink $t \in V$ where the flow ends.

Some assumptions:

• The capacities are all natural numbers (positive integers).
• The source has no incoming edges, and at least one outgoing edge.
• The sink has no outgoing edges, and at least one incoming edge.
• Every other node is adjacent to (connected with) at least one incoming and one outgoing edge.

We define an s-t flow as a function, $f: E \to \mathbb{Z}^+ \cup \{0\}$, which assigns a non-negative integer to every edge (which is the flow through that edge). This function must satisfy the following conditions:

1. Capacity condition. The flow through any edge must not be greater than the capacity of that edge. Mathematically: $\forall e \in E, f(e) \leq c_e$.
2. Conservation condition. For all nodes other than the source and the sink, the flow through all incoming edges should be equal to the flow through all outgoing edges. Mathematically: $\forall v \in V / \{s, t\}$, $\displaystyle\sum_{\text{incoming edges}} f(e) = \sum_{\text{outgoing edges}} f(e)$.

Next, we define a $v(f)$: the total value of a flow $f$ leaving the source. This is the particular quantity that we wish to maximise. Specifically, the problem is: given a flow network $(G, \{c_e\}, s, t)$, what is the maximum $v(f)$ among all valid flows $f$?

### 2.2Ford-Fulkerson¶

One algorithm for finding the maximum flow in a flow network is known as Ford-Fulkerson. The basic process is:

1. Start with the flow for every edge being set to 0.
2. At every step, push more flow through until we can add no more.

(A better, more detailed explanation of the process can be found below.)

#### 2.2.1Definitions¶

At this point, we need to define the concept of a residual graph. Given $(G, \{c_e\}, s, t)$, and an $s-t$ flow $f$, the residual graph $G_f$ has the same vertices as $G$, and its edges are specified as follows:

1. If $f(e) < c_e$, then $G_f$ has the same edge $e$, but with a capacity of $c_e - f(e)$. In other words, if the flow in the original graph has not reached capacity, the flow in the residual graph is the remaining capacity.
2. If $f(e) > 0$, then $G_f$ has an edge in the opposite direction (possibly in addition to an edge in the same direction) with capacity $f(e)$.

Here's an example, taken from the lecture1:

In other words, for any edge, its equivalent edge in the same direction in the residual graph has a capacity of whatever remains of the original edge's capacity (that has not been "filled" with flow). It also has an edge in the opposite direction with capacity being whatever the flow of the original edge was.

Now we define an augmented path $P$, which is a simple path2 in a residual graph $G_f$ that starts at the source and ends at the sink. We define $\text{Bottleneck}(P, f)$3 to be the minimum residual capacity of any edge in $P$. So the bottleneck (or cut) is the maximum amount of flow that can be pushed through $P$.

#### 2.2.2Algorithm steps¶

Now we can redefine the steps of this algorithm in terms of the residual graph and augmented paths:

1. Set $f(e)$ to 0 for every edge in $G$.
2. While there is an $s-t$ path in the residual graph $G_f$:
• $f' = \text{augment}(f,P)$ (i.e. increase the flow by $\text{Bottleneck}(P, f)$ through $P$)
• Update $f$ to $f'$
• Reconstruct the residual graph.

#### 2.2.3Analysis of the algorithm¶

##### 2.2.3.1Correctness¶

The augmented flow satisfies the two conditions of a valid flow - the capacity constraint, and conservation - as follows:

• Capacity: the residual capacities are always less than the original capacities.
• Conservation: there are three possible cases, and in each case the conservation condition is preserved.
• -- a --> V -- a -->
• -- a --> V <- a ---
• <- a --- V -- a -->
##### 2.2.3.2Termination¶

A valid flow cannot be larger than the total capacity of the edges leaving the source. Every time we agument the flow, we increase the value by at least 1. So, there are at most $C$ iterations where $C$ is the sum of the capacities of the edges leaving the source.

##### 2.2.3.3Running time¶

Every iteration is $O(m) + O(m) = O(m)$. The total running time is $O(mC)$ where $m$ is the number of edges in the graph.

Is this, then, a polynomial-time algorithm? No, it is not. If you make bad initial choices when augmenting the flow, it could take much longer to complete the algorithm.4

#### 2.2.4Implementation in Python¶

class Edge:
def __init__(self, u, v, c):
self.source = u
self.sink = v
self.capacity = c
# flow only determined when working out the Ford-Fulkerson algorithm

class FlowGraph:
def __init__(self):
self.neigb = {} # list of nodes and the edges that connect the said node to its neighbours
self.flow= {} # list of flow in edges

self.neigb[v] = [] # initiate each node with a list of edge coming from this node

def get_edges(self, v):
return self.neigb[v]

if u == v:
raise ValueError("u==v")

# directly add the edges of the residual graph
edge = Edge(u, v, w)
redge = Edge(v, u, 0)
edge.redge = redge
redge.redge = edge
self.neigb[u].append(edge)
self.neigb[v].append(redge)
self.flow[edge] = 0
self.flow[redge] = 0

def find_path(self, source, sink, path):
if source == sink:
return path

for edge in self.get_edges(source):
residual = edge.capacity - self.flow[edge]

# path is a list of set (edge, residual capacity)
if residual > 0 and not (edge, residual) in path:
result = self.find_path(edge.sink, sink, path + [(edge, residual)])

if result is not None:
return result

def max_flow(self, source, sink):
path = self.find_path(source, sink, [])

while path is not None:
# bottleneck is calculated as the minimal flow in the current augmented path
flow = min(res for edge, res in path)

for edge, res in path:
self.flow[edge] += flow
self.flow[edge.redge] -= flow

path = self.find_path(source, sink, [])

return sum(self.flow[edge] for edge in self.get_edges(source))

# Remark: instead of creating residual graph, each node has residual edges
# residual edge is edge of of another node

# Algorithm test
g = FlowGraph()

4. The professor also mentioned something about $\lceil \log_2 n \rceil$ bits being needed to store an integer $n$, but that's a fairly trivial comment valid for many algorithms and doesn't really shed additional light on Ford-Fulkerson in particular. I could have misunderstood what he was saying, though.