Divide and conquer algorithms often get little attention in introductory programming material, but they're something every programmer should know. They underpin many approaches to concurrency and multi-threading.

Often I'll hear about how you can optimise a for loop to be faster or how switch statements are slightly faster than if statements. Most computers have more than one core, with the ability to support multiple threads. Before worrying about optimising for loops or if statements, try to attack your problem from a different angle.

Divide and Conquer is one of the ways to attack a problem from a different angle. Throughout this article, I'm going to talk about what divide and conquer is and how to create divide and conquer solutions. Don't worry if you have **zero** experience or knowledge of the topic. This article is designed to be read by someone with very little programming knowledge...

Dijkstra's algorithm finds the shortest path between two nodes in a graph. It's a must-know for any programmer. Its <a href="https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm" target="_blank">Wikipedia page</a> has nice GIFs and some history.

<em>Photo by Ishan @seefromthesky on Unsplash</em>

In this post I'll use the time-tested implementation from Rosetta Code, changed just a bit so that it can process both weighted and unweighted graph data. We'll also be able to edit the graph on the fly. I'll explain the code block by block.

The algorithm is pretty simple. Dijkstra created it in 20 minutes; now you can learn to code it in about the same time.

1. Mark all nodes unvisited and store them.
2. Set the distance to zero for our initial node and to infinity for other nodes.
3. Select the unvisited node with the smallest distance; it becomes the current node.
4. Find the unvisited neighbors of the current node and calculate their distances through the current node. Compare the newly calculated distance to the assigned one and save the smaller one.
   *For example, if the node A has a distance of 6, and the A-B edge has length 2, then the distance to B through A will be 6 + 2 = 8. If B was previously marked with a distance greater than 8 then change it to 8.*
5. Mark the current node as visited and remove it from the unvisited set.
6. Stop if the destination node has been visited (when planning a route between two specific nodes) or if the smallest distance among the unvisited nodes is infinity. If not, repeat steps 3-6.

First, imports and data formats. The original implementation suggests using a namedtuple for storing edge data. We'll do exactly that, but we'll add a default value to the cost argument. There are many ways to do that; find what suits you best.

```
from collections import deque, namedtuple

# we'll use infinity as a default distance to nodes.
inf = float('inf')
Edge = namedtuple('Edge', 'start, end, cost')


def make_edge(start, end, cost=1):
    return Edge(start, end, cost)
```
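As a minimal sketch of one of those other ways: since Python 3.7, `namedtuple` accepts a `defaults` argument, so the default cost can live on the tuple itself instead of in a helper like `make_edge()`. The name `EdgeWithDefault` is made up for this illustration and is not part of the original implementation.

```python
from collections import namedtuple

# Hypothetical alternative to make_edge(): `defaults` applies to the
# rightmost fields, so only `cost` gets a default value here.
EdgeWithDefault = namedtuple(
    'EdgeWithDefault', 'start, end, cost', defaults=[1])

unweighted = EdgeWithDefault('a', 'b')   # cost falls back to 1
weighted = EdgeWithDefault('a', 'c', 9)  # explicit cost
print(unweighted)
print(weighted)
```

The rest of the post keeps using `make_edge()`, which also works on older Python versions.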

Let's initialize our data:

```
class Graph:
    def __init__(self, edges):
        # let's check that the data is right
        wrong_edges = [i for i in edges if len(i) not in [2, 3]]
        if wrong_edges:
            raise ValueError('Wrong edges data: {}'.format(wrong_edges))

        self.edges = [make_edge(*edge) for edge in edges]
```

Let's find the vertices. In the original implementation the vertices are defined in `__init__`, but we need them to update when edges change, so we'll make them a property that is recomputed each time it is accessed. That's probably not the best solution for big graphs, but it's fine for small ones.

```
    @property
    def vertices(self):
        return set(
            # this piece of magic turns ([1,2], [3,4]) into [1, 2, 3, 4]
            # the set above makes its elements unique.
            sum(
                ([edge.start, edge.end] for edge in self.edges), []
            )
        )
```
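The `sum(..., [])` trick builds a new list on every addition, which gets slow as the edge list grows. As a sketch of one alternative (not what the original implementation does), `itertools.chain.from_iterable` flattens the pairs in a single pass. The standalone `vertices_of` helper and the sample edges are assumptions made for this illustration.

```python
from collections import namedtuple
from itertools import chain

Edge = namedtuple('Edge', 'start, end, cost')


def vertices_of(edges):
    # Hypothetical standalone version of the `vertices` property:
    # chain.from_iterable walks every (start, end) pair once, and set()
    # removes the duplicates.
    return set(chain.from_iterable((e.start, e.end) for e in edges))


sample_edges = [Edge('a', 'b', 7), Edge('b', 'c', 10), Edge('a', 'c', 9)]
print(sorted(vertices_of(sample_edges)))
```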

Now, let's add the ability to add and remove edges.

```
    def get_node_pairs(self, n1, n2, both_ends=True):
        if both_ends:
            node_pairs = [[n1, n2], [n2, n1]]
        else:
            node_pairs = [[n1, n2]]
        return node_pairs

    def remove_edge(self, n1, n2, both_ends=True):
        node_pairs = self.get_node_pairs(n1, n2, both_ends)
        # iterate over a copy, because we remove from self.edges as we go
        edges = self.edges[:]
        for edge in edges:
            if [edge.start, edge.end] in node_pairs:
                self.edges.remove(edge)

    def add_edge(self, n1, n2, cost=1, both_ends=True):
        node_pairs = self.get_node_pairs(n1, n2, both_ends)
        for edge in self.edges:
            if [edge.start, edge.end] in node_pairs:
                # raise (not return) so the caller cannot miss the error
                raise ValueError('Edge {} {} already exists'.format(n1, n2))

        self.edges.append(Edge(start=n1, end=n2, cost=cost))
        if both_ends:
            self.edges.append(Edge(start=n2, end=n1, cost=cost))
```

Let's find neighbors for every node:

```
    @property
    def neighbours(self):
        neighbours = {vertex: set() for vertex in self.vertices}
        for edge in self.edges:
            neighbours[edge.start].add((edge.end, edge.cost))

        return neighbours
```

It's time for the algorithm! I renamed the variables so it would be easier to understand.

```
    def dijkstra(self, source, dest):
        assert source in self.vertices, 'Such source node doesn\'t exist'

        # 1. Mark all nodes unvisited and store them.
        # 2. Set the distance to zero for our initial node
        # and to infinity for other nodes.
        distances = {vertex: inf for vertex in self.vertices}
        previous_vertices = {
            vertex: None for vertex in self.vertices
        }
        distances[source] = 0
        vertices = self.vertices.copy()

        while vertices:
            # 3. Select the unvisited node with the smallest distance,
            # it becomes the current node.
            current_vertex = min(
                vertices, key=lambda vertex: distances[vertex])

            # 6. Stop, if the smallest distance
            # among the unvisited nodes is infinity.
            if distances[current_vertex] == inf:
                break

            # 4. Find unvisited neighbors for the current node
            # and calculate their distances through the current node.
            for neighbour, cost in self.neighbours[current_vertex]:
                alternative_route = distances[current_vertex] + cost

                # Compare the newly calculated distance to the assigned
                # and save the smaller one.
                if alternative_route < distances[neighbour]:
                    distances[neighbour] = alternative_route
                    previous_vertices[neighbour] = current_vertex

            # 5. Mark the current node as visited
            # and remove it from the unvisited set.
            vertices.remove(current_vertex)

        path, current_vertex = deque(), dest
        while previous_vertices[current_vertex] is not None:
            path.appendleft(current_vertex)
            current_vertex = previous_vertices[current_vertex]
        if path:
            path.appendleft(current_vertex)
        return path
```
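The `min()` call above scans every unvisited vertex on each iteration. For larger graphs, a common variation keeps the candidates in a binary heap instead. The sketch below is a standalone function rather than a method of the `Graph` class; the name `shortest_path` and the adjacency-dict input format are assumptions chosen for this illustration, and it assumes non-negative edge costs.

```python
import heapq


def shortest_path(neighbours, source, dest):
    """Return (distance, path), or (inf, []) if dest is unreachable.

    `neighbours` maps each vertex to an iterable of (neighbour, cost) pairs.
    """
    distances = {source: 0}
    previous = {}
    visited = set()
    heap = [(0, source)]
    while heap:
        dist, vertex = heapq.heappop(heap)
        if vertex in visited:
            continue  # stale heap entry: this vertex is already finalized
        visited.add(vertex)
        if vertex == dest:
            break
        for neighbour, cost in neighbours.get(vertex, ()):
            candidate = dist + cost
            if candidate < distances.get(neighbour, float('inf')):
                distances[neighbour] = candidate
                previous[neighbour] = vertex
                heapq.heappush(heap, (candidate, neighbour))
    if dest not in distances:
        return float('inf'), []
    # Walk the predecessor links back from dest to source.
    path = [dest]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return distances[dest], path


example = {
    'a': [('b', 7), ('c', 9), ('f', 14)],
    'b': [('c', 10), ('d', 15)],
    'c': [('d', 11), ('f', 2)],
    'd': [('e', 6)],
    'e': [('f', 9)],
}
print(shortest_path(example, 'a', 'e'))  # (26, ['a', 'c', 'd', 'e'])
```

Instead of updating a key in place, this version pushes a fresh entry whenever a shorter distance is found and skips outdated entries when they are popped.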

Let's use it.

```
graph = Graph([
    ("a", "b", 7), ("a", "c", 9), ("a", "f", 14), ("b", "c", 10),
    ("b", "d", 15), ("c", "d", 11), ("c", "f", 2), ("d", "e", 6),
    ("e", "f", 9)])

print(graph.dijkstra("a", "e"))
# prints: deque(['a', 'c', 'd', 'e'])
```

The whole code from above:
```
from collections import deque, namedtuple

# we'll use infinity as a default distance to nodes.
inf = float('inf')
Edge = namedtuple('Edge', 'start, end, cost')


def make_edge(start, end, cost=1):
    return Edge(start, end, cost)


class Graph:
    def __init__(self, edges):
        # let's check that the data is right
        wrong_edges = [i for i in edges if len(i) not in [2, 3]]
        if wrong_edges:
            raise ValueError('Wrong edges data: {}'.format(wrong_edges))

        self.edges = [make_edge(*edge) for edge in edges]

    @property
    def vertices(self):
        return set(
            sum(
                ([edge.start, edge.end] for edge in self.edges), []
            )
        )

    def get_node_pairs(self, n1, n2, both_ends=True):
        if both_ends:
            node_pairs = [[n1, n2], [n2, n1]]
        else:
            node_pairs = [[n1, n2]]
        return node_pairs

    def remove_edge(self, n1, n2, both_ends=True):
        node_pairs = self.get_node_pairs(n1, n2, both_ends)
        edges = self.edges[:]
        for edge in edges:
            if [edge.start, edge.end] in node_pairs:
                self.edges.remove(edge)

    def add_edge(self, n1, n2, cost=1, both_ends=True):
        node_pairs = self.get_node_pairs(n1, n2, both_ends)
        for edge in self.edges:
            if [edge.start, edge.end] in node_pairs:
                # raise (not return) so the caller cannot miss the error
                raise ValueError('Edge {} {} already exists'.format(n1, n2))

        self.edges.append(Edge(start=n1, end=n2, cost=cost))
        if both_ends:
            self.edges.append(Edge(start=n2, end=n1, cost=cost))

    @property
    def neighbours(self):
        neighbours = {vertex: set() for vertex in self.vertices}
        for edge in self.edges:
            neighbours[edge.start].add((edge.end, edge.cost))

        return neighbours

    def dijkstra(self, source, dest):
        assert source in self.vertices, 'Such source node doesn\'t exist'
        distances = {vertex: inf for vertex in self.vertices}
        previous_vertices = {
            vertex: None for vertex in self.vertices
        }
        distances[source] = 0
        vertices = self.vertices.copy()

        while vertices:
            current_vertex = min(
                vertices, key=lambda vertex: distances[vertex])
            vertices.remove(current_vertex)
            if distances[current_vertex] == inf:
                break
            for neighbour, cost in self.neighbours[current_vertex]:
                alternative_route = distances[current_vertex] + cost
                if alternative_route < distances[neighbour]:
                    distances[neighbour] = alternative_route
                    previous_vertices[neighbour] = current_vertex

        path, current_vertex = deque(), dest
        while previous_vertices[current_vertex] is not None:
            path.appendleft(current_vertex)
            current_vertex = previous_vertices[current_vertex]
        if path:
            path.appendleft(current_vertex)
        return path


graph = Graph([
    ("a", "b", 7), ("a", "c", 9), ("a", "f", 14), ("b", "c", 10),
    ("b", "d", 15), ("c", "d", 11), ("c", "f", 2), ("d", "e", 6),
    ("e", "f", 9)])

print(graph.dijkstra("a", "e"))
```

P.S. For those of us who, like me, read more books about the Witcher than about algorithms, it's Edsger Dijkstra, not Sigismund.

<strong>Originally published by </strong><a href="https://medium.com/@diogoribeiro_94486" target="_blank">Diogo Ribeiro</a> <em>at </em><a href="https://medium.com/@diogoribeiro_94486/a-review-of-basic-algorithms-and-data-structures-in-python-graph-algorithms-d73691d86211" target="_blank"><em>Medium</em></a>

Recently, while reviewing basic graph algorithms, I decided to write down my study notes as an article in case someone else finds them useful. To verify my understanding, I wrote minimal implementations of the algorithms in Python which make up the bulk of this article. Simple unit tests accompany the code. The unit tests can also be used as examples of using the code.

I’m hoping to write at least a few follow-up posts, focusing on combinatorial algorithms, string algorithms, and maybe even one on computational geometry.

Most of the code was written to be easy to understand without having to reference much else (with a few exceptions; for example, Kruskal’s algorithm uses the disjoint set structure defined in another section). This results in some duplication, especially in the unit tests. I consider this acceptable, given that the code is meant as educational material and not as production code that needs day-to-day maintenance.

One last thing before we start: I wrote the article and all the code relatively quickly. Mistakes and bugs are definitely possible. Corrections are appreciated; please comment below if you find any.

Algorithms and data structures in this article:

- Disjoint Set (Union-Find)
- Kruskal’s Minimum Spanning Tree (MST)
- Depth First Search (DFS)
- Breadth First Search (BFS)
- Kahn’s Topological Sort Algorithm
- Dijkstra’s Shortest Path Algorithm
- Bellman-Ford Shortest Path Algorithm

The disjoint set structure is used to keep track of a partitioning of a set of objects into subsets. The main question it needs to answer is “do X and Y belong to the same subset?” and the main operation it needs to support is joining two subsets so that elements in either of the subsets will belong to the same larger subset afterward.

A quick, minimal implementation is provided below. It uses a forest to keep track of the subsets in the partition. Each tree in the forest is one subset, and the root of the tree is the “representative” element of the subset. To check if two elements belong to the same subset, we check if they have the same representative element.

Noting that the ideal tree in this implementation is a star (this minimizes the number of recursive `find` calls), we "compress" the paths on each call to `find`. That is, as the recursive calls unwind, we set the parent of every element on the path to the representative.

```
class DisjointSet(object):
    def __init__(self, n):
        """
        Initializes a disjoint set structure consisting of n disjoint sets.
        """
        self.parent = list(range(n))

    def find(self, x):
        """Returns the representative element of the set x belongs to."""
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        """Joins the sets containing x and y."""
        self.parent[self.find(x)] = self.find(y)
```
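To see how the structure gets used, here is a small sketch that counts the connected components of an undirected graph. The `count_components` helper is hypothetical (it is not part of the original code), and the class is repeated in compact form only so that the snippet runs on its own.

```python
class DisjointSet(object):
    # Same structure as above, repeated so this snippet is self-contained.
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)


def count_components(n, edges):
    # Hypothetical helper: join the endpoints of every edge, then count the
    # distinct representatives that remain.
    d = DisjointSet(n)
    for x, y in edges:
        d.union(x, y)
    return len({d.find(v) for v in range(n)})


# Vertices 0-1-2 form one component and 3-4 another.
print(count_components(5, [(0, 1), (1, 2), (3, 4)]))  # 2
```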

And the accompanying unit test:

```
import unittest

from union_find import DisjointSet


class DisjointSetTest(unittest.TestCase):
    def test_initialized_state(self):
        d = DisjointSet(3)
        self.assertEqual(d.find(0), 0)
        self.assertEqual(d.find(1), 1)
        self.assertEqual(d.find(2), 2)

    def test_basic_union(self):
        d = DisjointSet(3)
        d.union(0, 1)
        self.assertEqual(d.find(0), d.find(1))
        self.assertNotEqual(d.find(1), d.find(2))

    def test_basic_union_idempotent(self):
        d = DisjointSet(2)
        d.union(0, 1)
        d.union(0, 1)
        self.assertEqual(d.find(0), d.find(1))

    def test_union_all(self):
        d = DisjointSet(100)
        for i in range(1, 100):
            d.union(i - 1, i)
        for i in range(1, 100):
            self.assertEqual(d.find(0), d.find(i))
```

Kruskal’s minimum spanning tree algorithm is a good example of a greedy algorithm. Starting with a forest consisting of individual disjoint vertices, at each step we pick the next best edge (one with minimal weight) provided it does not introduce a cycle into the forest, and continue until the forest becomes a tree. It’s rather easy to prove that the resulting tree is a minimum spanning tree.

Using the disjoint set structure shown above to keep track of the minimum spanning forest, the implementation below is very simple:

```
from collections import namedtuple

from union_find import DisjointSet

# Putting weight as the first element means Edges will sort by weight first,
# then source and target (lexicographically).
Edge = namedtuple('Edge', ['weight', 'source', 'target'])


def kruskal_mst(n, edges):
    """
    Given a positive integer n (number of vertices) and a collection of Edge
    namedtuple objects representing the undirected edges of a graph, returns a
    list of edges forming a minimal spanning tree of the graph. Assumes the
    vertices are numbers in the range 0 to n - 1. Also assumes input is a
    valid connected undirected graph and that for two vertices v and w only one
    of (v, w) or (w, v) is an edge in the input. Output is undefined if these
    assumptions are not satisfied.
    """
    d = DisjointSet(n)
    mst_tree = []
    for edge in sorted(edges):
        if d.find(edge.source) != d.find(edge.target):
            mst_tree.append(edge)
            if len(mst_tree) == n - 1:
                break
            d.union(edge.source, edge.target)
    return mst_tree
```

And the accompanying unit test:

```
import unittest

from kruskal import kruskal_mst, Edge


class KruskalMSPTest(unittest.TestCase):
    def test_single_vertex_graph(self):
        self.assertEqual(kruskal_mst(1, []), [])

    def test_single_edge_graph(self):
        edges = [Edge(source=0, target=1, weight=10)]
        self.assertEqual(kruskal_mst(2, edges), edges)

    def test_cycle_5(self):
        edges = [
            Edge(source=0, target=1, weight=50),
            Edge(source=1, target=2, weight=30),
            Edge(source=2, target=3, weight=60),
            Edge(source=3, target=4, weight=20),
            Edge(source=4, target=0, weight=10),
        ]
        # Everything except the heaviest edge. Output sorted by weight.
        self.assertEqual(kruskal_mst(5, edges), [
            Edge(source=4, target=0, weight=10),
            Edge(source=3, target=4, weight=20),
            Edge(source=1, target=2, weight=30),
            Edge(source=0, target=1, weight=50),
        ])

    def test_complete_graph_4(self):
        edges = [
            Edge(source=0, target=1, weight=10),
            Edge(source=0, target=2, weight=30),
            Edge(source=0, target=3, weight=40),
            Edge(source=1, target=2, weight=20),
            Edge(source=1, target=3, weight=50),
            Edge(source=2, target=3, weight=60),
        ]
        self.assertEqual(kruskal_mst(4, edges), [
            Edge(source=0, target=1, weight=10),
            Edge(source=1, target=2, weight=20),
            Edge(source=0, target=3, weight=40),
        ])
```

Depth-first search is arguably the simplest graph traversal algorithm. It’s a simple recursive algorithm that just needs to keep track of which vertices have already been processed. In fact, many other recursive algorithms can be thought of as a DFS on some underlying graph (e.g. binary search is guided DFS on the binary search tree). DFS can be used to determine if there is a path from a vertex to another and to visit every vertex starting from a source vertex. Variations of DFS can be used for determining connected components and doing topological sorting. The code below simply uses DFS to return all vertices reachable from a starting vertex.

```
def dfs(graph, source):
    """
    Given a directed graph (format described below), and a source vertex,
    returns a set of vertices reachable from source.

    The graph parameter is expected to be a dictionary mapping each vertex to a
    list of vertices indicating outgoing edges. For example if vertex v has
    outgoing edges to u and w we have graph[v] = [u, w].
    """
    visited = set()

    def _recurse(v):
        if v in visited:
            return
        visited.add(v)
        for w in graph[v]:
            _recurse(w)

    _recurse(source)
    return visited
```
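The recursive version can hit Python's recursion limit on graphs with very long paths. As a sketch of one way around that (an alternative, not the author's code), the same reachability search can be written with an explicit stack. The name `dfs_iterative` is made up here; the graph format is the one described in the docstring above.

```python
def dfs_iterative(graph, source):
    """Returns the set of vertices reachable from source, without recursion."""
    visited = set()
    stack = [source]
    while stack:
        v = stack.pop()
        if v in visited:
            continue
        visited.add(v)
        # Push the unvisited neighbours; they are handled last-in, first-out.
        stack.extend(w for w in graph[v] if w not in visited)
    return visited


# A path graph far longer than the default recursion limit (usually 1000).
long_path = {i: [i + 1] for i in range(5000)}
long_path[5000] = []
print(len(dfs_iterative(long_path, 0)))  # 5001
```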

And the accompanying unit test:

```
import unittest

from dfs import dfs


class DFSTest(unittest.TestCase):
    def test_single_vertex(self):
        graph = {0: []}
        self.assertEqual(dfs(graph, 0), {0})

    def test_single_vertex_with_loop(self):
        graph = {0: [0]}
        self.assertEqual(dfs(graph, 0), {0})

    def test_two_vertices_no_path(self):
        graph = {
            0: [],
            1: [],
        }
        self.assertEqual(dfs(graph, 0), {0})
        self.assertEqual(dfs(graph, 1), {1})

    def test_two_vertices_with_simple_path(self):
        graph = {
            0: [1],
            1: [],
        }
        self.assertEqual(dfs(graph, 0), {0, 1})
        self.assertEqual(dfs(graph, 1), {1})

    def test_complete_graph(self):
        def _complete_graph(n):
            return {v: list(set(range(n)) - {v}) for v in range(n)}

        for n in range(2, 10):
            graph = _complete_graph(n)
            for v in range(n):
                self.assertEqual(dfs(graph, v), set(range(n)))

    def test_cycle_5(self):
        graph = {
            0: [1],
            1: [2],
            2: [3],
            3: [4],
            4: [0],
        }
        for v in range(5):
            self.assertEqual(dfs(graph, v), {0, 1, 2, 3, 4})
```

BFS is one of the simplest graph algorithms and a good algorithm to understand prior to Dijkstra’s, which is coming up next. It can be used to simply traverse a graph and visit every vertex, to search for a particular vertex, or find the shortest path (assuming edges don’t have weights) to every vertex starting from a single vertex.

```
from collections import deque


def bfs(graph, source, target):
    """
    Given a directed graph (format described below), and source and target
    vertices, returns a shortest unweighted path as a list of vertices going
    from source to target, or None if no such path exists. Returned path will
    not include the source vertex in it.

    The graph parameter is expected to be a dictionary mapping each vertex to a
    list of vertices indicating outgoing edges. For example if vertex v has
    outgoing edges to u and w we have graph[v] = [u, w].
    """
    q = deque([source])
    # previous_vertex[v] holds the immediate vertex before v in the shortest
    # path from source to v. This dictionary also acts as our "visited" set
    # since we set previous_vertex[v] as soon as the vertex enters our queue.
    previous_vertex = {source: source}
    while q:
        v = q.popleft()
        if v == target:
            return _construct_path(previous_vertex, source, target)
        for w in graph[v]:
            if w not in previous_vertex:
                previous_vertex[w] = v
                q.append(w)
    return None


def _construct_path(previous_vertex, source, target):
    if source == target:
        return []
    return _construct_path(previous_vertex, source,
                           previous_vertex[target]) + [target]
```
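The function above stops at a single target. For the "shortest path to every vertex" use mentioned earlier, a sketch along the same lines can record the distance (in edges) to everything reachable. `bfs_distances` is a hypothetical name introduced for this example; it expects the same graph format as `bfs`.

```python
from collections import deque


def bfs_distances(graph, source):
    """Maps each vertex reachable from source to its distance in edges."""
    distance = {source: 0}
    q = deque([source])
    while q:
        v = q.popleft()
        for w in graph[v]:
            if w not in distance:
                # The first time w is seen is along a shortest path to it.
                distance[w] = distance[v] + 1
                q.append(w)
    return distance


cycle_5 = {0: [4, 1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
print(bfs_distances(cycle_5, 0))
```

Here the `distance` dictionary doubles as the "visited" set, in the same way `previous_vertex` does above.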

And the accompanying unit test:

```
import unittest

from bfs import bfs


class BFSTest(unittest.TestCase):
    def test_single_vertex(self):
        graph = {0: []}
        self.assertEqual(bfs(graph, 0, 0), [])

    def test_single_vertex_with_loop(self):
        graph = {0: [0]}
        self.assertEqual(bfs(graph, 0, 0), [])

    def test_two_vertices_no_path(self):
        graph = {
            0: [],
            1: [],
        }
        self.assertEqual(bfs(graph, 0, 1), None)

    def test_two_vertices_with_simple_path(self):
        graph = {
            0: [1],
            1: [],
        }
        self.assertEqual(bfs(graph, 0, 1), [1])

    def test_complete_graph(self):
        def _complete_graph(n):
            return {v: list(set(range(n)) - {v}) for v in range(n)}

        for n in range(2, 10):
            graph = _complete_graph(n)
            for v in range(n):
                for w in range(n):
                    self.assertEqual(bfs(graph, v, w),
                                     [] if v == w else [w])

    def test_cycle_5(self):
        graph = {
            0: [4, 1],
            1: [0, 2],
            2: [1, 3],
            3: [2, 4],
            4: [3, 0],
        }
        self.assertEqual(bfs(graph, 0, 2), [1, 2])
        self.assertEqual(bfs(graph, 0, 3), [4, 3])
```

Given a directed acyclic graph (DAG) representing a set of, say, tasks and their dependencies, the topological sort is the problem of finding an order of task execution that will satisfy all the dependencies. This problem arises in a variety of applications. Examples include task scheduling, build systems (e.g. Bazel), parallel pipelines (e.g. Hadoop), and formula evaluation (e.g. in spreadsheets).

While a variation of DFS can be used for topological sorting, my personal favorite algorithm for doing topological sorts is Kahn’s algorithm, due to its intuitiveness. The idea behind the algorithm is simple: start with vertices with no incoming edges, process them, and then remove them and all their outgoing edges from the graph and continue until there’s nothing left in the graph.

In the code below, instead of returning a particular topological sort, the algorithm assigns a “sequence” to each vertex, such that every edge from `v` to `w` has `sequence[v] < sequence[w]`. Sorting the vertices by sequence therefore yields a valid topological sort of the graph. This simplifies unit testing, and also allows for easier use of the output in cases where parallelization is possible (since all tasks with the same sequence number can be executed in parallel).

```
from collections import deque, namedtuple

Vertex = namedtuple('Vertex', ['name', 'incoming', 'outgoing'])


def build_doubly_linked_graph(graph):
    """
    Given a graph with only outgoing edges, build a graph with incoming and
    outgoing edges. The returned graph will be a dictionary mapping vertex to a
    Vertex namedtuple with sets of incoming and outgoing vertices.
    """
    g = {v: Vertex(name=v, incoming=set(), outgoing=set(o))
         for v, o in graph.items()}
    # Iterate over a snapshot, since vertices that appear only as targets
    # are added to g inside the loop.
    for v in list(g.values()):
        for w in v.outgoing:
            if w in g:
                g[w].incoming.add(v.name)
            else:
                # Store the name (v itself holds sets and is not hashable).
                g[w] = Vertex(name=w, incoming={v.name}, outgoing=set())
    return g


def kahn_top_sort(graph):
    """
    Given an acyclic directed graph (format described below), returns a
    dictionary mapping vertex to sequence such that sorting by the sequence
    component will result in a topological sort of the input graph. Output is
    undefined if input is not a valid DAG.

    The graph parameter is expected to be a dictionary mapping each vertex to a
    list of vertices indicating outgoing edges. For example if vertex v has
    outgoing edges to u and w we have graph[v] = [u, w].
    """
    g = build_doubly_linked_graph(graph)
    # sequence[v] < sequence[w] implies v should be before w in the topological
    # sort.
    q = deque(v.name for v in g.values() if not v.incoming)
    sequence = {v: 0 for v in q}
    while q:
        v = q.popleft()
        for w in g[v].outgoing:
            g[w].incoming.remove(v)
            if not g[w].incoming:
                sequence[w] = sequence[v] + 1
                q.append(w)
    return sequence
```
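To illustrate the parallelization point, here is a small sketch that groups vertices by sequence number, so that each group could be run as one parallel batch. The `batches` helper is hypothetical, and the `sequence` dictionary is written out by hand (it is what the algorithm assigns for two independent chains, 0 -> 1 -> 2 and 3 -> 4 -> 5) so that the snippet runs on its own.

```python
from collections import defaultdict


def batches(sequence):
    """Groups vertices by sequence number, lowest sequence first."""
    groups = defaultdict(list)
    for vertex, seq in sequence.items():
        groups[seq].append(vertex)
    return [sorted(groups[seq]) for seq in sorted(groups)]


# Sequence numbers for two independent chains: 0 -> 1 -> 2 and 3 -> 4 -> 5.
sequence = {0: 0, 3: 0, 1: 1, 4: 1, 2: 2, 5: 2}
print(batches(sequence))  # [[0, 3], [1, 4], [2, 5]]
```

Every edge goes from a lower sequence number to a higher one, so each batch depends only on batches that come before it.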

And the accompanying unit test:

```
import unittest

from kahn import kahn_top_sort


class KahnTopSortTest(unittest.TestCase):
    def test_single_vertex(self):
        graph = {
            0: [],
        }
        self.assertEqual(kahn_top_sort(graph), {
            0: 0,
        })

    def test_total_order_2(self):
        graph = {
            0: [1],
            1: [],
        }
        self.assertEqual(kahn_top_sort(graph), {
            0: 0,
            1: 1,
        })

    def test_total_order_3(self):
        graph = {
            0: [1],
            1: [2],
            2: [],
        }
        self.assertEqual(kahn_top_sort(graph), {
            0: 0,
            1: 1,
            2: 2,
        })

    def test_two_independent_total_orders(self):
        # 0 -> 1 -> 2
        # 3 -> 4 -> 5
        graph = {
            0: [1],
            1: [2],
            2: [],
            3: [4],
            4: [5],
            5: [],
        }
        self.assertEqual(kahn_top_sort(graph), {
            0: 0,
            3: 0,
            1: 1,
            4: 1,
            2: 2,
            5: 2,
        })

    def test_simple_dag_1(self):
        # 0 -> 1 -> 2
        #  \  /
        #   3
        graph = {
            0: [1, 3],
            1: [2],
            2: [],
            3: [1],
        }
        self.assertEqual(kahn_top_sort(graph), {
            0: 0,
            3: 1,
            1: 2,
            2: 3,
        })
```
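To make the parallelization point concrete, here is a minimal sketch of how the sequence dictionary could be turned into batches of tasks that are safe to run together. The helper name `batches_by_sequence` is hypothetical (it is not part of the code above), and it assumes the vertices are mutually comparable so each batch can be sorted for a deterministic result.

```python
from collections import defaultdict


def batches_by_sequence(sequence):
    """
    Hypothetical helper: given the {vertex: sequence} dictionary produced by a
    function like kahn_top_sort, return a list of batches ordered by sequence
    number. Vertices inside one batch have no edges between them.
    """
    groups = defaultdict(list)
    for vertex, seq in sequence.items():
        groups[seq].append(vertex)
    # Sort the batches by sequence number, and each batch for determinism.
    return [sorted(groups[seq]) for seq in sorted(groups)]


# Expected output of the "two independent total orders" test case.
sequence = {0: 0, 3: 0, 1: 1, 4: 1, 2: 2, 5: 2}
print(batches_by_sequence(sequence))  # [[0, 3], [1, 4], [2, 5]]
```

Each inner list could be handed to a worker pool as a unit, moving on to the next batch only once the current one has finished.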

Dijkstra’s shortest path algorithm is very similar to BFS, except a priority queue is used instead of a regular queue. A proper implementation would use a priority queue with an “update key” operation which would reduce the redundant items in the queue. The implementation below, for the sake of simplicity, uses the built-in Python `PriorityQueue`, which does not support “update key”.

The invariant in the algorithm is that each time we get an item from the queue, we know that we have the shortest path from source to it already (this is where the guarantee of non-negative weights is key, as this invariant can fail if we have negative weights.)

```
from collections import namedtuple, defaultdict
from queue import PriorityQueue  # Python 3 module name (Python 2 used Queue)

Edge = namedtuple('Edge', ['target', 'weight'])


def dijkstra(graph, source, target):
    """
    Given a directed graph (format described below), and source and target
    vertices, returns a tuple (distance, path) where path is a shortest path
    from source to target as a list of vertices, or None if no such path
    exists. Returned path will not include the source vertex in it.
    Assumes non-negative weights.

    The graph parameter is expected to be a dictionary mapping each vertex to a
    list of Edge named tuples indicating the vertex's outgoing edges. For
    example if vertex v has outgoing edges to u and w with weights 10 and 20
    respectively, we have graph[v] = [Edge(u, 10), Edge(w, 20)].
    """
    q = PriorityQueue()
    q.put((0, source))
    # previous_vertex[v] holds the immediate vertex before v in the shortest
    # path found so far from source to v. It is updated every time a shorter
    # path to v is pushed onto the queue.
    previous_vertex = {source: source}
    # Arguably not the best way to represent infinity but it works for the sake
    # of learning the algorithm.
    shortest_distance = defaultdict(lambda: float('inf'))
    shortest_distance[source] = 0
    while not q.empty():
        (distance, v) = q.get()
        # The first time a vertex leaves the queue its distance is final.
        if v == target:
            return (distance, _construct_path(previous_vertex, source, target))
        for edge in graph[v]:
            alt_distance = edge.weight + distance
            if alt_distance < shortest_distance[edge.target]:
                shortest_distance[edge.target] = alt_distance
                q.put((alt_distance, edge.target))
                previous_vertex[edge.target] = v
    return None


def _construct_path(previous_vertex, source, target):
    # Walk backwards from target to source, building the path on the way out.
    if source == target:
        return []
    return _construct_path(previous_vertex, source,
                           previous_vertex[target]) + [target]
```

And the accompanying unit test:

```
import unittest

from dijkstra import dijkstra, Edge


class DijkstraTest(unittest.TestCase):

    def test_single_vertex(self):
        graph = {0: []}
        self.assertEqual(dijkstra(graph, 0, 0), (0, []))

    def test_two_vertices_no_path(self):
        graph = {
            0: [],
            1: [],
        }
        self.assertEqual(dijkstra(graph, 0, 1), None)

    def test_two_vertices_with_path(self):
        graph = {
            0: [Edge(target=1, weight=10)],
            1: [],
        }
        self.assertEqual(dijkstra(graph, 0, 1), (10, [1]))

    def test_cycle_3(self):
        graph = {
            0: [Edge(target=1, weight=10), Edge(target=2, weight=30)],
            1: [Edge(target=0, weight=10), Edge(target=2, weight=10)],
            2: [Edge(target=0, weight=30), Edge(target=1, weight=30)],
        }
        self.assertEqual(dijkstra(graph, 0, 2), (20, [1, 2]))

    def test_clrs_example(self):
        graph = {
            's': [
                Edge(target='t', weight=3),
                Edge(target='y', weight=5),
            ],
            't': [
                Edge(target='x', weight=6),
                Edge(target='y', weight=2),
            ],
            'y': [
                Edge(target='t', weight=1),
                Edge(target='z', weight=6),
            ],
            'x': [
                Edge(target='z', weight=2),
            ],
            'z': [
                Edge(target='x', weight=7),
                Edge(target='s', weight=3),
            ],
        }
        distance, path = dijkstra(graph, 's', 'z')
        self.assertEqual(distance, 11)
        # Three different paths from s to z have total weight 11.
        self.assertIn(path, [
            ['y', 'z'],
            ['t', 'y', 'z'],
            ['t', 'x', 'z'],
        ])
        distance, path = dijkstra(graph, 's', 'x')
        self.assertEqual(distance, 9)
        # s -> t -> x is the only path of weight 9.
        self.assertEqual(path, ['t', 'x'])
```
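Since the built-in queue has no “update key”, a common workaround is lazy deletion: push a new entry every time a shorter distance is found, and skip entries that are out of date when they are popped. The sketch below is not the implementation above but a simplified variant under a few assumptions: it uses `heapq` instead of `PriorityQueue`, represents edges as plain `(target, weight)` tuples, and returns the distance to every reachable vertex rather than a single path. The name `dijkstra_distances` is made up for this example.

```python
import heapq


def dijkstra_distances(graph, source):
    """
    Return {vertex: shortest distance from source} for all reachable vertices.
    graph maps each vertex to a list of (target, weight) tuples.
    Assumes non-negative weights.
    """
    distances = {source: 0}
    heap = [(0, source)]
    while heap:
        distance, v = heapq.heappop(heap)
        if distance > distances.get(v, float('inf')):
            # Stale entry: a shorter path to v was pushed after this one.
            continue
        for target, weight in graph.get(v, []):
            alt_distance = distance + weight
            if alt_distance < distances.get(target, float('inf')):
                distances[target] = alt_distance
                heapq.heappush(heap, (alt_distance, target))
    return distances


# 'b' is first queued with distance 10, then again with distance 2 via 'c'.
# The (10, 'b') entry is still in the heap later on and gets skipped.
graph = {
    'a': [('b', 10), ('c', 1)],
    'b': [],
    'c': [('b', 1)],
}
print(dijkstra_distances(graph, 'a'))  # {'a': 0, 'b': 2, 'c': 1}
```

The heap can still hold up to one entry per edge, so this trades some memory for simplicity compared with a true “update key” queue.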

Bellman-Ford is another single-source shortest path algorithm. It’s very easy to implement but has a worse running time than Dijkstra’s: O(V·E), compared with O((V + E) log V) for Dijkstra’s with a binary heap. While in Dijkstra’s we relax edges greedily based on the next closest vertex to the source, in Bellman-Ford we relax every edge n-1 times, where n is the number of vertices. Each such iteration is guaranteed to increase the number of vertices for which we have the shortest path by at least one, and hence after n-1 iterations we have the shortest path to every reachable vertex. We then do a final loop over all the edges and try to relax further. If we succeed, we know a negative cycle reachable from the source exists. This is the key advantage of Bellman-Ford over Dijkstra’s: it handles negative edge weights and detects negative cycles, whereas Dijkstra’s algorithm does not work if negative weights exist.

Here’s a basic implementation:

```
from collections import namedtuple, defaultdict

Edge = namedtuple('Edge', ['target', 'weight'])


def bellman_ford(graph, source, target):
    """
    Given a directed graph (format described below), and source and target
    vertices, returns a tuple (distance, path) where path is a shortest path
    from source to target as a list of vertices, None if no such path exists,
    or -1 if a negative loop is found. Returned path will not include the
    source vertex in it. Negative edge weights are allowed.

    The graph parameter is expected to be a dictionary mapping each vertex to a
    list of Edge named tuples indicating the vertex's outgoing edges. For
    example if vertex v has outgoing edges to u and w with weights 10 and 20
    respectively, we have graph[v] = [Edge(u, 10), Edge(w, 20)].
    """
    # previous_vertex[v] holds the immediate vertex before v in the shortest
    # path found so far from source to v. It is updated every time an edge
    # into v is relaxed.
    previous_vertex = {source: source}
    # Arguably not the best way to represent infinity but it works for the sake
    # of learning the algorithm.
    shortest_distance = defaultdict(lambda: float('inf'))
    shortest_distance[source] = 0
    # Run n - 1 times. We start by knowing the shortest path to 1 vertex
    # (source itself) and each iteration below increases the vertices for which
    # we have the shortest path to by at least one. This means at the end we
    # have the shortest path to 1 + (n - 1) = n vertices.
    for _ in range(len(graph) - 1):
        for v in graph:
            for edge in graph[v]:
                alt_distance = shortest_distance[v] + edge.weight
                if alt_distance < shortest_distance[edge.target]:
                    shortest_distance[edge.target] = alt_distance
                    previous_vertex[edge.target] = v
    # Final loop over all edges to check for negative loops. If at this point
    # we find a shorter alternative path it means a negative loop exists.
    for v in graph:
        for edge in graph[v]:
            alt_distance = shortest_distance[v] + edge.weight
            if alt_distance < shortest_distance[edge.target]:
                return -1
    if shortest_distance[target] < float('inf'):
        return (shortest_distance[target],
                _construct_path(previous_vertex, source, target))
    return None


def _construct_path(previous_vertex, source, target):
    # Walk backwards from target to source, building the path on the way out.
    if source == target:
        return []
    return _construct_path(previous_vertex, source,
                           previous_vertex[target]) + [target]
```

And as before, the accompanying unit test, which is a copy of the one used for Dijkstra’s with an additional test for negative cycles:

```
import unittest

from bellman import bellman_ford, Edge


class BellmanFordTest(unittest.TestCase):

    def test_single_vertex(self):
        graph = {0: []}
        self.assertEqual(bellman_ford(graph, 0, 0), (0, []))

    def test_two_vertices_no_path(self):
        graph = {
            0: [],
            1: [],
        }
        self.assertEqual(bellman_ford(graph, 0, 1), None)

    def test_two_vertices_with_path(self):
        graph = {
            0: [Edge(target=1, weight=10)],
            1: [],
        }
        self.assertEqual(bellman_ford(graph, 0, 1), (10, [1]))

    def test_cycle_3(self):
        graph = {
            0: [Edge(target=1, weight=10), Edge(target=2, weight=30)],
            1: [Edge(target=0, weight=10), Edge(target=2, weight=10)],
            2: [Edge(target=0, weight=30), Edge(target=1, weight=30)],
        }
        self.assertEqual(bellman_ford(graph, 0, 2), (20, [1, 2]))

    def test_negative_cycle_3(self):
        # 0 -> 1 -> 2 -> 0 has total weight 10 + 10 - 30 = -10.
        graph = {
            0: [Edge(target=1, weight=10), Edge(target=2, weight=30)],
            1: [Edge(target=0, weight=10), Edge(target=2, weight=10)],
            2: [Edge(target=0, weight=-30), Edge(target=1, weight=30)],
        }
        self.assertEqual(bellman_ford(graph, 0, 2), -1)

    def test_clrs_example(self):
        graph = {
            's': [
                Edge(target='t', weight=3),
                Edge(target='y', weight=5),
            ],
            't': [
                Edge(target='x', weight=6),
                Edge(target='y', weight=2),
            ],
            'y': [
                Edge(target='t', weight=1),
                Edge(target='z', weight=6),
            ],
            'x': [
                Edge(target='z', weight=2),
            ],
            'z': [
                Edge(target='x', weight=7),
                Edge(target='s', weight=3),
            ],
        }
        distance, path = bellman_ford(graph, 's', 'z')
        self.assertEqual(distance, 11)
        # Three different paths from s to z have total weight 11.
        self.assertIn(path, [
            ['y', 'z'],
            ['t', 'y', 'z'],
            ['t', 'x', 'z'],
        ])
        distance, path = bellman_ford(graph, 's', 'x')
        self.assertEqual(distance, 9)
        # s -> t -> x is the only path of weight 9.
        self.assertEqual(path, ['t', 'x'])
```
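To see the negative-weight advantage on a tiny input, here is a minimal sketch under a few assumptions that differ from the implementation above: the graph is given as a flat list of `(u, v, weight)` edges, the function returns distances to all vertices instead of one path, it returns `None` for a negative cycle, and it stops early once a full pass relaxes nothing (a common optimization, not something the code above does). The name `bellman_ford_distances` is made up for this example.

```python
def bellman_ford_distances(vertices, edges, source):
    """
    Return {vertex: shortest distance from source}, or None if a negative
    cycle is reachable from source. edges is a list of (u, v, weight) tuples.
    """
    inf = float('inf')
    distance = {v: inf for v in vertices}
    distance[source] = 0
    for _ in range(len(vertices) - 1):
        changed = False
        for u, v, weight in edges:
            if distance[u] + weight < distance[v]:
                distance[v] = distance[u] + weight
                changed = True
        if not changed:
            # Nothing was relaxed in this pass, so distances are final.
            break
    # One more pass: any further improvement means a negative cycle.
    for u, v, weight in edges:
        if distance[u] + weight < distance[v]:
            return None
    return distance


# The direct edge a -> b costs 4, but a -> c -> b costs 5 - 3 = 2.
vertices = ['a', 'b', 'c']
edges = [('a', 'b', 4), ('a', 'c', 5), ('c', 'b', -3)]
print(bellman_ford_distances(vertices, edges, 'a'))  # {'a': 0, 'b': 2, 'c': 5}
```

On this graph Dijkstra’s would take `b` off the queue at distance 4 before it ever looks at `c`, which is exactly the invariant failure that negative weights cause.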