Questions?

Today and Thursday I'd like to discuss several "shortest" algorithms on weighted graphs. Your assignment this week is essentially to show that you understand this stuff, by summarizing what's going on here in your own words.

- We started this term thinking about O() efficiency: how fast algorithms run as a function of the size of the problem.
- We then looked at various "data structures": collections of things optimized for different tasks, with different O() behaviors for .add(), .is_member(), .get_smallest(), .find(), and so on: arrays, linked lists, stacks, queues, minheaps.
- Sorting was next: looking at some algorithms to process arrays and analyzing their O() efficiency with (a) theoretical arguments based on loops and (b) numerical experiments on random arrays of different lengths.
- From there we started talking about graphs and trees: how to represent them, how to traverse them, how to search for something.
- And now we're ready to look at more graph problems.

Along the way we've seen several different ideas behind these algorithms :

- brute force : check each possibility exhaustively
- divide and conquer : subdivide the problem and combine

Today's algorithms use two more :

- greedy algorithms : at each step, take whatever choice looks best right now
- dynamic programming : recursive subproblems (calling this "dynamic" is not a helpful description, but that's what it's called)

For all of these, we are given a graph G = (V, E) with

- V, a set of vertices; the size of V is |V| = n
- E, a set of edges; the size of E is |E| = m
- each edge is a pair of vertices (Vi, Vj)
- for a weighted graph, we extend that to a triplet (Vi, Vj, w_ij)

Often the weight is the distance along the edge, but it might also be the flow or bandwidth or whatever. Usually the weights are positive.

For all of these algorithms, we will need to be able to quickly find the neighbors of any vertex V, and so will need a data structure that is something like

```
{ Vi : [(Vj, w_ij), ...], ... }
```

which says that for each vertex Vi there is a list of pairs, one per edge: the vertex Vj that the edge connects Vi to, and the weight w_ij of that edge.

This is the "adjacency list" we have already seen, with the addition of a weight for each edge.

Skiena uses a linked list here; in python we could just use python's lists.

For an "undirected" graph w_ij = w_ji, and each edge shows up twice in that data structure, once as i's connection to j, and once as j's connection to i.
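As a concrete example, here's a tiny made-up undirected graph (the vertex names and weights are invented for illustration) built into that dictionary-of-lists form, with each edge recorded once from each end:

```python
# A hypothetical 4-vertex undirected weighted graph, stored in the
# adjacency structure described above: { vertex : [(neighbor, weight), ...] }
edges = [("a", "b", 2), ("a", "c", 5), ("b", "c", 1), ("c", "d", 3)]

graph = {}
for vi, vj, w in edges:
    # Each undirected edge shows up twice: once for each endpoint.
    graph.setdefault(vi, []).append((vj, w))
    graph.setdefault(vj, []).append((vi, w))

print(graph["c"])   # c's neighbors: [('a', 5), ('b', 1), ('d', 3)]
```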

The real goal here is to get a sense of the ideas that go into algorithms like these, more than the particular algorithms themselves. There are many variations, and many, many different problems. Some are doable; some are not. Some can only be solved approximately for practical sizes.

You should be paying attention to questions like :

- What exactly is the problem that we're trying to solve?
- What inputs (description of the problem) does it need?
- What outputs (form of the answer) does it produce?
- What is the general idea of this algorithm?
- What data structures (and APIs) does this need?
- What is its time and space O() performance?

One good way to build your understanding is to work through the algorithm on a tiny problem by hand - that means typing out everything, including the data structures, for enough steps to see what's what. Tedious but helpful.

We'll talk through some of these in class today, using some online sources, and will continue Thursday.

- wikipedia: shortest path problem
- Dijkstra's algorithm (wikipedia)
- Floyd-Warshall algorithm (wikipedia)

I have python code for both of these in code/graphs/shortest_path, including generating some images of the graph and shortest path.

Dijkstra's is a variation on the breadth-first search we've already seen, and fairly straightforward to understand. It's a "greedy" algorithm, which at every step chooses the shortest outward extension.
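Here's a rough python sketch of that idea, using the { Vi : [(Vj, w_ij), ...] } adjacency structure above and a minheap as the frontier. (This is my own sketch for discussion, not the version in code/graphs/shortest_path.)

```python
import heapq

def dijkstra(graph, start):
    """Return {vertex: shortest distance from start} for every reachable vertex.
       graph is the {v: [(neighbor, weight), ...]} adjacency structure."""
    dist = {start: 0}
    frontier = [(0, start)]              # min-heap of (distance, vertex)
    while frontier:
        d, v = heapq.heappop(frontier)   # greedy: closest unfinished vertex
        if d > dist.get(v, float("inf")):
            continue                     # stale heap entry; a shorter path was found
        for u, w in graph.get(v, []):
            if d + w < dist.get(u, float("inf")):
                dist[u] = d + w          # relax the edge v -> u
                heapq.heappush(frontier, (d + w, u))
    return dist
```

On the little graph {"a": [("b", 2), ("c", 5)], "b": [("a", 2), ("c", 1)], "c": [("a", 5), ("b", 1)]}, dijkstra(graph, "a") finds a→c via b with total distance 3 rather than the direct edge of weight 5.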

Floyd-Warshall is for most people more difficult to understand - math-ish and tricky. In particular, it uses a shortest(i, j, k) mapping which is subtle: the shortest distance from vertex[i] to vertex[j] using only vertices 1, 2, ..., k as intermediate stops. It's an entirely different approach from Dijkstra's, and very cool. It's one of the "dynamic programming" algorithms, where the solution is expressed recursively in terms of solutions to smaller subproblems.
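A minimal python sketch of that recursion, assuming (my convention here) that the vertices are numbered 0 .. n-1 and the edge weights come in a dictionary keyed by (i, j) pairs:

```python
def floyd_warshall(n, weights):
    """weights: {(i, j): w_ij} for the edges of a graph on vertices 0..n-1.
       Returns an n x n table of shortest distances."""
    INF = float("inf")
    dist = [[0 if i == j else weights.get((i, j), INF) for j in range(n)]
            for i in range(n)]
    # Invariant: after the k'th pass, dist[i][j] is the shortest path from
    # i to j using only vertices 0..k as intermediate stops -- that's the
    # shortest(i, j, k) idea, built up one allowed intermediate at a time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```

Note the triple loop: this is O(n^3) time and O(n^2) space, but it answers the all-pairs question, not just one source.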

- wikipedia: minimum spanning tree
- Prim's algorithm (wikipedia)
- Kruskal's algorithm (wikipedia)

The first thing to understand here is what the problem is - it's a little trickier than the shortest-path problem.

Both of these are "greedy" in that they add the shortest available edge next ... but Prim's grows one tree outward from a starting vertex, while Kruskal's grows many small trees until they all merge together.

I don't have example code for either of these, at least not yet - perhaps we'll work on that together this week.
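As one possible starting point, here's a rough sketch of Kruskal's in python, assuming (my conventions) vertices numbered 0 .. n-1 and a list of (w, i, j) edge triples; the union-find helper is the standard trick for telling whether two small trees have already merged:

```python
def kruskal(n, edges):
    """Minimum spanning tree of an n-vertex graph.
       edges: list of (w, i, j) weighted-edge triples on vertices 0..n-1."""
    parent = list(range(n))          # union-find forest: one tree per vertex

    def find(v):                     # follow parent links up to the root
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving, for speed
            v = parent[v]
        return v

    tree = []
    for w, i, j in sorted(edges):    # greedy: consider shortest edges first
        ri, rj = find(i), find(j)
        if ri != rj:                 # skip edges that would make a cycle
            parent[ri] = rj          # merge the two trees
            tree.append((w, i, j))
    return tree
```

Notice how the greedy step is the same as Dijkstra's in spirit (always take the shortest option), but the bookkeeping question is different: "are these two endpoints already connected?" rather than "how far is this vertex from the start?".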

https://cs.bennington.college/courses/spring2022/algorithms/notes/april18

last modified Sun April 17 2022 8:48 pm
