and

Data

Structures

Spring 2021

- Explore this course website. Read the syllabus and browse the resources.
- Please join the slack channel. You can find information about it in the left menu after you log in. Introduce yourself, and ask a question.
- Submit a response to this assignment describe your computing and programming background, and what you hope to get out of this course. What sort of computer do you have - Mac? Windows? Unix?
- In Python or C or both, count the number of Pythagorean Triples with \( a \lt b \lt c \le N \) up to various values for N such as N=(100, 1000, 10000, ...). The specific values are up to you. For at least some values, have the program write the triples to a file. Measure and report on how long your program takes to run for different values of N. I suggest using brute force - one of the algorithm design patterns we'll be looking at this semester; I'm not looking for fancy math tricks at this point.
- Please have this assignment done before Monday's class so that we can go over it together then.
- Do submit your source code and results here, as attached files or embedded code, along with a discussion of what you did.
- If you have time, start reading about "analysis" and "O()" in any of the textbooks.
- Let me know how all this went : easy? hard?

- Read about algorithm analysis and O() notation. Some sources include
- the Algorithm Design Manual chapter 2
- Asymptotic Notation in Skienna's slides & videos
- the Analysis chapter at runestone
- O() in bulding-blocks

- A sorting algorithm takes 1 second to sort 1000 items. About how long do you think it take to do to 10,000 items if (a) the sorting algorithm is \( O(n^2) \) or (b) the sorting algorithms is \( O(n \, log(n)) \).
- Do some numerical experiments to estimate the O(?) behavior of some of the following Python operations. To estimate the timing of very fast operations, repeat the operation a large number of times. Or use the python timeit package. Make at least one plot to visualize a result.
- (x in y) where y is a list
- (x in y) where y is a set
- (x.pop()) where x is a heap (see the python built-in heapq package)
- (x.pop()) where x is a list
- (x.pop(1)) where x is a list
- (x.sort()) where x is a list

- Study the algorithms power1(a,b,c) and power2(a,b,c) in power.py. (See "exponentiation by squaring; its used in cryptography.) What do you think the O(b) behavior of these is for large values of b with (a,c) held fixed? Verify with some numerical experiments.
- Say you have an algorithm with O() behavior described by one of these four functions ( N log(N) , \(N^2\) , \(2^N\) , N! ) . For each, what values of N do you think it would be practical to run a program with that behavior? Would N=10000=1e4 be practical? N=1e6? Make a plot of N vs time that illustrates what's going on.
- Suppose you want search for a text string in a book sized document. Which is better, an O(1) algorithm or an O(n) algorithm? Discuss. (Hint: this is a trick question.)

- Read about C pointers, structs, and malloc, and Python classes.
- Start reading about linked lists, stack, and queues.
- In Python, write and test LinkedList and Node classes that implement a doubly linked list. The LinkedList class should hold (at least) the first and last nodes. Include an "insert" method which puts a new node1 into the list after some other given node2.
- In C, write and test a linked_list and node structs which implement a similar doubly linked list. Again, include an "insert" method. Here's some starting code to get you going :

```
typedef struct _node *node;
struct _node {
int value;
node next;
node previous;
};
typedef struct _linkedlist *linkedlist;
struct _linkedlist {
node first;
node last;
};
node new_node(int itsvalue){
// TODO
return newnode;
}
linkedlist new_linkedlist(){
// TODO
return newlist;
}
```

- Write a recursive function to find the length of the linked list in both languages.
- Using those classes, implement the stack (push, pop, is_empty) and queue (enqueue, dequeue, is_empty) methods, and show that they give the expected first-in-first-out (queue) or (first-in-last-out) stack behavior. (I'm using "enqueue" mean "push for a queue data structure" and "dequeue" to mean "pop for a queue data structure" which is a fairly common convention.)
- Tell me how this went ... if this is too much too fast, then we can slow down.

- Finish up exploring linked lists, stacks, and queues.
- Clean up any parts of the assignment from last week that seem interesting. You can use the code that I showed in class Monday the 8th or work through and adapt my worked solutions.
- This week, work in either Python or C or both; your choice.
- Code and test a function that reverses a singly linked list. Your code should run in O(n) time. (This is a classic linked-list exercise, all about rearranging pointers.)
- Code and test a function that checks to see if a string made up of parens, braces, and brackets has them in correctly matching arrangements. For example, "([]){{()}}" is correctly matched, while "({)}" is not. Hint: use a stack. (There's a solution to this problem in the runestone pythonds; see if you can do it without looking at their code.) Can you do this without using a stack? (Answer: yes, but it's harder.)
- Do at least one of these recursion excercies :
- Find the n'th Fibonnaci number, in two ways: with an explicit loop, and using recursion. Discuss what happens to the run time as n increases and why for the two cases.
- Using python turtle graphics, write a recursive program to draw the Siepinski triangle.

- Read about sorting algorithms in Skienna's book or the runestone python online book.
- Implement BubbleSort and QuickSort, show that each works on a test case, and explain what their
`O()`

behavior is and why. - Explain what a "loop invariant" is, and why they can be helpful in showing that an algorithm is valid. See for example wikipedia and this discussion of bubble sort. How is this connected to "proof by induction"?
- Choose a sorting algorithm that we haven't discussed (so not SelectionSort, QuickSort, HeapSort) to start studying for your midterm project, and tell me which one you're going to look at. (MergeSort is a good choice if you're not sure ... which can be done linked lists in place!) Where are you reading about it?

- This is for one of your three semester grades.
- With a sorting algorithm that we haven't done in class (mergesort is a good choice), implement, explain, discuss the expected
`O()`

behavior of, and perform numerical experiments to show that behavior on lists of random numbers of various sizes. You can use either timing or number of steps as a measure. - You may use either Python or C. (If in C, I'd suggest outputting the results to a text file, and plotting those numbers in a python or R notebook.)
- Think of this as a short paper - I want to see an explanation, code, output from the code, some graphs, references of what sources you used - all that.

- Read about hash tables - what they are and how to implement them. There are several sources in my March 29 notes.
- Implement and test a hash table, doing something like my hash_table_demo.py in the code/hash_table folder. Use a different hash function, a different collision mechanism, try different sizes for the hash tables - any variation is fine. The point is to play around with and understand how these things work. Do include some sort of tests for your code, to see how well it works.
- In theory, hash tables give us fast O(1) lookup for (key, value) pairs. In practice, collisions may slow this down if we aren't careful. Explain what this means practically for your hash table code.

- Explore the vocabulary and ideas of graphs and trees using the sources in my April 8 notes.
- Study in more detail how to (a) how to store a graph in a program and (b) how to search (i.e. traverse) a graph, visiting each node and doing something (printing its name or seeing if it is the search goal), using both a depth-first and breadth-first approach using a loop and a fringe (either a Stack or a Queue).
- Consider this graph :
- What kind of graph is this? (In other words, what are its properties?)
- Write down (a) an "adjacency matrix" and (b) an "adjacency list" representation of the graph. (You can do this either by hand or write code to print one out.)
- Write down the order that the nodes will be visited if the graph is searched starting at A, for both a depth-first and breadth-first search. (Again, you can do this by hand or by writing code to do the search. Either way, make sure you understand and can explain why the order is what it is.)

- Now consider this thing : .
- Is this a tree? Is it a binary tree? Is it a graph?
- Could you do the same things to this as the previous problem? Why or why not?
- Write code that implements a recursive depth-first-search starting at 'a'.
- Can you run that code starting at 'd'? Why or why not?

- Read about and explain four classic algorithms for two classic graph problems, as described in my graph notes part 3.
- What exactly is the "shortest path" problem?
- Summarize the basic ideas behind (a) Dijkstra's algorithm and (b) the Floyd-Warshall algorithm.
- For at least one of those two, work through the algorithm step by step by hand on a tiny example. Include and explain whatever data structures are needed.

- What exactly is the "minimum spanning tree" problem?
- Summarize the basic ideas behind (a) Prim's algorithm and (b) Kruskal's algorithm.
- For at least one of those two, work through a tiny example by hand step by step, including any needed data structures.

- What exactly is the "shortest path" problem?
- Look at and run my peg solitaire program pegs.py. What sort of search is this, on what sort of graph?
- There are many variations of peg solitaire. What would it take to solve a different one than the triangle that I did?

- Start thinking about what you might want to do for a final project - a problem, algorithm to solve it, code to implement it, a discussion and analysis.

- Implement and explain a program to solve at least one of the following search problems, for at least small sizes :
- the eight queens puzzle
- the knight's tour problem
- a two-player game such as Nim or chomp or tic-tac-toe
- a solitaire game such as the sliding block puzzle or sudoku .

- propose a final project topic

- Explore some of the SQL exercises on the resources page such as those at swcarpentry . I describe how to get going with sqlite3 in a terminal on jupyter.bennington.college in my May 3 notes. Describe what you did, with some specific examples. Also :
- Explain what a "JOIN" is and give an example.
- How does that compare with a "SELECT" within another "SELECT" ?

- Explain what a "B-tree" is, what it has to do with SQL, and what it's O() behavior is for its basic API operations.

- Turn in a first draft of your final project - code, output, and discussion paper.
- Explain what "P vs NP" is all about.

- Final project due - code, output, discussion, bibliography
- Choose any classic algorithmic problem, implement a solution, explain its expected O() behavior, and do numerical experiments to show that behavior, including a graph.
- Write a short paper explaining the problem, both your own work and optionally other approaches. Include a bibliography.

- Optional show'n'tell : for brownie points, show your work to the class in our last meeting.