Questions about anything?

Midterm sorting projects, due in a week:

- Pick a sorting algorithm we haven't done already in class. Merge sort is a good choice. Radix sort is another.
- Quote your sources (including my code if you use that as a starting place).
- Python or C or both, on IndexedArray or LinkedList.
- Explain the algorithm in words and implement in clearly written code.
- Describe any particular properties: Is it stable? Can it be distributed? How much memory does it need?
- Explain what you expect its O() time behavior to be, and why.
- Test the algorithm, and show me the output from your tests.
- Do numerical experiments on random lists to measure its O() time, using either run time or step counting.
- Think of this as a short paper, not just code. A Jupyter notebook is OK, but so are other formats such as files for the code & output and a PDF discussion.
- Your job is to convince me that (a) you understand what's going on, and (b) you've put in the time to do a thorough job.
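
For the step-counting option, here is one possible sketch, not a required approach. The `Counted` wrapper and `count_steps` function are names made up for this illustration, and Python's built-in `sorted` is only a stand-in for whatever algorithm you implement. If the last column settles toward a constant as n doubles, that is consistent with O(n log n) behavior.

```python
import math
import random


class Counted:
    """Wrap a value and count every '<' comparison made on it (hypothetical helper)."""
    comparisons = 0  # one counter shared by all instances

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


def count_steps(sort_function, n, seed=0):
    """Return how many comparisons sort_function makes on n random values."""
    rng = random.Random(seed)
    data = [Counted(rng.random()) for _ in range(n)]
    Counted.comparisons = 0
    sort_function(data)
    return Counted.comparisons


if __name__ == "__main__":
    # Doubling n each time makes the growth rate easy to see.
    for n in (1000, 2000, 4000, 8000):
        steps = count_steps(sorted, n)
        print(n, steps, round(steps / (n * math.log2(n)), 3))
```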

As I showed earlier this term, I have examples of making plots in code/plotting_examples.

Today we'll continue our discussion of the priority queue ADT and one implementation, the heap data structure.

A priority queue is an abstract data type (ADT) with at least these operations:

```
pq = PriorityQueue() # create an empty one (in this case, a minimum version)
pq.insert(x) # put a new "priority" value in
smallest = pq.pull() # remove the smallest value and return it
```

These things get used in other algorithms, including some graph search ones we'll do later this term.

And you can also use them for sorting: insert all the values, then pull them out, smallest to largest.

There are different ways to implement one of these things.

One way is to keep all the values completely sorted, perhaps in an IndexedArray. Each value to be inserted must be put in the right place; that's an O(n) operation. Removing the smallest means popping the leftmost, and sliding all the others one to the left ... also O(n).
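
A minimal sketch of that sorted-array approach, using a plain Python list as a stand-in for IndexedArray. The class name `SortedListPQ` and the helper `pq_sort` are invented for this example. The `pq_sort` function shows the "insert everything, then pull everything" sorting idea mentioned above.

```python
import bisect


class SortedListPQ:
    """Priority queue that keeps its values fully sorted (illustrative only)."""

    def __init__(self):
        self.values = []

    def insert(self, x):
        # Finding the spot is O(log n), but shifting later values over is O(n).
        bisect.insort(self.values, x)

    def pull(self):
        # Removing the leftmost value slides all the others left: O(n).
        return self.values.pop(0)

    def __len__(self):
        return len(self.values)


def pq_sort(values):
    """Sort by inserting every value, then pulling them out smallest first."""
    pq = SortedListPQ()
    for v in values:
        pq.insert(v)
    return [pq.pull() for _ in range(len(pq))]
```

With n inserts and n pulls at O(n) each, sorting this way is O(n²) overall.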

A better implementation is with a "binary heap", which is today's topic.

Resources:

- runestone pythonds binary heap notes
- wikipedia: binary tree
- wikipedia: priority queue
- wikipedia: heap data structure
- Skiena's heap lecture notes

We'll work through the ideas in class, first conceptually, and then see if we can implement it in class. We'll start with the pythonds notes.

The essential ideas are:

- it's a binary tree, with each node smaller (or bigger) than its children
- it's always "complete", i.e. filled in top-to-bottom and left-to-right (avoids being unbalanced)
- we can store it in a regular indexed array (with an indexing trick)
- we can add or remove elements with a number of swaps that is at most the height of the tree, `O(log n)`.
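
The ideas in this list can be sketched as follows. This is not the attached heap_v1.py, just one possible minimal version with invented names. It assumes 0-based indexing, so the children of index `i` are at `2*i + 1` and `2*i + 2` and the parent is at `(i - 1) // 2`. The pythonds notes use a 1-based layout instead, which changes the formulas slightly.

```python
class MinHeap:
    """Binary min-heap stored in a plain list (illustrative sketch)."""

    def __init__(self):
        self.array = []

    def insert(self, x):
        # Put x in the next open slot, then swap it upward while it is
        # smaller than its parent. At most one swap per tree level.
        a = self.array
        a.append(x)
        i = len(a) - 1
        while i > 0:
            parent = (i - 1) // 2
            if a[parent] <= a[i]:
                break
            a[parent], a[i] = a[i], a[parent]
            i = parent

    def pull(self):
        # Remove the root, move the last value to the top, then swap it
        # downward with its smaller child until the heap property holds.
        a = self.array
        if not a:
            raise IndexError("pull from an empty heap")
        smallest = a[0]
        last = a.pop()
        if a:
            a[0] = last
            i, n = 0, len(a)
            while True:
                left, right = 2 * i + 1, 2 * i + 2
                child = i
                if left < n and a[left] < a[child]:
                    child = left
                if right < n and a[right] < a[child]:
                    child = right
                if child == i:
                    break
                a[i], a[child] = a[child], a[i]
                i = child
        return smallest
```

Both loops move one level per iteration, which is where the `O(log n)` bound comes from.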

I've attached some starting python code (heap_v1.py), which we'll build upon with some live coding to see how to implement the "insert" and "pull" operations.

Heaps are well studied data structures used in a variety of other algorithms, and there are a number of variations.

One trick comes when you want to turn a list of numbers into a heap. The obvious way is to start with an empty heap, then add each element one by one. Each add is O(log n), so for n elements this is O(n log(n)). But it turns out there's a faster, O(n) algorithm, Floyd's "heapify"; see for example Skiena's heap notes.
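
Here is a rough sketch of the bottom-up idea, assuming a 0-based list layout; the function names are chosen for this example and are not from Skiena's notes. Instead of inserting one at a time, it starts at the last node that has a child and sifts each node down. Most nodes sit near the bottom and move very little, which is why the total work comes out to O(n).

```python
def sift_down(a, i, n):
    """Swap a[i] downward until it is no larger than its children."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        child = i
        if left < n and a[left] < a[child]:
            child = left
        if right < n and a[right] < a[child]:
            child = right
        if child == i:
            return
        a[i], a[child] = a[child], a[i]
        i = child


def heapify(a):
    """Rearrange list a in place into a min-heap, working bottom-up."""
    n = len(a)
    # Indices n//2 and beyond are leaves, which are already tiny heaps.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)


def is_min_heap(a):
    """Check that every value is at least as large as its parent."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))
```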

Note also that Python's standard library includes a module, heapq, which implements a priority queue as a binary min-heap stored in an ordinary list.
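
A short example of what using heapq looks like. `heappush`, `heappop`, and `heapify` are the module's own functions; the variable names and sample values are just for illustration.

```python
import heapq

values = [5, 3, 8, 1, 9, 2]

# Build a heap one insert at a time; the heap itself is just a list.
heap = []
for v in values:
    heapq.heappush(heap, v)

smallest = heapq.heappop(heap)  # removes and returns the minimum

# Or turn an existing list into a heap in place, in O(n) time.
built = list(values)
heapq.heapify(built)

# Popping everything yields the values from smallest to largest.
ordered = [heapq.heappop(built) for _ in range(len(built))]

print(smallest, ordered)
```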

If there's time we may try to run this to see what it does.

Don't confuse the heap (a binary tree, stored in an indexed array) with binary search, which we talked about before ... and may code again here in class, if there is time and/or interest.

- We did this as a zoom class ... icy roads today.
- I added the code we wrote, heap_v2.py
- A question from the chat I missed : "Can you explain which one is O(log(n)) and which one is O(n*log(n))?" We can discuss Monday.

https://cs.bennington.college/courses/spring2022/algorithms/notes/march24

last modified Thu March 24 2022 3:43 pm


| file | last modified | size |
| --- | --- | --- |
| heap_v1.py | Thu Mar 24 2022 01:03 pm | 3.4K |
| heap_v2.py | Thu Mar 24 2022 03:43 pm | 7.1K |