This is a toy version of the DCT transform used in jpeg lossy compression.

Jim Mahoney cs.bennington.edu | March 2021

In [1]:

```
from numpy import *
from matplotlib.pyplot import plot
import matplotlib.pyplot as plt
import numpy as np
```

Here are the basis vectors that we'll use for this work.

$$ \begin{align} \hat{b}_1 & = [1, 1, 1, 1] \frac{1}{2} \\ \hat{b}_2 & = [1, 0, -1, 0] \frac{1}{\sqrt{2}} \\ \hat{b}_3 & = [0, 1, 0, -1] \frac{1}{\sqrt{2}} \\ \hat{b}_4 & = [1, -1, 1, -1] \frac{1}{2} \\ \end{align} $$As seen in last week's homework, these are orthonormal: unit length and orthogonal to each other.

In [2]:

```
# And here's what they look like in python numpy.
half = 1/2
sqrt2 = 1/sqrt(2)
b = matrix([ [half, half, half, half],
[sqrt2, 0, -sqrt2, 0],
[0, sqrt2, 0, -sqrt2],
[half, -half, half, -half]
])
print(" The rows of this matrix are the basis vectors. ")
print(b)
print()
print(" The 0'th one is b[0,:] i.e.")
b0 = b[0,:]
b0
```

The rows of this matrix are the basis vectors. [[ 0.5 0.5 0.5 0.5 ] [ 0.70710678 0. -0.70710678 0. ] [ 0. 0.70710678 0. -0.70710678] [ 0.5 -0.5 0.5 -0.5 ]] The 0'th one is b[0,:] i.e.

Out[2]:

matrix([[0.5, 0.5, 0.5, 0.5]])

In [3]:

```
# Here you can see the (rows, columns) dimensions of the top row.
# (The ':' here means 'all columns).
shape(b0)
```

Out[3]:

(1, 4)

In [4]:

```
# As a 4 x 1 column vector this is
transpose(b0)
```

Out[4]:

matrix([[0.5], [0.5], [0.5], [0.5]])

In [5]:

```
# The matrix product of a (1x4) * (4x1) is the "inner" or dot product. We've already seen that.
# Here since these are numpy matrices, they are multiplied with matrix multiplication.
b0 * transpose(b0)
```

Out[5]:

matrix([[1.]])

In [6]:

```
# The matrix product of a (4x1) * (1x4) is the "outer" product.
# This is one of 16 (4*4) basis images that we will use for our four_by_four transform.
b00 = transpose(b0) * b0
print(b00)
```

[[0.25 0.25 0.25 0.25] [0.25 0.25 0.25 0.25] [0.25 0.25 0.25 0.25] [0.25 0.25 0.25 0.25]]

In [7]:

```
# Here are all 16 of them. Here "row" and "column" are in the final array
plt.figure(figsize=(14, 14), dpi=200) # 9" x 12"
for i in range(4):
for j in range(4):
plt.subplot(4, 4, 1 + 4*i + j)
b_i = b[i, :]
b_j= b[j, :]
basis_image = transpose(b_i) * b_j
#print("{}".format(i,j))
#print(basis_image)
plt.imshow(basis_image)
plt.colorbar()
```

In [8]:

```
# Or if you like numbers better :
# Here are all 16 of them. Here "row" and "column" are in the final array
for i in range(4):
for j in range(4):
b_i = b[i, :]
b_j= b[j, :]
basis_image = transpose(b_i) * b_j # i.e. bb_ij in a notation I use below.
print("{}".format((i,j)))
print(basis_image)
print()
```

In [9]:

```
# Now let's choose an toy "image" to work with :
image = matrix([[0,0,0,0], [0, 100, 100, 0], [0, 100, 100, 0], [0,0,0,0]])
plt.imshow(image)
plt.colorbar()
```

Out[9]:

<matplotlib.colorbar.Colorbar at 0x7fd8408fc7f0>

The transform we have in mind is to turn this into a sum of 16 coefficients times the 16 basis images.

$$ \text{image} = c_{00} * \hat{bb}_{00} + c_{01} * \hat{bb}_{01} + ... + c_{33} * \hat{bb}_{33} $$where each of those $ \hat{b}_{ij} $ is one of the 16 images

This is essentially a 2D Fourier series, built from 4-component basis vectors.

(The DCT transform is very similar, but with 8x8 matrices.)

As we did last week, the trick is to do a dot product of the desired result (the image) with one of the basis vectors (which here are the $bb$ matrices.)

We need to be a bit careful here, because *this* dot product is treating the whole collection of 16 numbers as a single thing. We want to multiply corresponding terms and add, which is *not* matrix multiplication here, since the things that we're dotting are matrices. Essentially we want to flatten out the 16 numbers into one long vector, and dot those.

In [10]:

```
# In other words, these 16 4x4 images (the ones I did pictures of above)
# form a basis under the following dot product.
def bb(i,j):
""" Return one of the 16 basis images, 0<i<3, 0<j<3 """
return transpose(b[i, :]) * b[j, :]
def flatten(a):
""" Return a numpy 1D array from a numpy 2D matrix. """
return array(a).flatten()
def flatdot(a, b):
return dot(flatten(a), flatten(b))
```

In [11]:

```
# Here's the evidence that the bb's form an orthonormal bases.
def print_flatdot(i1, j1, i2, j2):
print(" flatdot(bb({},{}), bb({},{})) = {}".format(
i1, j1, i2, j2, flatdot(bb(i1,j1), bb(i2,j2))))
print("basis dot same basis => 1.0 (16 of 'em)")
print_flatdot(0,0, 0,0)
print_flatdot(1,3, 1,3)
print_flatdot(2,0, 2, 0)
print()
print("basis dot different basis => 0.0 (16*15/2 = 120 of 'em)")
print_flatdot(0,0, 0,1)
print_flatdot(1,0, 0,1)
print_flatdot(2,3, 0,2)
```

In [ ]:

```
```

Now it's your turn.

- Find 16 coefficients of the transformed image.
- For the compression step, chop the highest frequency by setting it to zero.
- Take those new coefficients and recreate (almost) the original image
- Show the new image ... which should be "blurry".
- Discuss.

In [ ]:

```
```