image compression with a four by four basis

This is a toy version of the discrete cosine transform (DCT) used in JPEG lossy compression.

Jim Mahoney cs.bennington.college | March 2021

four 1D frequency (i.e. wave-like, oscillating) basis vectors

Here are the basis vectors that we'll use for this work.

$$ \begin{align} \hat{b}_1 & = [1, 1, 1, 1] \frac{1}{2} \\ \hat{b}_2 & = [1, 0, -1, 0] \frac{1}{\sqrt{2}} \\ \hat{b}_3 & = [0, 1, 0, -1] \frac{1}{\sqrt{2}} \\ \hat{b}_4 & = [1, -1, 1, -1] \frac{1}{2} \\ \end{align} $$

As seen in last week's homework, these are orthonormal: unit length and orthogonal to each other.
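As a quick numerical check of that claim, here is a minimal sketch using NumPy. The array name `B` and the variable `gram` are just illustrative choices, not names from the homework. If the vectors are orthonormal, the matrix of all pairwise dot products should be the 4x4 identity.

```python
import numpy as np

# Rows are b_1 .. b_4 from the definitions above (row index 0..3).
B = np.array([[1,  1,  1,  1],
              [1,  0, -1,  0],
              [0,  1,  0, -1],
              [1, -1,  1, -1]], dtype=float)
# Apply the normalization factors 1/2, 1/sqrt(2), 1/sqrt(2), 1/2 row by row.
B *= np.array([1/2, 1/np.sqrt(2), 1/np.sqrt(2), 1/2])[:, None]

# Entry (i, j) of gram is the dot product of row i with row j.
gram = B @ B.T
is_orthonormal = np.allclose(gram, np.eye(4))

print(np.round(gram, 12))
print(is_orthonormal)
```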

inner and outer product: a 1x1 scalar and a 4x4 "image".
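A minimal sketch of the two products, assuming NumPy and using two of the basis vectors above (the names `inner` and `outer` are only for illustration). The inner product collapses two 4-vectors to one number; the outer product spreads them into a 4x4 array that we can think of as a tiny image.

```python
import numpy as np

b1 = np.array([1, 1, 1, 1]) / 2
b2 = np.array([1, 0, -1, 0]) / np.sqrt(2)

inner = np.dot(b1, b2)      # one number: sum of b1[k] * b2[k]
outer = np.outer(b1, b2)    # 4x4 array: entry (r, c) is b1[r] * b2[c]

print(inner)
print(outer.shape)
print(outer)
```

Every row of `outer` here is a scaled copy of `b2`, so the "image" oscillates from left to right and is constant from top to bottom.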

a spatial wave basis : sixteen 4x4 matrices
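One way to build the sixteen matrices, sketched here under the assumption that each one is the outer product of two of the 1D basis vectors. The array name `bb` and the 0..3 indexing (so index 0 means $\hat{b}_1$ and index 3 means $\hat{b}_4$) are choices made for this sketch.

```python
import numpy as np

# Rows are b_1 .. b_4 from the definitions above (row index 0..3).
B = np.array([[1,  1,  1,  1],
              [1,  0, -1,  0],
              [0,  1,  0, -1],
              [1, -1,  1, -1]], dtype=float)
B *= np.array([1/2, 1/np.sqrt(2), 1/np.sqrt(2), 1/2])[:, None]

# bb[i, j] is a 4x4 basis image: vector i varies down the rows,
# vector j varies across the columns.
bb = np.array([[np.outer(B[i], B[j]) for j in range(4)] for i in range(4)])
print(bb.shape)

# Treat each 4x4 image as a 16-component vector and check orthonormality.
flat = bb.reshape(16, 16)
images_orthonormal = np.allclose(flat @ flat.T, np.eye(16))
print(images_orthonormal)
```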

A toy image

The image transform

The transform we have in mind rewrites the image as a weighted sum of the 16 basis images, with one coefficient for each basis image.

$$ \text{image} = c_{00} \, \hat{bb}_{00} + c_{01} \, \hat{bb}_{01} + \cdots + c_{33} \, \hat{bb}_{33} $$

where each of those $ \hat{bb}_{ij} $ is one of the 16 basis images, with the indices $i$ and $j$ each running from 0 to 3.

This is essentially a 2D Fourier series, built from 4-component basis vectors.

(The DCT used in JPEG is very similar, but with 8x8 matrices.)

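As we did last week, the trick is to take the dot product of the desired result (the image) with one of the basis vectors (which here are the $\hat{bb}$ matrices). Because the basis is orthonormal, that dot product picks out the coefficient that goes with that basis vector.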

We need to be a bit careful here, because this dot product treats each collection of 16 numbers as a single vector. We want to multiply corresponding entries and add them up, which is not matrix multiplication, even though the things that we're dotting are matrices. Essentially we want to flatten each 4x4 matrix into one long 16-component vector, and dot those.
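Here is a minimal sketch of that flattened dot product in NumPy. The `image` below is a hypothetical stand-in (bright in the central four pixels, dim along the edge), not the toy image from this notebook, and the names `B`, `bb_ij`, and `c` are illustrative.

```python
import numpy as np

# Rows are b_1 .. b_4 from the definitions above (row index 0..3).
B = np.array([[1,  1,  1,  1],
              [1,  0, -1,  0],
              [0,  1,  0, -1],
              [1, -1,  1, -1]], dtype=float)
B *= np.array([1/2, 1/np.sqrt(2), 1/np.sqrt(2), 1/2])[:, None]

# Hypothetical toy image, made up for this sketch.
image = np.array([[ 0,  10,   0,  5],
                  [ 5, 100,  90,  0],
                  [ 0,  95, 100, 10],
                  [10,   0,   5,  0]], dtype=float)

c = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        bb_ij = np.outer(B[i], B[j])    # one 4x4 basis image
        # Flatten both 4x4 arrays to length 16, then take an ordinary dot product.
        c[i, j] = np.dot(image.flatten(), bb_ij.flatten())

print(np.round(c, 2))
```

`np.sum(image * bb_ij)` gives the same number, since `*` on NumPy arrays multiplies entry by entry. `image @ bb_ij` would be matrix multiplication, which is not what we want here.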

TO DO :

Now it's your turn.

Jim's answers to the TO DO :

... which except for roundoff errors is pretty good.
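For reference, the whole round trip (image to coefficients and back) can be sketched as follows. This is one possible version, not necessarily the code used above; the `image` is a hypothetical stand-in and the names `c` and `rebuilt` are illustrative.

```python
import numpy as np

# Rows are b_1 .. b_4 from the definitions above (row index 0..3).
B = np.array([[1,  1,  1,  1],
              [1,  0, -1,  0],
              [0,  1,  0, -1],
              [1, -1,  1, -1]], dtype=float)
B *= np.array([1/2, 1/np.sqrt(2), 1/np.sqrt(2), 1/2])[:, None]

# Hypothetical toy image, made up for this sketch.
image = np.array([[ 0,  10,   0,  5],
                  [ 5, 100,  90,  0],
                  [ 0,  95, 100, 10],
                  [10,   0,   5,  0]], dtype=float)

# Forward transform: one coefficient per basis image (elementwise multiply, then sum).
c = np.array([[np.sum(image * np.outer(B[i], B[j])) for j in range(4)]
              for i in range(4)])

# Inverse transform: add up coefficient times basis image over all 16 terms.
rebuilt = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        rebuilt += c[i, j] * np.outer(B[i], B[j])

# Largest difference between original and rebuilt pixel values.
max_error = np.max(np.abs(rebuilt - image))
print(max_error)
```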

Now let's look at the "blurry" version, which removes the highest frequency.
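A minimal sketch of one way to do that, assuming "removing the highest frequency" means dropping every term that involves $\hat{b}_4$ (index 3 in the code). Zeroing only the single coefficient `c[3, 3]` would be another reasonable reading. The `image` is a hypothetical stand-in, and `c_low` and `blurry` are illustrative names.

```python
import numpy as np

# Rows are b_1 .. b_4 from the definitions above (row index 0..3).
B = np.array([[1,  1,  1,  1],
              [1,  0, -1,  0],
              [0,  1,  0, -1],
              [1, -1,  1, -1]], dtype=float)
B *= np.array([1/2, 1/np.sqrt(2), 1/np.sqrt(2), 1/2])[:, None]

# Hypothetical toy image, made up for this sketch.
image = np.array([[ 0,  10,   0,  5],
                  [ 5, 100,  90,  0],
                  [ 0,  95, 100, 10],
                  [10,   0,   5,  0]], dtype=float)

# All 16 coefficients at once: c[i, j] = sum over r, col of B[i, r] * image[r, col] * B[j, col].
c = B @ image @ B.T

# Assumption: drop every coefficient whose row or column index is 3 (the b_4 terms).
c_low = c.copy()
c_low[3, :] = 0
c_low[:, 3] = 0

# Rebuild from the remaining coefficients: sum of c_low[i, j] * outer(B[i], B[j]).
blurry = B.T @ c_low @ B
print(np.round(blurry, 1))
```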

discussion

It's similar to the original : numbers of order 100 in the central four and numbers of (mostly) order 0 along the edge ... well, 25 isn't 0 but it is closer to 0 than 100. :)

We've lost some of the total intensity; not unexpected since we didn't renormalize.