Part 1: Image compression

Introduction

I will be studying the singular value decomposition (SVD) of these images:

The image on the left is of the assignment. Henceforth, it is pic_1.png. Because we are doing SVD for this assignment, I thought it would be appropriate to do SVD on the assignment. Textual data may also provide a challenge for SVD, because rendering text well is a nontrivial task.

The two images on the right come from my lab work on Se-Te nanostructures. These contain lamellar and circular patterns of different feature sizes. it will be interesting to see how their spectra compare to the text as well as how the reconstructions at various stages recapture the image quality. From left to right, these are pic_2.png and pic_3.png.

Program

In the section below, we do the following for each image:

  • Read it into a grayscale array
  • Compute the Frobenius norm of the matrix (to use for comparisons)
  • Perform a singular value decomposition (SVD)
  • Reconstruct the image from the largest $10^{k+2} %$ of the singular values for $k \in {-1, -2, -3}$
  • Compute the Frobenius distance of the approximation to the original
  • Save the reconstructed image

We also:

  • Create a plot of the SVD spectrum
  • Save computed norms into a data frame
import numpy as np
import pandas as pd
from PIL import Image, ImageOps
import matplotlib.pyplot as plt
%matplotlib inline
def name_files(pathname, ext, n=0, out=''):
    """Outputs a generator of numbered strings 'folder/name[_i][_out].ext'"""
    undr = lambda x, y: bool(x) * ('_' + str(y))
    for i in range(n):
        yield ''.join([pathname, undr(n, i+1), undr(out, out), '.', ext])

def svd_rc(u, s, vh, n):
    """Reconstruct matrix using n largest principal values of svd"""
    return u[:, :n] * s[:n] @ vh[:n, :]
%%capture plot

pic_path  = 'pictures/pic'
pic_files = name_files(pic_path, 'png', 3)

fig, ax = plt.subplots()

vals_og = { 
    'file' : [],
    'Nval' : [],
    'FroN' : [],
}

vals_rc = { 
    'file' : [],
    'expo' : [],
    'Nrec' : [],
    'FroD' : [],
}

for i, pic_file in enumerate(pic_files):
    # Convert image to grayscale array
    img = Image.open(pic_file)
    img = ImageOps.grayscale(img)
    # Python feature: variables/pointers are not of fixed type
    img = np.asarray(img)
    # SVD of image
    res = np.linalg.svd(img, full_matrices=False)
    # Save total number of singular values and Frobenius norm of original
    vals_og['file'].append(pic_file)
    vals_og['Nval'].append(res[1].size)
    vals_og['FroN'].append(np.linalg.norm(img))
    # Plot spectrum normalized by array size and maximal principal value
    ax.loglog(
        np.arange(res[1].size) / res[1].size, 
        res[1] / np.max(res[1]), 
        label=pic_file,
    )
    for k in range(-3, 0):
        out_file = ''.join([pic_path, '_', str(i+1), '_out_k=', str(k), '.png'])
        # Reconstruct image from SVD with 10 ** k % of principal values
        Npv = int(np.ceil((10 ** k) * res[1].size))
        rec = svd_rc(*res, Npv)
        # Save number of values used in reconstruction and Frobenius distance
        vals_rc['file'].append(pic_file)
        vals_rc['expo'].append(k)
        vals_rc['Nrec'].append(Npv)
        vals_rc['FroD'].append(np.linalg.norm(img - rec))
        # Save reconstruction to file
        rec = Image.fromarray(rec)
        rec = ImageOps.grayscale(rec)
        rec.save(out_file)

df_og = pd.DataFrame(vals_og)
df_rc = pd.DataFrame(vals_rc)
        
ax.set_ylim([1e-4, 1])
ax.set_title('Singular value spectrum')
ax.set_ylabel('Fraction of largest singular value')
ax.set_xlabel('Quantile of singular values')
ax.legend()
plt.show()

Results

Spectra

I would first like to look at some features of the original images.

df_og

file Nval FroN
0 pictures/pic_1.png 836 219990.308432
1 pictures/pic_2.png 1768 293350.212546
2 pictures/pic_3.png 1887 250397.226442

For each file, the Nval column the number singular of singular values used to decompose the original image, which is also the largest dimension of the image. Notably, pic_1.png has half as many singular values in the image as the other pictures because the image has half as many pixels in its largest dimension. Additionally, the FroN column computes the Frobenius norm of the image. On its own, the value doesn’t mean much, but we will use it to normalize the Frobenius distance of reconstructions to compare for different images.

df_rc['normFroD'] = df_rc.FroD / df_rc.file.replace(
    { e : df_og.FroN[i] for i, e in enumerate(df_og.file) }
)

The next thing I would like to study is the plot of the SVD spectra. Since very few images contain a great deal of high frequency data, I will use logarithmic scales on the plot to focus on the first, largest singular values. Additionally, I will normalize the values by diving the values on each axis by the largest in each array so that we can compare spectra.

plot()

png

Reconstructions

pic_1.png

Ordered from not compressed (left) to most compressed (right), the images are:

These reconstructions use the following parameters:

  • expo: The base-ten exponent representing the compression fraction
  • Nrec: The number of principal singular values used in the reconstruction
  • FroD: The Frobenius distance of the reconstruction from the original
  • normFroD: The value of FroD divided by FroN of the original image
df_rc[df_rc.file == 'pictures/pic_1.png']

file expo Nrec FroD normFroD
0 pictures/pic_1.png -3 1 32445.117391 0.147484
1 pictures/pic_1.png -2 9 26717.258393 0.121447
2 pictures/pic_1.png -1 84 12733.380540 0.057882
pic_2.png

Ordered from not compressed (left) to most compressed (right), the images are:

These reconstructions use the following parameters:

df_rc[df_rc.file == 'pictures/pic_2.png']

file expo Nrec FroD normFroD
3 pictures/pic_2.png -3 2 130700.350187 0.445544
4 pictures/pic_2.png -2 18 104763.350022 0.357127
5 pictures/pic_2.png -1 177 33092.120611 0.112808
pic_3.png

Ordered from not compressed (left) to most compressed (right), the images are:

These reconstructions use the following parameters:

df_rc[df_rc.file == 'pictures/pic_3.png']

file expo Nrec FroD normFroD
6 pictures/pic_3.png -3 2 130085.491407 0.519517
7 pictures/pic_3.png -2 19 101196.911866 0.404145
8 pictures/pic_3.png -1 189 32278.979956 0.128911

Discussion

Spectra

It is interesting that pic_1.png, which is very text based, has a spectrum that decays fastest (we might expect it to struggle with text). Across all the spectra, it is clear that due to the logarithmic scale, for all the images, the largest 10% of singular values are in the first thousandth of all the singular values. Since the images are less than 2000 pixels each in their largest dimension, this means that only one or two singular values dominate.

Then it appears there is a gradually sloping region until about between 1 and 10 % of all singular values, which would suggest an informative region, followed by a dip into another sloping region until after 10%, whereupon all the spectra dive down to very small magnitudes. So my choice of reconstruction based on 0.1%, 1% and 10% of all singular values seems to capture each of the regions of interest.

Notably, the spectra of pic_2.png and pic_3.png appear very similar, which seems reasonable because the images are of very similar quality and morphology.

Reconstructions

Starting with pic_1.png, we can see that the reconstruction with 1 singular value is extremely blurry, but at least it captures the lines of text in the image. In the reconstruction with 9 values, it is possible to discern blocks that look like words, but there is no chance of reading them as the features like serifs of the font are mangled by linear approximations. Lastly, the 10% reconstruction appears readable, though of very low quality. The page is grainy and hard to read for any extended period of time, but the text is there and not very different than a low-quality scan of a document, especially compared with a scanner on a cell phone.

In the reconstruction of pic_2.png with 2 singular values, the image looks like a quilt. Already with 1% of the values, the reconstruction has recover the “spaghetti and cheerios” morphology of the structure, though lacks the detail and accuracy of the final image. The 10% reconstruction has lower quality than the original, but is rather close to the original and would be acceptable.

In the reconstruction of pic_3.png with 2 singular values, we again have a quilt, but can see that the decomposition discerns the big black hole at the center of the image as well as the presence of the scale bar at the bottom of the photograph. We can continue to say similar things as the second image because the features of the images are quite similar. On a higher quality close-up of the image, it is interest to look at what happens inside the “black hole” at the center of the image. In the full image, it is nearly dark, but in all of the reconstructions, there is high-frequency noise that sort of imposes the ghostly shadow of the lamellar pattern in the dark area. Without the less emphasized singular values, it is impossible to remove the more subtle defects of the image.

It is also striking to compare the normalized Frobenius distance across the images. In general, the errors are nearly a factor of 2 smaller for pic_1.png than pic_2 or pic_3, which are comparable approximations at each level of reconstruction. Because I have normalized these errors, I suspect the deviation is not due to the fact the first pictures has far fewer pixels to approximate, but because the norm doesn’t care so much about the small details in the text. By comparison, some of the approximations that would be unacceptable for text are tolerable for the larger-scale patterns in the second and third pictures.