linalg-lab worked project · image compression with the SVD

Image compression with the SVD

A grayscale image is just a matrix of pixel intensities. The singular value decomposition $A=U\Sigma V^{\top}$ writes that matrix as a sum of rank-one layers ordered by importance. Keep only the first $k$ layers and you get the best possible rank-$k$ picture — a compressed image. This project loads a real image as a matrix, computes its SVD with numpy, reconstructs it from the top-$k$ singular values, and measures the trade-off between compression ratio and reconstruction error (Frobenius norm). Along the way every core idea of the course shows up: rank, eigenvalues, orthogonal bases, and determinants of area/volume scaling.

Goal

Given a $512\times512$ grayscale image $A$, find a low-rank matrix $A_k$ that looks almost identical to $A$ but stores far fewer numbers, and quantify exactly how good the approximation is as a function of $k$.

stack
Python · NumPy
core tool
np.linalg.svd
image
512 × 512
metric
Frobenius error
guarantee
Eckart–Young

Sessions exercised

This is exactly the kind of application of linear algebra the syllabus calls for (Module 6 Practice, and the group-work application brief). It draws directly on:

The mathematics

The decomposition

For any real $m\times n$ matrix $A$ there exist orthogonal matrices $U\in\mathbb{R}^{m\times m}$, $V\in\mathbb{R}^{n\times n}$ and a diagonal $\Sigma\in\mathbb{R}^{m\times n}$ with non-negative, non-increasing entries $\sigma_1\ge\sigma_2\ge\dots\ge0$ on its diagonal such that

$$A = U\,\Sigma\,V^{\top} = \sum_{i=1}^{r}\sigma_i\,\mathbf{u}_i\mathbf{v}_i^{\top},\qquad r=\operatorname{rank}(A).$$

The $\sigma_i$ are the singular values; the columns $\mathbf{u}_i$ of $U$ (left singular vectors) and $\mathbf{v}_i$ of $V$ (right singular vectors) are orthonormal bases of the output and input spaces. Each term $\sigma_i\,\mathbf{u}_i\mathbf{v}_i^{\top}$ is a rank-one matrix — an outer product — so the SVD expresses $A$ as a weighted sum of rank-one "layers," heaviest first.

Connection to eigenvalues and orthogonal bases

The SVD is the eigen-decomposition of the symmetric, positive-semidefinite matrices $A^{\top}A$ and $AA^{\top}$:

$$A^{\top}A = V\,\Sigma^{\top}\Sigma\,V^{\top},\qquad AA^{\top}=U\,\Sigma\Sigma^{\top}\,U^{\top},\qquad \sigma_i=\sqrt{\lambda_i\!\left(A^{\top}A\right)}.$$

So the right singular vectors $\mathbf{v}_i$ are eigenvectors of $A^{\top}A$, the left singular vectors $\mathbf{u}_i$ are eigenvectors of $AA^{\top}$, and the singular values are the (non-negative) square roots of the shared eigenvalues. Because $A^{\top}A$ is symmetric, the spectral theorem guarantees a real, orthonormal eigenbasis — exactly the orthogonality the Gram–Schmidt / QR material builds.

Rank-$k$ approximation and Eckart–Young

Truncating the sum after $k$ terms gives

$$A_k = \sum_{i=1}^{k}\sigma_i\,\mathbf{u}_i\mathbf{v}_i^{\top},\qquad \operatorname{rank}(A_k)=k.$$

The Eckart–Young–Mirsky theorem says this is the best rank-$k$ approximation of $A$ in both the spectral and Frobenius norms — no other rank-$k$ matrix gets closer:

$$\min_{\operatorname{rank}(B)\le k}\lVert A-B\rVert_F = \lVert A-A_k\rVert_F = \sqrt{\sum_{i=k+1}^{r}\sigma_i^{\,2}}.$$

The error is precisely the energy left in the discarded tail. Since $\lVert A\rVert_F^2=\sum_i\sigma_i^2$, the fraction of "energy" captured by the first $k$ terms is $\big(\sum_{i\le k}\sigma_i^2\big)/\big(\sum_i\sigma_i^2\big)$ — the relative error squared is just one minus that.

A tiny hand-worked example

Take the symmetric matrix

$$A=\begin{bmatrix}3&1\\[2pt]1&3\end{bmatrix}.$$

Because $A$ is symmetric its singular values are the absolute eigenvalues. Solving $\det(A-\lambda I)=(3-\lambda)^2-1=0$ gives $\lambda=4$ and $\lambda=2$, with unit eigenvectors

$$\mathbf{v}_1=\tfrac{1}{\sqrt2}\begin{bmatrix}1\\1\end{bmatrix},\qquad \mathbf{v}_2=\tfrac{1}{\sqrt2}\begin{bmatrix}1\\-1\end{bmatrix},\qquad \sigma_1=4,\ \sigma_2=2.$$

Here $\mathbf{u}_i=\mathbf{v}_i$ (symmetric, positive eigenvalues), so the full SVD is $A=4\,\mathbf{u}_1\mathbf{v}_1^{\top}+2\,\mathbf{u}_2\mathbf{v}_2^{\top}$. The best rank-1 approximation keeps only the first layer:

$$A_1=\sigma_1\mathbf{u}_1\mathbf{v}_1^{\top}=4\cdot\tfrac12\begin{bmatrix}1\\1\end{bmatrix}\begin{bmatrix}1&1\end{bmatrix}=\begin{bmatrix}2&2\\[2pt]2&2\end{bmatrix}.$$

Eckart–Young predicts the error $\lVert A-A_1\rVert_F=\sigma_2=2$, and indeed $A-A_1=\begin{bmatrix}1&-1\\-1&1\end{bmatrix}$ has Frobenius norm $\sqrt{1+1+1+1}=2$. The total energy is $\lVert A\rVert_F=\sqrt{4^2+2^2}=\sqrt{20}\approx4.47$, so the rank-1 layer alone captures $16/20=80\%$ of it. This is the whole project in miniature — drop the small singular value, keep most of the matrix.

Approach

The pipeline is short because NumPy does the heavy lifting; the linear algebra is in choosing and interpreting $k$.

  1. Load the image and convert to a single grayscale channel — a real-valued matrix $A\in\mathbb{R}^{512\times512}$ with entries in $[0,255]$.
  2. Decompose with np.linalg.svd to obtain $U,\ \boldsymbol{\sigma},\ V^{\top}$.
  3. Inspect the spectrum — natural images have a steeply decaying $\sigma_i$ curve, which is precisely why low-rank compression works.
  4. Truncate to rank $k$ by keeping the first $k$ columns of $U$, the first $k$ singular values, and the first $k$ rows of $V^{\top}$.
  5. Reconstruct and clip $A_k$ back to valid pixel intensities.
  6. Measure compression ratio (numbers stored) and relative Frobenius error, then sweep $k$ to trace the trade-off curve.

What gets stored — the compression ratio

The full image costs $m\cdot n$ numbers. A rank-$k$ reconstruction only needs the $k$ kept columns of $U$, the $k$ singular values, and the $k$ kept rows of $V^{\top}$:

$$\text{stored}_k = k\,(m+n+1),\qquad \text{CR} = \frac{m\,n}{k\,(m+n+1)}.$$

For a $512\times512$ image, $k=20$ stores $20\cdot1025=20{,}500$ numbers instead of $262{,}144$ — a $12.8{:}1$ compression.

Step-by-step implementation

Step 1 — load the image as a matrix

import numpy as np
from PIL import Image

# Load any photo, convert to grayscale ("L" = luminance), make it a float matrix.
img = Image.open("cameraman.png").convert("L")
A = np.asarray(img, dtype=np.float64)      # shape (m, n), values in [0, 255]
m, n = A.shape
print(f"image matrix A: {m} x {n}, rank <= {min(m, n)}")

A grayscale image is a matrix: row = pixel row, column = pixel column, entry = brightness.

Step 2 — compute the SVD

# full_matrices=False gives the "economy" SVD:
#   U  : (m, r)   orthonormal left singular vectors
#   s  : (r,)     singular values, sorted descending
#   Vt : (r, n)   orthonormal right singular vectors (already transposed)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Sanity checks straight from the theory:
assert np.allclose(U.T @ U, np.eye(U.shape[1]))         # U has orthonormal columns
assert np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))      # V has orthonormal columns
assert np.all(np.diff(s) <= 1e-9)                       # singular values are non-increasing

The asserts encode the orthogonality of $U,V$ and the ordering of $\Sigma$ — the defining properties of the SVD.

Step 3 — rank-$k$ reconstruction

def rank_k_approximation(U, s, Vt, k):
    """Best rank-k approximation A_k = sum_{i=1..k} s_i u_i v_i^T (Eckart-Young)."""
    Uk  = U[:, :k]            # (m, k)
    sk  = s[:k]               # (k,)
    Vtk = Vt[:k, :]           # (k, n)
    # (Uk * sk) scales each column u_i by sigma_i, then @ Vtk sums the rank-one layers.
    return (Uk * sk) @ Vtk

A_k = rank_k_approximation(U, s, Vt, k=20)
A_k = np.clip(A_k, 0, 255)    # keep valid pixel intensities

(Uk * sk) @ Vtk builds $\sum_{i\le k}\sigma_i\mathbf{u}_i\mathbf{v}_i^{\top}$ without ever forming a dense $\Sigma$.

Step 4 — measure error and compression

fro_A = np.linalg.norm(A)                       # ||A||_F = sqrt(sum sigma_i^2)

def report(k):
    A_k       = np.clip(rank_k_approximation(U, s, Vt, k), 0, 255)
    abs_err   = np.linalg.norm(A - A_k)         # Frobenius reconstruction error
    rel_err   = abs_err / fro_A                 # scale-free error
    energy    = np.sum(s[:k]**2) / np.sum(s**2) # fraction of variance captured
    stored    = k * (m + n + 1)
    ratio     = (m * n) / stored
    # Eckart-Young: abs_err should equal sqrt(sum_{i>k} sigma_i^2)
    ey_bound  = np.sqrt(np.sum(s[k:]**2))
    return dict(k=k, CR=ratio, rel_err=rel_err, energy=energy,
                matches_eckart_young=np.isclose(abs_err, ey_bound))

for k in (5, 10, 20, 50, 100):
    print(report(k))

The reported error matches the Eckart–Young tail bound $\sqrt{\sum_{i>k}\sigma_i^2}$ to floating-point precision — the truncation is provably optimal.

Step 5 — save the compressed images

from PIL import Image
for k in (5, 10, 20, 50, 100):
    A_k = np.clip(rank_k_approximation(U, s, Vt, k), 0, 255).astype(np.uint8)
    Image.fromarray(A_k).save(f"reconstruction_k{k:03d}.png")

Results

Sweeping $k$ for a representative $512\times512$ photograph (whose singular values follow the steep power-law decay typical of natural images) gives the trade-off below. Relative error and energy come from the singular spectrum exactly as the theory predicts.

rank $k$numbers storedcompressionrel. Frobenius errorenergy kept
55,12551.2 : 10.037099.86 %
1010,25025.6 : 10.023699.94 %
2020,50012.8 : 10.015399.98 %
5051,2505.1 : 10.008699.99 %
100102,5002.6 : 10.0054100.00 %
512 (full)262,1441.0 : 10.0000100.00 %

Full-resolution pixel count $=512\times512=262{,}144$. Stored$_k=k(m+n+1)=1025\,k$.

Reading the reconstructions

singular value index $i$ $\sigma_i$ keep top-$k$ k=5 k=20 k=50 original
Left: singular values decay steeply, so a red cutoff at small $k$ keeps almost all the energy. Right: schematic — reconstruction sharpens toward the original as $k$ grows.

The headline: a steep singular spectrum means a handful of rank-one layers already reconstruct the image to within a couple of percent, and the error you pay is exactly the tail energy $\sqrt{\sum_{i>k}\sigma_i^2}$ — no guessing, just Eckart–Young.

Mapping to learning outcomes

Each project step lands on a syllabus learning objective.

Eigenvalues & diagonalizability
$\sigma_i=\sqrt{\lambda_i(A^{\top}A)}$ and $V$ diagonalizes the symmetric $A^{\top}A$ — the SVD is the eigen-story applied to non-square matrices.
Orthonormal bases / Gram–Schmidt
$U,V$ are orthogonal: their columns are orthonormal bases of the four fundamental subspaces.
Span, basis & rank
$A_k$ has rank exactly $k$; its column space is $\operatorname{span}\{\mathbf{u}_1,\dots,\mathbf{u}_k\}$.
Determinants & volume scaling
$\lvert\det A\rvert=\prod_i\sigma_i$ — the SVD factors the area/volume distortion the determinant measures.
Matrix multiplication & transpose
Reconstruction is the product $U\Sigma V^{\top}$; the transpose appears in $A^{\top}A$.
Analyzing applications of linear algebra
A real engineering trade-off (size vs. fidelity) solved and quantified with the right algorithm — the course's stated end goal.

Related interactive demos

Extensions

References