Hankel Matrix

Also known as: Hankel structure, trajectory matrix, embedding matrix

TL;DR

A matrix whose anti-diagonals are constant: each entry depends only on the sum of its indices, $H_{ij} = h_{i+j}$. The natural data structure for turning a 1D time series into a 2D matrix you can apply SVD to.

A Hankel matrix has constant anti-diagonals. Every entry depends only on the sum of its row and column indices:

$$H_{ij} = h_{i+j}$$

The structure encodes a single sequence: each value of the sequence fills one entire anti-diagonal, so an $m \times n$ Hankel matrix is fully determined by $m + n - 1$ numbers.

A Hankel matrix turns a 1D sequence into a 2D matrix by sliding a window over it. Each column is a shifted view of the original signal: column $j$ holds the window starting at position $j$.

The construction

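Given a length-$N$ signal $x_0, \dots, x_{N-1}$ and a window length $L$, the trajectory matrix is an $L \times K$ Hankel matrix (with $K = N - L + 1$) whose columns are successive windows of the signal:

$$
X = \begin{bmatrix}
x_0 & x_1 & \cdots & x_{K-1} \\
x_1 & x_2 & \cdots & x_K \\
\vdots & \vdots & \ddots & \vdots \\
x_{L-1} & x_L & \cdots & x_{N-1}
\end{bmatrix},
\qquad X_{ij} = x_{i+j}.
$$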

This is the data structure that bridges 1D time-series analysis and 2D linear algebra. Once you have a matrix, you can apply SVD, PCA, factorization, low-rank approximation — all the tools that need a matrix input.
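The construction can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `trajectory_matrix` is hypothetical, and it assumes zero-based indexing so that entry `(i, j)` holds sample `i + j` of the signal.

```python
import numpy as np


def trajectory_matrix(x, L):
    """Return the L x K trajectory (Hankel) matrix of a 1D signal, K = N - L + 1."""
    x = np.asarray(x, dtype=float)
    N = x.size
    if not 1 <= L <= N:
        raise ValueError("window length L must satisfy 1 <= L <= N")
    K = N - L + 1
    i = np.arange(L)[:, None]   # row indices as a column vector
    j = np.arange(K)[None, :]   # column indices as a row vector
    return x[i + j]             # entry (i, j) is x[i + j]


x = np.arange(1, 7)             # signal 1, 2, 3, 4, 5, 6 (N = 6)
X = trajectory_matrix(x, 3)     # L = 3, so K = 4
print(X)
# Rows are 1 2 3 4 / 2 3 4 5 / 3 4 5 6: every anti-diagonal is constant,
# and each column is a length-3 window of the signal.
```

Fancy indexing with `i + j` makes a copy, which is fine at this scale. For long signals a strided view would avoid the copy, at the cost of a read-only result.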

Why this matters in ML and signal processing

Hankel matrices show up wherever someone wants to apply matrix methods to a sequence:

  • Singular Spectrum Analysis (SSA): builds the trajectory Hankel matrix, runs SVD, separates trend / oscillations / noise from the singular value spectrum, then reconstructs each component via diagonal averaging. The Hankel structure is what makes the reconstruction step well-defined: you average each anti-diagonal back into a 1D signal.
  • System identification: the Ho-Kalman algorithm recovers a linear state-space model from impulse response data by SVD on a Hankel matrix of measurements. Same trick: 1D → 2D via Hankel, then linear algebra.
  • Matrix completion for time series: recover missing values by completing a Hankel-structured matrix rather than an arbitrary one, exploiting the shared-entry constraint as an inductive bias.

The rank structure

The most useful property of a Hankel matrix built from a signal is that rank reveals structure:

  • A constant signal → rank 1.
  • A single pure sinusoid → rank 2.
  • A sum of $k$ pure sinusoids with distinct frequencies → rank $2k$ (each gives two singular values, for cosine + sine basis).
  • A linear trend → rank 2 (constant + linear basis).
  • White noise → full rank, $\min(L, K)$, with singular values that spread roughly uniformly.

This is why looking at the singular value spectrum of a trajectory Hankel matrix separates signal from noise. Large singular values capture low-rank deterministic components (trends, oscillations); the long tail of small singular values is the noise. Choosing a rank-$r$ truncation amounts to picking which structural components to keep; see Singular Spectrum Analysis (SSA) for the full machinery.
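The rank claims above are easy to check numerically. The sketch below is illustrative only: the helper `trajectory_matrix`, the signal length, the window length, the frequencies, the seed, and the relative tolerance used for the numerical rank are all arbitrary choices made for this example.

```python
import numpy as np


def trajectory_matrix(x, L):
    """L x K Hankel matrix with entry (i, j) equal to x[i + j]."""
    x = np.asarray(x, dtype=float)
    K = x.size - L + 1
    return x[np.arange(L)[:, None] + np.arange(K)[None, :]]


N, L = 200, 40
t = np.arange(N)
rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily

signals = {
    "constant": np.full(N, 3.0),
    "one sinusoid": np.sin(0.3 * t),
    "two sinusoids": np.sin(0.3 * t) + 0.5 * np.cos(1.1 * t + 0.4),
    "linear trend": 2.0 + 0.05 * t,
    "white noise": rng.standard_normal(N),
}

ranks = {}
for name, x in signals.items():
    X = trajectory_matrix(x, L)
    # Numerical rank: count singular values above 1e-8 times the largest one.
    tol = 1e-8 * np.linalg.norm(X, 2)
    ranks[name] = int(np.linalg.matrix_rank(X, tol=tol))
    print(f"{name:>14}: rank {ranks[name]}")
```

With these settings the reported ranks should be 1, 2, 4, 2 and 40 (the full rank for a 40 × 161 matrix). With real, noisy data no singular value is exactly zero, so in practice you look for a gap in the spectrum rather than an exact rank.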

Diagonal averaging — going back to a signal

After truncating the SVD to $r$ components, the result is a rank-$r$ matrix $\hat{X}$ that’s no longer exactly Hankel: different positions on the same anti-diagonal will hold slightly different values. To recover a 1D signal, you average each anti-diagonal:

$$
\hat{x}_k = \frac{1}{|A_k|} \sum_{(i,j) \in A_k} \hat{X}_{ij},
\qquad A_k = \{(i, j) : i + j = k\},
$$

where $A_k$ is the $k$-th anti-diagonal (indices counted from zero). This is the orthogonal projection of a general matrix onto the linear subspace of Hankel matrices: among all Hankel matrices, the averaged one is closest to $\hat{X}$ in Frobenius norm, which gives the step a clean mathematical justification.
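The anti-diagonal averaging described above can be sketched as follows. The function name `diagonal_average` is hypothetical, and the example assumes zero-based indices, so entry `(i, j)` belongs to anti-diagonal `i + j`.

```python
import numpy as np


def diagonal_average(M):
    """Average each anti-diagonal of an L x K matrix into a length L + K - 1 signal."""
    M = np.asarray(M, dtype=float)
    L, K = M.shape
    sums = np.zeros(L + K - 1)
    counts = np.zeros(L + K - 1)
    for i in range(L):
        # Row i contributes its K entries to anti-diagonals i, i + 1, ..., i + K - 1.
        sums[i:i + K] += M[i]
        counts[i:i + K] += 1
    return sums / counts


M = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 5.0]])   # not Hankel: anti-diagonal 2 holds both 3 and 4
x_hat = diagonal_average(M)
print(x_hat)                      # values 1, 2, 3.5, 5
```

Applied to a matrix that is already Hankel, the function returns the generating sequence unchanged, which is what you would expect from a projection.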

The full Hankel → SVD → truncate → diagonal-average pipeline is the spine of every SSA-style decomposition. Hankel structure isn’t decorative — it’s the structural assumption that makes the whole approach work.
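A compact sketch of the whole pipeline, applied to a noisy sinusoid, might look like this. It is a toy illustration rather than a full SSA implementation: the function names, the window length `L=60`, the noise level, and the seed are assumptions made for the example, and the rank `r=2` is chosen because the clean signal is a single sinusoid.

```python
import numpy as np


def trajectory_matrix(x, L):
    """L x K Hankel matrix with entry (i, j) equal to x[i + j]."""
    x = np.asarray(x, dtype=float)
    K = x.size - L + 1
    return x[np.arange(L)[:, None] + np.arange(K)[None, :]]


def diagonal_average(M):
    """Mean of each anti-diagonal, computed with bincount over i + j."""
    L, K = M.shape
    idx = (np.arange(L)[:, None] + np.arange(K)[None, :]).ravel()
    return np.bincount(idx, weights=M.ravel()) / np.bincount(idx)


def hankel_svd_denoise(x, L, r):
    """Hankel -> SVD -> keep r components -> diagonal-average."""
    X = trajectory_matrix(x, L)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_r = (U[:, :r] * s[:r]) @ Vt[:r]   # rank-r truncation of the SVD
    return diagonal_average(X_r)


rng = np.random.default_rng(0)          # fixed seed, chosen arbitrarily
t = np.arange(300)
clean = np.sin(0.2 * t)
noisy = clean + 0.3 * rng.standard_normal(t.size)

denoised = hankel_svd_denoise(noisy, L=60, r=2)
err_noisy = np.sqrt(np.mean((noisy - clean) ** 2))
err_denoised = np.sqrt(np.mean((denoised - clean) ** 2))
print(f"RMS error before: {err_noisy:.3f}, after: {err_denoised:.3f}")
```

The reconstruction error should drop noticeably relative to the noisy input. Full SSA goes further by grouping singular components and reconstructing each group separately, which this sketch does not attempt.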

Go further

Why anti-diagonals constant, not main-diagonals?

Constant main diagonals define a Toeplitz matrix, $T_{ij} = t_{i-j}$, which is the matrix structure for convolution. Constant anti-diagonals define Hankel, $H_{ij} = h_{i+j}$: each anti-diagonal repeats a single value of the original sequence. The Hankel structure is what you get when you slide a window across a 1D signal (every column is the previous one shifted by one sample), making Hankel the natural matrix for time-series embedding.
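The two structures are mirror images of each other, which a small NumPy example can show: reversing the row order of a Hankel matrix yields a Toeplitz matrix. The sequence values and the 4 × 4 size are arbitrary choices for illustration.

```python
import numpy as np

h = np.arange(10, 17)          # sequence 10, 11, ..., 16
i = np.arange(4)[:, None]
j = np.arange(4)[None, :]

H = h[i + j]                   # Hankel: entry (i, j) depends only on i + j
T = H[::-1]                    # rows reversed: entry (i, j) is h[(3 - i) + j],
                               # which depends only on j - i, so T is Toeplitz

print(H)
print(T)

# Every diagonal of T holds a single repeated value.
toeplitz_ok = all(np.unique(np.diagonal(T, k)).size == 1 for k in range(-3, 4))
print(toeplitz_ok)
```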

What's the rank of a Hankel matrix built from a signal?

For a length-$N$ signal embedded into an $L \times K$ Hankel matrix (with $K = N - L + 1$), the rank is bounded by $\min(L, K)$. But the effective rank reveals signal structure: a sum of $k$ pure sinusoids gives a Hankel matrix of rank $2k$; a pure linear trend has rank $2$; white noise spreads energy across all singular values. Looking at the singular value spectrum is what makes SSA (and adjacent methods) useful for separating signal from noise.
