Also known as: Hankel structure, trajectory matrix, embedding matrix
TL;DR
A matrix whose anti-diagonals are constant: each entry depends only on the sum of its indices, $H_{ij} = x_{i+j}$ for some underlying sequence $x$. The natural data structure for turning a 1D time series into a 2D matrix you can apply SVD to.
A Hankel matrix is a matrix with constant anti-diagonals: every entry depends only on the sum of its row and column indices,

$$H_{ij} = x_{i+j},$$

for some sequence $x_0, x_1, \dots$ (indices start at zero here).
The structure encodes a single sequence: each value of the sequence fills one entire anti-diagonal, so the whole matrix is parameterized by that sequence.
A Hankel matrix turns a 1D sequence into a 2D matrix by sliding a window over it. Each column is a shifted view of the original signal: column $j$ holds the window starting at position $j$.
The construction
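Given a length-$N$ signal $x_0, \dots, x_{N-1}$ and a window length $L$, the trajectory matrix is an $L \times K$ Hankel matrix (with $K = N - L + 1$) whose columns are successive windows of the signal:

$$
H = \begin{pmatrix}
x_0 & x_1 & \cdots & x_{K-1} \\
x_1 & x_2 & \cdots & x_K \\
\vdots & \vdots & \ddots & \vdots \\
x_{L-1} & x_L & \cdots & x_{N-1}
\end{pmatrix}
$$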
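The construction can be sketched in a few lines of NumPy. This is a minimal illustration, assuming zero-based indexing so that entry $(i, j)$ is $x_{i+j}$; the function name `trajectory_matrix` and the toy signal are made up for this example.

```python
import numpy as np

def trajectory_matrix(x, L):
    """Return the L x K Hankel trajectory matrix of x, with K = len(x) - L + 1."""
    x = np.asarray(x)
    N = x.shape[0]
    if not 1 <= L <= N:
        raise ValueError("window length L must satisfy 1 <= L <= len(x)")
    K = N - L + 1
    # Entry (i, j) is x[i + j]: broadcasting a column of row offsets against
    # a row of column offsets produces the full grid of index sums.
    idx = np.arange(L)[:, None] + np.arange(K)[None, :]
    return x[idx]

x = np.arange(1, 8)          # toy signal 1, 2, ..., 7 (N = 7)
H = trajectory_matrix(x, 3)  # L = 3, so K = 5
print(H)
# [[1 2 3 4 5]
#  [2 3 4 5 6]
#  [3 4 5 6 7]]
```

Column `j` of `H` is the window `x[j:j+3]`, and reading along any anti-diagonal gives a single repeated value.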
This is the data structure that bridges 1D time-series analysis and 2D linear algebra. Once you have a matrix, you can apply SVD, PCA, factorization, low-rank approximation — all the tools that need a matrix input.
Why this matters in ML and signal processing
Hankel matrices show up wherever someone wants to apply matrix methods to a sequence:
Singular Spectrum Analysis (SSA) — builds the trajectory Hankel matrix, runs SVD, separates trend / oscillations / noise from the singular value spectrum, then reconstructs each component via diagonal averaging. The Hankel structure is what makes the reconstruction step well-defined: you average each anti-diagonal back into a 1D signal.
System identification — the Ho-Kalman algorithm recovers a linear state-space model from impulse response data by SVD on a Hankel matrix of measurements. Same trick: 1D → 2D via Hankel, then linear algebra.
Matrix completion for time series: imputes missing values under a Hankel constraint rather than in an arbitrary matrix, exploiting the shared-entry constraint as an inductive bias.
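To make the system-identification item concrete, the sketch below shows only the first step of that idea, not the full Ho-Kalman algorithm: the Hankel matrix of an impulse response has rank equal to the state dimension of the underlying system, provided the system is controllable and observable. The matrices `A`, `B`, `C` are hypothetical values chosen for illustration, and the relative threshold `1e-10` is an arbitrary choice.

```python
import numpy as np

# Hypothetical 2-state, single-input single-output system (illustrative values).
A = np.array([[0.9, 0.2],
              [0.0, 0.5]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Impulse response samples h_k = C A^k B for k = 0, ..., N - 1.
N = 20
h = np.empty(N)
state = B.copy()
for k in range(N):
    h[k] = (C @ state).item()
    state = A @ state

# Hankel matrix of the impulse response: H[i, j] = h[i + j].
L = 8
K = N - L + 1
H = h[np.arange(L)[:, None] + np.arange(K)[None, :]]

# Count singular values above a relative threshold to estimate the rank.
s = np.linalg.svd(H, compute_uv=False)
order = int(np.sum(s > 1e-10 * s[0]))
print("estimated state dimension:", order)
```

With noisy measurements the trailing singular values are no longer near zero, so the order has to be picked from a gap in the spectrum rather than from an exact rank.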
The rank structure
The most useful property of a Hankel matrix built from a signal is that rank reveals structure:
A constant signal → rank 1.
A single pure sinusoid → rank 2.
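A sum of $m$ pure sinusoids with distinct frequencies → rank $2m$ (each contributes two singular values, one for its cosine and one for its sine basis component), provided the matrix has at least $2m$ rows and columns.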
A linear trend → rank 2 (constant + linear basis).
White noise → full rank, $\min(L, K)$ for an $L \times K$ matrix, with energy spread fairly evenly across all singular values rather than concentrated in a few.
This is why looking at the singular value spectrum of a trajectory Hankel matrix separates signal from noise. Large singular values capture low-rank deterministic components (trends, oscillations); the long tail of small singular values is the noise. Choosing a rank-$r$ truncation amounts to picking which structural components to keep; see SSA for the full machinery.
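The rank claims above are easy to check numerically. The sketch below builds trajectory matrices for a few synthetic signals and counts singular values above a relative threshold. The signal parameters, the window length `L = 12`, and the threshold `1e-8` are arbitrary choices for illustration, and `numerical_rank` is a helper defined here, not a library function.

```python
import numpy as np

def trajectory_matrix(x, L):
    # L x K Hankel matrix with entry (i, j) equal to x[i + j].
    x = np.asarray(x, dtype=float)
    K = x.shape[0] - L + 1
    return x[np.arange(L)[:, None] + np.arange(K)[None, :]]

def numerical_rank(H, rel_tol=1e-8):
    # Number of singular values above rel_tol times the largest one.
    s = np.linalg.svd(H, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

N, L = 60, 12
t = np.arange(N)
rng = np.random.default_rng(0)

signals = {
    "constant": np.full(N, 3.0),
    "sinusoid": np.sin(2 * np.pi * t / 9 + 0.4),
    "two sinusoids": np.sin(2 * np.pi * t / 9) + 0.5 * np.cos(2 * np.pi * t / 4.3),
    "linear trend": 1.0 + 0.25 * t,
    "white noise": rng.standard_normal(N),
}

ranks = {name: numerical_rank(trajectory_matrix(x, L)) for name, x in signals.items()}
for name, r in ranks.items():
    print(f"{name:>14}: rank {r}")
```

The deterministic signals should report ranks 1, 2, 4 and 2, while the noise signal fills all 12 available dimensions.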
Diagonal averaging — going back to a signal
After truncating the SVD to $r$ components, the result is a rank-$r$ matrix $\hat{H}$ that is no longer exactly Hankel: different positions on the same anti-diagonal will hold slightly different values. To recover a 1D signal, you average each anti-diagonal:

$$\tilde{x}_k = \frac{1}{|A_k|} \sum_{(i,j) \in A_k} \hat{H}_{ij}, \qquad A_k = \{(i, j) : i + j = k\},$$

where $A_k$ is the $k$-th anti-diagonal. This is the orthogonal projection of a general matrix onto the linear subspace of Hankel matrices: it yields the Hankel matrix closest to $\hat{H}$ in Frobenius norm, if you want a clean mathematical justification.
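Diagonal averaging itself is short to write down. The following is a minimal sketch, assuming zero-based indices so that anti-diagonal `k` collects the entries with `i + j == k`; the function name `diagonal_average` is chosen for this example.

```python
import numpy as np

def diagonal_average(M):
    """Average each anti-diagonal of an L x K matrix into a length L + K - 1 signal."""
    M = np.asarray(M, dtype=float)
    L, K = M.shape
    # Reversing the row order turns anti-diagonals into ordinary diagonals.
    flipped = M[::-1, :]
    # Diagonal offset k - (L - 1) of the flipped matrix holds the entries with i + j = k.
    return np.array([flipped.diagonal(k - (L - 1)).mean() for k in range(L + K - 1)])

# An exactly Hankel matrix maps back to the sequence that generated it.
exact = np.array([[1, 2, 3],
                  [2, 3, 4]])
print(diagonal_average(exact))      # [1. 2. 3. 4.]

# A perturbed matrix: the entries 2 and 4 on anti-diagonal k = 1 average to 3.
perturbed = np.array([[1, 2, 3],
                      [4, 3, 4]])
print(diagonal_average(perturbed))  # [1. 3. 3. 4.]
```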
The full Hankel → SVD → truncate → diagonal-average pipeline is the spine of every SSA-style decomposition. Hankel structure isn’t decorative — it’s the structural assumption that makes the whole approach work.
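Put together, the pipeline fits in one short function. This is an illustrative sketch rather than a full SSA implementation: it keeps the leading `r` singular components without any grouping step, and the test signal, the noise level, and the parameters `L = 50` and `r = 2` are assumptions made for the example.

```python
import numpy as np

def ssa_lowrank(x, L, r):
    """Hankel embed -> SVD -> keep r components -> diagonal-average back to 1D."""
    x = np.asarray(x, dtype=float)
    N = x.shape[0]
    K = N - L + 1
    idx = np.arange(L)[:, None] + np.arange(K)[None, :]  # idx[i, j] = i + j
    H = x[idx]                                           # trajectory matrix
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    H_r = (U[:, :r] * s[:r]) @ Vt[:r]                    # rank-r truncation
    # Diagonal averaging: sum the entries sharing i + j, then divide by their count.
    sums = np.bincount(idx.ravel(), weights=H_r.ravel(), minlength=N)
    counts = np.bincount(idx.ravel(), minlength=N)
    return sums / counts

rng = np.random.default_rng(1)
t = np.arange(200)
clean = np.sin(2 * np.pi * t / 25)
noisy = clean + 0.3 * rng.standard_normal(t.size)
denoised = ssa_lowrank(noisy, L=50, r=2)  # a single sinusoid has rank 2

err_noisy = np.sqrt(np.mean((noisy - clean) ** 2))
err_denoised = np.sqrt(np.mean((denoised - clean) ** 2))
print(f"RMSE before: {err_noisy:.3f}, after: {err_denoised:.3f}")
```

On a noise-free sinusoid, `ssa_lowrank(clean, 50, 2)` reproduces the input up to floating-point error, since its trajectory matrix already has rank 2.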
Go further
Why anti-diagonals constant, not main-diagonals?
Constant main diagonals define a Toeplitz matrix, $T_{ij} = t_{i-j}$; that's the matrix structure for convolution. Constant anti-diagonals define Hankel: each anti-diagonal repeats a single value of the original sequence. The Hankel structure is what you get when you slide a window across a 1D signal, so that every column is the previous one shifted by one sample, which makes Hankel the natural matrix for time-series embedding.
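The two structures are mirror images of each other: reversing the column order of a Hankel matrix yields a Toeplitz matrix. A small check, using a toy signal chosen for illustration:

```python
import numpy as np

x = np.arange(1, 8)                                  # toy signal 1, 2, ..., 7
idx = np.arange(3)[:, None] + np.arange(5)[None, :]
H = x[idx]                                           # Hankel: H[i, j] = x[i + j]
T = H[:, ::-1]                                       # reverse the column order

print(T)
# [[5 4 3 2 1]
#  [6 5 4 3 2]
#  [7 6 5 4 3]]

# Hankel: moving down-left along an anti-diagonal leaves the value unchanged.
hankel_ok = all(H[i, j] == H[i + 1, j - 1] for i in range(2) for j in range(1, 5))
# Toeplitz: moving down-right along a main diagonal leaves the value unchanged.
toeplitz_ok = all(T[i, j] == T[i + 1, j + 1] for i in range(2) for j in range(4))
print(hankel_ok, toeplitz_ok)                        # True True
```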
What's the rank of a Hankel matrix built from a signal?
For a length-$N$ signal embedded into an $L \times K$ Hankel matrix (with $K = N - L + 1$), the rank is bounded by $\min(L, K)$. But the effective rank reveals signal structure: a sum of $m$ pure sinusoids gives a Hankel matrix of rank $2m$; a pure linear trend has rank $2$; white noise spreads energy across all singular values. Looking at the singular value spectrum is what makes SSA (and adjacent methods) useful for separating signal from noise.