Convex Optimization 2026, 22 - Regularized and Robust Approximation
Regularized approximation aims to find a solution vector x that is small and makes the residual Ax - b small as well. It has two objectives: ||Ax - b|| and ||x||, each of which might have its own norm. The trade off between the two objectives is the difficult component. It's optimal trade-off curve can be derived from one endpoint of the optimal trade-off curve between the problems. The minimum value of ||x|| is zero and is achieved only for x = 0. The residual evaluates to ||b||. The other endpoint requires that the norm of the set of minimizers of ||Ax - b|| = C be Pareto optimal (6.3.1). Regularization scalarizes this bi-criterion problem. One implementation minimizes the weighted sum of the objectives, minimizing ||Ax-b|| + γ||x||. Alternatively, for Euclidean norms, the weighted sum of squared norms can be taken instead (6.3.2.0). Regularization can be used when the matrix A is square and the goal is to solve Ax = b. Most common is TIkhonov regularization, minimizing the square norm. It has an analytical solution x = (ATA + δI)-1Atb (6.3.2.2).
If regularizing with a ||Dx|| instead of an ||x||, then D must be an approximate differentiation or second-order differentiation operator. The Tikhonov regularized problem minimizes ||Ax-b||22 + δ||Δx||22 with Δ the Toeplitz matrix. δ controls the amount of regularization, which plots the optimal fit against its smoothness. Several regularization terms can be introduced, for example an additional η||x||22, which controls the curve's size (6.3.2.3).
Under the l1-norm, the problem can be used as a heuristic toward a sparse solution, in which the residual is measured with the Euclidean norm. Varying its parameter sweeps the optimal trade-off curve between the trivial solutions (6.3.2.4).
A special case of the bi-criterion approximation problem is the reconstruction problem, which starts with a signal x ∈ ℝn, the coefficients xi correspond to the value of some function of time at evenly spaced points. The signal can be corrupted by an additive noise v through xcor = x + v. Assuming v is small, unknown and rapidly varying, the signal can be reconstructed through an estimate of the original signal. The simplest version of this is Quadratic smoothing, using the bi-diagonal matrix D through minimizing ||x' - xcor||22 + δ||Dx'||22 (6.3.3.1).