Convex Optimization 2026, 27 - Nonparametric Distribution Estimation

29 Jun

A random variable X with values in a finite set {α_1, …, α_n} ⊆ ℝ has a distribution characterized by a vector p ∈ ℝⁿ with prob(X = α_k) = p_k, where p ⪰ 0 and 1^T p = 1. Conversely, any p satisfying these two conditions defines a distribution for X. The set of all such vectors is the probability simplex {p ∈ ℝⁿ | p ⪰ 0, 1^T p = 1} (p. 359). Many types of prior information about p can be written as linear equality or inequality constraints, because for any function f the expected value E f(X) = ∑_{i=1}^n p_i f(α_i) is a linear function of p. As a special case, for C ⊆ ℝ the probability prob(X ∈ C) = ∑_{α_i ∈ C} p_i is also a linear function of p. Known expected values of certain functions can therefore be incorporated as linear equality constraints on p (p. 360).
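The linearity in p described above can be checked numerically. The following is a minimal sketch in plain Python; the support points `alpha`, the distribution `p`, and the helper names `in_simplex`, `expectation`, and `prob_in` are all illustrative assumptions, not taken from the book.

```python
# Hypothetical support points alpha_1..alpha_n and a distribution p on them.
alpha = [-1.0, 0.0, 1.0, 2.0]
p = [0.1, 0.4, 0.3, 0.2]

def in_simplex(p, tol=1e-9):
    """Check p >= 0 componentwise and 1^T p = 1 (up to a tolerance)."""
    return all(pi >= -tol for pi in p) and abs(sum(p) - 1.0) <= tol

def expectation(f, alpha, p):
    """E f(X) = sum_i p_i f(alpha_i), a linear function of p."""
    return sum(pi * f(a) for a, pi in zip(alpha, p))

def prob_in(C, alpha, p):
    """prob(X in C) = sum of p_i over alpha_i in C; C is given as a predicate."""
    return sum(pi for a, pi in zip(alpha, p) if C(a))

mean = expectation(lambda x: x, alpha, p)               # first moment
second_moment = expectation(lambda x: x * x, alpha, p)  # second moment
prob_nonneg = prob_in(lambda x: x >= 0, alpha, p)       # prob(X >= 0)

print(in_simplex(p), mean, second_moment, prob_nonneg)
```

Because each quantity is a fixed weighted sum of the p_i, a statement such as "the mean equals 0.6" or "prob(X ≥ 0) is at least 0.9" becomes a linear equality or inequality in p.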

Bounds on the expected value E f(X) can be derived from prior information by solving two convex problems: minimizing and maximizing ∑_{i=1}^n f(α_i) p_i over all p consistent with the prior information (p. 361). Alternatively, p can be estimated from observed samples by maximum likelihood estimation: with k_i the number of times the value α_i was observed, maximize the log-likelihood function l(p) = ∑_{i=1}^n k_i log p_i subject to the prior constraints (p. 361). The maximum entropy distribution is found by minimizing the negative entropy ∑_{i=1}^n p_i log p_i (p. 362). The distribution with minimum Kullback-Leibler divergence from a given prior distribution q is found by minimizing ∑_{i=1}^n p_i log(p_i/q_i) (p. 362).
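The three objective functions above can be evaluated directly. The sketch below, in plain Python, assumes the simplest setting where the only constraint is that p lies in the probability simplex. In that case the maximum likelihood estimate is the empirical frequency p_i = k_i/N, the maximum entropy distribution is uniform, and the divergence from q is minimized at p = q. The counts `k`, the comparison distribution `other`, and the function names are hypothetical. With additional prior constraints a convex solver would be needed instead.

```python
import math

def log_likelihood(k, p):
    """l(p) = sum_i k_i log p_i; terms with k_i = 0 contribute nothing."""
    return sum(ki * math.log(pi) for ki, pi in zip(k, p) if ki > 0)

def neg_entropy(p):
    """sum_i p_i log p_i, with the convention 0 log 0 = 0."""
    return sum(pi * math.log(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """sum_i p_i log(p_i / q_i); assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical observation counts k_i for n = 3 values.
k = [3, 5, 2]
N = sum(k)

p_ml = [ki / N for ki in k]   # empirical frequencies (simplex constraint only)
uniform = [1.0 / 3.0] * 3     # maximum entropy distribution (simplex only)
other = [0.2, 0.6, 0.2]       # an arbitrary competitor, for comparison

print(log_likelihood(k, p_ml), log_likelihood(k, other))
print(neg_entropy(uniform), neg_entropy(p_ml))
print(kl_divergence(p_ml, uniform), kl_divergence(p_ml, p_ml))
```

When q is uniform, the Kullback-Leibler divergence equals the negative entropy plus log n, so minimizing it recovers the maximum entropy problem.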
