Tractable Upper Bounds for Wasserstein Quality Assessments of Variational Gaussian Approximations
N.D. Gallie Rodriguez (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Joris Bierkens – Mentor (TU Delft - Statistics)
Richard C. Kraaij – Mentor (TU Delft - Applied Probability)
Havva Yoldas – Graduation committee member (TU Delft - Mathematical Physics)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Variational inference (VI) comprises a family of statistical methods for obtaining the optimal approximation of a target probability distribution within some reference class of distributions, as measured by a cost function, commonly the Kullback-Leibler (KL) divergence. Recent work on VI has yielded a fast, stable set of mean and covariance evolutions that generate variational Gaussian approximations by restricting the well-known JKO scheme to Gaussian measures. The sequence of Gaussian measures thus generated converges towards the KL-optimal Gaussian approximation of the VI target. It may also be used to approximate the entire sequence of distributions generated by a JKO gradient flow directed at this same target, thereby supporting practical usage of Gaussian VI as well as fast, approximate modelling of the Fokker-Planck PDE. However, it is not immediately clear whether this Gaussian sequence offers valid, helpful approximations of the original JKO gradient flow. In this work, three upper bounds for the sequence of Wasserstein-2 (W2) distances between the two gradient flows are obtained by exploiting the Riemannian structure of the W2 manifold and the shared properties of the Gaussian and JKO evolutions. Numerical simulations support the validity of these bounds and test their performance in both ordinary and exceptional scenarios. One of the bounds may be computed solely from the Gaussian evolution and the target potential, thus offering a tractable estimator for the suitability of variational Gaussian approximations which retains the attractive properties of Wasserstein distances whilst avoiding their computational demands.
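As background for the W2 distances mentioned above, the following minimal sketch computes the Wasserstein-2 distance in the one setting where it has a well-known closed form, namely between two Gaussian measures: W2² = |m1 - m2|² + tr(S1 + S2 - 2 (S2^{1/2} S1 S2^{1/2})^{1/2}). It is an illustration only and is not one of the three bounds derived in this work, which concern the distance between a Gaussian flow and a generally non-Gaussian JKO flow. The function names `sqrtm_psd` and `w2_gaussian` are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def sqrtm_psd(a):
    """Square root of a symmetric positive semi-definite matrix (illustrative helper)."""
    eigvals, eigvecs = np.linalg.eigh(a)
    # Clip tiny negative eigenvalues caused by floating-point round-off.
    eigvals = np.clip(eigvals, 0.0, None)
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def w2_gaussian(m1, s1, m2, s2):
    """Closed-form W2 distance between N(m1, s1) and N(m2, s2)."""
    m1, m2 = np.asarray(m1, dtype=float), np.asarray(m2, dtype=float)
    s1, s2 = np.atleast_2d(s1).astype(float), np.atleast_2d(s2).astype(float)
    root_s2 = sqrtm_psd(s2)
    cross = sqrtm_psd(root_s2 @ s1 @ root_s2)
    squared = np.sum((m1 - m2) ** 2) + np.trace(s1 + s2 - 2.0 * cross)
    # Guard against a slightly negative value from round-off before the square root.
    return float(np.sqrt(max(squared, 0.0)))


if __name__ == "__main__":
    # Same covariance, shifted mean: the distance reduces to the Euclidean shift.
    print(w2_gaussian([0.0, 0.0], np.eye(2), [3.0, 4.0], np.eye(2)))
```

For general non-Gaussian distributions no such formula exists and W2 must be estimated numerically, which is the computational demand a tractable upper bound is meant to avoid.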