Sticky PDMP samplers for sparse and local inference problems

Journal Article (2023)
Author(s)

Joris Bierkens (TU Delft - Statistics)

Sebastiano Grazzi (TU Delft - Statistics, University of Warwick)

Frank van der Meulen (TU Delft - Statistics, Vrije Universiteit Amsterdam)

M.R. Schauer (Chalmers University of Technology, TU Delft - Statistics, University of Gothenburg)

Research Group
Statistics
Copyright
© 2023 G.N.J.C. Bierkens, S. Grazzi, F.H. van der Meulen, M.R. Schauer
DOI
https://doi.org/10.1007/s11222-022-10180-5
Publication Year
2023
Language
English
Issue number
1
Volume number
33
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

We construct a new class of efficient Monte Carlo methods based on continuous-time piecewise deterministic Markov processes (PDMPs) suitable for inference in high-dimensional sparse models, i.e. models for which there is prior knowledge that many coordinates are likely to be exactly 0. This is achieved with the fairly simple idea of endowing existing PDMP samplers with “sticky” coordinate axes, coordinate planes, etc. Upon hitting one of these subspaces, an event is triggered and the process sticks to the subspace for a random duration, thereby spending time in a sub-model. This results in non-reversible jumps between different (sub-)models. While we show that PDMP samplers in general can be made sticky, we mainly focus on the Zig-Zag sampler. Compared to the Gibbs sampler for variable selection, we heuristically derive favourable dependence of the Sticky Zig-Zag sampler on dimension and data size. The computational efficiency of the Sticky Zig-Zag sampler is further established through numerical experiments in which both the sample size and the dimension of the parameter space are large.
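To illustrate the sticky mechanism described in the abstract, below is a minimal one-dimensional sketch (not the authors' implementation): a Zig-Zag process targeting a standard Gaussian, with an atom at zero. The event rate is (θx)⁺, so reflection times can be sampled exactly by inversion; whenever the particle reaches the origin it freezes for an Exp(κ) holding time before continuing with the same velocity. The sticking parameter `kappa` and the starting state are illustrative assumptions.

```python
import math
import random


def sticky_zigzag_1d(kappa, T, seed=0):
    """Sketch of a sticky Zig-Zag sampler for a 1-d standard Gaussian
    target with a spike at zero.  The velocity theta is +/-1; the
    reflection rate is max(theta*x, 0).  On hitting the origin the
    process sticks for an Exp(kappa) time, then resumes with the same
    velocity.  Returns the piecewise-linear skeleton as (time, x) pairs."""
    rng = random.Random(seed)
    t, x, theta = 0.0, 1.0, 1.0   # illustrative initial state
    path = [(t, x)]
    while t < T:
        # exact first-reflection time via inversion of the integrated rate
        e = rng.expovariate(1.0)
        tau_reflect = math.sqrt(max(theta * x, 0.0) ** 2 + 2.0 * e) - theta * x
        # time to reach the origin (finite only when moving toward it)
        tau_hit = -theta * x if theta * x < 0 else math.inf
        if tau_hit < tau_reflect:
            # stick at zero for an exponential holding time, keep velocity
            t += tau_hit
            x = 0.0
            path.append((t, x))
            t += rng.expovariate(kappa)
            path.append((t, x))
        else:
            # reflection event: flip the velocity
            t += tau_reflect
            x += theta * tau_reflect
            theta = -theta
            path.append((t, x))
    return path
```

The fraction of time the skeleton spends frozen at zero estimates the posterior probability of the sub-model {x = 0}; smaller `kappa` means longer sticking times and hence more posterior mass on zero. The multivariate sampler in the paper applies this freezing coordinate-wise.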