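The auxiliary variables sampler is a general Markov chain Monte Carlo (MCMC) technique for sampling from probability distributions with unknown normalizing constants [1, Section 3.1]. Specifically, suppose we have $n$ functions $f_i : \mathcal{X} \to (0, \infty)$ and we want to sample from the probability distribution $P(x) \propto \prod_{i=1}^n f_i(x)$. That is,

$$P(x) = \frac{1}{Z} \prod_{i=1}^n f_i(x),$$

where $Z = \sum_{x \in \mathcal{X}} \prod_{i=1}^n f_i(x)$ is a normalizing constant. If the set $\mathcal{X}$ is very large, then it may be difficult to compute $Z$ or to sample from $P$ directly. To approximately sample from $P$ we can run an ergodic Markov chain with $P$ as a stationary distribution. Adding auxiliary variables is one way to create such a Markov chain. For each $i \in \{1, \ldots, n\}$, we add an auxiliary variable $U_i \ge 0$ such that

$$P(U \mid X) = \prod_{i=1}^n \mathrm{Unif}[0, f_i(X)].$$

That is, conditional on $X$, the auxiliary variables $U_1, \ldots, U_n$ are independent and $U_i$ is uniformly distributed on the interval $[0, f_i(X)]$. If $X$ is distributed according to $P$ and $U_1, \ldots, U_n$ have the above auxiliary variable distribution, then

$$P(X = x, U = u) = P(X = x)\, P(U = u \mid X = x) = \frac{1}{Z} \prod_{i=1}^n f_i(x) \prod_{i=1}^n \frac{1}{f_i(x)} I[u_i \le f_i(x)] = \frac{1}{Z} \prod_{i=1}^n I[u_i \le f_i(x)].$$

This means that the joint distribution of $(X, U)$ is uniform on the set

$$\widetilde{\mathcal{X}} = \{(x, u) \in \mathcal{X} \times [0, \infty)^n : u_i \le f_i(x) \text{ for all } i\}.$$

Put another way, suppose we could jointly sample $(X, U)$ from the uniform distribution on $\widetilde{\mathcal{X}}$. Then the above calculation shows that if we discard $U$ and only keep $X$, then $X$ will be sampled from our target distribution $P$.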
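As a short check of this claim, we can integrate the auxiliary variables out of the uniform joint distribution. Here $f_1, \ldots, f_n$ are the functions defining the target, $Z$ is the normalizing constant, $u = (u_1, \ldots, u_n)$ is a value of the auxiliary variables, and $I[\cdot]$ is an indicator function:

$$\int_{[0, \infty)^n} \frac{1}{Z} \prod_{i=1}^n I[u_i \le f_i(x)] \, du = \frac{1}{Z} \prod_{i=1}^n \int_0^{f_i(x)} du_i = \frac{1}{Z} \prod_{i=1}^n f_i(x) = P(x).$$

So the marginal distribution of $X$ under the uniform joint distribution is exactly the target.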
The auxiliary variables sampler approximately samples from the uniform distribution on the set $\widetilde{\mathcal{X}}$ of pairs $(x, u)$ with $u_i \le f_i(x)$ for all $i$ by using a Gibbs sampler. The Gibbs sampler alternates between sampling $U'$ from $P(U \mid X)$ and then sampling $X'$ from $P(X \mid U')$. Since the joint distribution of $(X, U)$ is uniform on $\widetilde{\mathcal{X}}$, the conditional distributions $P(U \mid X)$ and $P(X \mid U)$ are also uniform. The auxiliary variables sampler thus transitions from $X$ to $X'$ according to the two-step process:
1. Independently sample $U_i$ uniformly from $[0, f_i(X)]$ for each $i$.
2. Sample $X'$ uniformly from the set $\{x \in \mathcal{X} : f_i(x) \ge U_i \text{ for all } i\}$.
Since the auxiliary variables sampler is based on a Gibbs sampler, we know that the joint distribution of $(X, U)$ will converge to the uniform distribution on $\widetilde{\mathcal{X}}$. So when we discard $U$, the distribution of $X$ will converge to the target distribution $P$!
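The two-step transition can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the state space is a small finite list so that step 2 can be done by brute-force enumeration, and the names `aux_step`, `run_chain`, `states`, and `fs`, as well as the toy target, are hypothetical choices made for this example.

```python
import random
from collections import Counter

def aux_step(x, states, fs, rng):
    """One transition X -> X' of the auxiliary variables sampler."""
    # Step 1: draw each U_i uniformly from [0, f_i(x)].
    u = [rng.uniform(0.0, f(x)) for f in fs]
    # Step 2: draw X' uniformly from {y : f_i(y) >= u_i for all i}.
    # Enumeration is only feasible because `states` is tiny here.
    # The current x always satisfies the constraints, so the list is non-empty.
    allowed = [y for y in states if all(f(y) >= ui for f, ui in zip(fs, u))]
    return rng.choice(allowed)

def run_chain(x0, states, fs, n_steps, seed=0):
    """Run the chain from x0 and return the visited states."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        x = aux_step(x, states, fs, rng)
        samples.append(x)
    return samples

# Hypothetical toy target on {0, 1, 2, 3}: P(x) proportional to f_1(x) * f_2(x).
states = [0, 1, 2, 3]
fs = [lambda x: 1.0 + x, lambda x: 2.0 ** (-x)]

samples = run_chain(0, states, fs, n_steps=50_000)
counts = Counter(samples)
empirical = {x: counts[x] / len(samples) for x in states}

# Exact target, computable here only because the state space is tiny.
weights = {x: fs[0](x) * fs[1](x) for x in states}
Z = sum(weights.values())
exact = {x: w / Z for x, w in weights.items()}
```

Comparing `empirical` with `exact` gives a quick sanity check that the chain targets the right distribution. In realistic problems the enumeration in step 2 is not available, which is the practical difficulty discussed next.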
Auxiliary variables in practice
To perform step 2 of the auxiliary variables sampler we have to be able to sample uniformly from the sets

$$\mathcal{X}_u = \{x \in \mathcal{X} : f_i(x) \ge u_i \text{ for all } i\}.$$

Depending on the nature of the set $\mathcal{X}$ and the functions $f_i$, it might be difficult to do this. Fortunately, there are some notable examples where this step has been worked out. The very first example of auxiliary variables is the Swendsen-Wang algorithm for sampling from the Ising model [2]. In this model it is possible to sample uniformly from $\mathcal{X}_u$. Another setting where we can sample exactly is when $\mathcal{X}$ is the real numbers and each $f_i$ is an increasing function of $x$. This is explored in [3], where the authors apply auxiliary variables to sampling from Bayesian posteriors.
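To make the Ising example concrete, here is a hedged sketch of one Swendsen-Wang-style update written as an auxiliary variables step. It assumes a particular parametrization that is not taken from [2]: spins $x_v \in \{-1, +1\}$ on the vertices of a graph, and one function per edge $e = (i, j)$, namely $f_e(x) = \exp(\beta \, I[x_i = x_j])$. Under that assumption, the constraint $f_e(x) \ge u_e$ only binds when $u_e > 1$, and then it forces the two endpoints to agree, so sampling uniformly from the constrained set amounts to giving each bonded cluster an independent fair-coin spin. The names `swendsen_wang_step`, `find`, and `beta` are illustrative.

```python
import math
import random

def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def swendsen_wang_step(spins, edges, beta, rng):
    """One auxiliary variables update for the assumed Ising parametrization."""
    n = len(spins)
    parent = list(range(n))
    for i, j in edges:
        # Step 1: U_e ~ Unif[0, f_e(x)] with f_e(x) = exp(beta * [x_i == x_j]).
        f_e = math.exp(beta) if spins[i] == spins[j] else 1.0
        u_e = rng.uniform(0.0, f_e)
        # The constraint f_e(x') >= u_e only binds when u_e > 1, in which
        # case x'_i and x'_j must agree: merge the two sites into one cluster.
        if u_e > 1.0:
            parent[find(parent, i)] = find(parent, j)
    # Step 2: uniform on the constrained set, i.e. one fair coin per cluster.
    cluster_spin = {}
    new_spins = []
    for i in range(n):
        root = find(parent, i)
        if root not in cluster_spin:
            cluster_spin[root] = rng.choice((-1, 1))
        new_spins.append(cluster_spin[root])
    return new_spins

# Usage on a hypothetical 4-cycle.
rng = random.Random(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
spins = [1, -1, 1, -1]
for _ in range(100):
    spins = swendsen_wang_step(spins, edges, beta=0.8, rng=rng)
```

With the more common convention $\exp(\beta x_i x_j)$ the same argument applies after rescaling $\beta$, since $x_i x_j = 2 I[x_i = x_j] - 1$.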
There is an alternative to sampling exactly from the uniform distribution on the set $\mathcal{X}_u$ of points $x$ with $f_i(x) \ge u_i$ for all $i$. Instead of sampling $X'$ uniformly from $\mathcal{X}_u$, we can run a Markov chain from the old $X$ that has the uniform distribution on $\mathcal{X}_u$ as a stationary distribution. This approach leads to another special case of auxiliary variables which is called “slice sampling” [4].
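As an illustration, the following sketch implements one slice sampling update for a single function $f$ on the real line, using the stepping-out and shrinkage procedures described in [4]. It is a simplified sketch under stated assumptions: there is only one function (so one auxiliary variable), no limit is placed on the number of stepping-out moves, and the names `slice_step` and `width` and the Gaussian test target are choices made for this example.

```python
import math
import random

def slice_step(x, f, width, rng):
    """One slice sampling update from x for an unnormalized density f."""
    # Auxiliary variable U uniform on (0, f(x)]; using 1 - random() keeps
    # U strictly positive. The slice is the set {y : f(y) >= U}.
    u = f(x) * (1.0 - rng.random())
    # Stepping out: place an interval of size `width` at random around x,
    # then extend each end until it falls outside the slice.
    left = x - width * rng.random()
    right = left + width
    while f(left) >= u:
        left -= width
    while f(right) >= u:
        right += width
    # Shrinkage: propose uniformly in [left, right]; if the proposal is
    # outside the slice, shrink the interval towards x and try again.
    while True:
        y = rng.uniform(left, right)
        if f(y) >= u:
            return y
        if y < x:
            left = y
        else:
            right = y

def unnormalized_normal(x):
    """Standard normal density without its normalizing constant."""
    return math.exp(-0.5 * x * x)

rng = random.Random(0)
x, draws = 0.0, []
for _ in range(20_000):
    x = slice_step(x, unnormalized_normal, width=2.0, rng=rng)
    draws.append(x)
```

The update never samples exactly from the uniform distribution on the slice. It only needs to leave that distribution invariant, which is what makes the method usable when the slice has no convenient description.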
References
[1] Andersen HC, Diaconis P. Hit and run as a unifying device. Journal de la societe francaise de statistique & revue de statistique appliquee. 2007;148(4):5-28. http://www.numdam.org/item/JSFS_2007__148_4_5_0/
[2] Swendsen RH, Wang JS. Nonuniversal critical dynamics in Monte Carlo simulations. Physical review letters. 1987 Jan 12;58(2):86. https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.58.86
[3] Damien P, Wakefield J, Walker S. Gibbs sampling for Bayesian non‐conjugate and hierarchical models by using auxiliary variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 1999 Apr;61(2):331-44. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00179
[4] Neal RM. Slice sampling. The annals of statistics. 2003 Jun;31(3):705-67. https://projecteuclid.org/journals/annals-of-statistics/volume-31/issue-3/Slice-sampling/10.1214/aos/1056562461.full