The auxiliary variables sampler is a general Markov chain Monte Carlo (MCMC) technique for sampling from probability distributions with unknown normalizing constants [1, Section 3.1]. Specifically, suppose we have functions $f_1, \dots, f_d : \mathcal{X} \to [0, \infty)$ and we want to sample from the probability distribution $\pi$ proportional to their product. That is,

$$\pi(x) = \frac{1}{Z} \prod_{j=1}^d f_j(x),$$

where $Z = \sum_{x \in \mathcal{X}} \prod_{j=1}^d f_j(x)$ is a normalizing constant. If the set $\mathcal{X}$ is very large, then it may be difficult to compute $Z$ or sample from $\pi$. To approximately sample from $\pi$, we can run an ergodic Markov chain with $\pi$ as a stationary distribution. Adding auxiliary variables is one way to create such a Markov chain. For each $j \in \{1, \dots, d\}$, we add an auxiliary variable $u_j$ such that

$$u_j \mid x \sim \mathrm{Uniform}[0, f_j(x)].$$

That is, conditional on $x$, the auxiliary variables $u_1, \dots, u_d$ are independent and $u_j$ is uniformly distributed on the interval $[0, f_j(x)]$. If $x$ is distributed according to $\pi$ and $u = (u_1, \dots, u_d)$ has the above auxiliary variable distribution, then

$$p(x, u) = \pi(x) \prod_{j=1}^d \frac{\mathbb{1}\{0 \le u_j \le f_j(x)\}}{f_j(x)} = \frac{1}{Z} \prod_{j=1}^d \mathbb{1}\{0 \le u_j \le f_j(x)\}.$$

This means that the joint distribution of $(x, u)$ is uniform on the set

$$S = \{(x, u) : 0 \le u_j \le f_j(x) \text{ for all } j = 1, \dots, d\}.$$

Put another way, suppose we could jointly sample $(x, u)$ from the uniform distribution on $S$. Then, the above calculation shows that if we discard $u$ and only keep $x$, then $x$ will be sampled from our target distribution $\pi$.
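To make the setup concrete, here is a minimal Python sketch of a toy instance where $Z$ and $\pi$ can still be computed by brute force. The state space and the two factor functions are illustrative choices, not taken from the references; we reuse this toy when implementing the sampler below.

```python
import numpy as np

# A toy instance of the setup: a small finite state space
# X = {0, ..., 9} and two factor functions. The factors are
# illustrative choices, not taken from the references.
X = np.arange(10)

def f1(x):
    return np.exp(-0.5 * (x - 3.0) ** 2)  # Gaussian-shaped bump at 3

def f2(x):
    return 1.0 + (x % 2)                  # favors odd states

# Because X is tiny, we can compute Z and pi by brute force.
# This enumeration is exactly what becomes infeasible when X is large.
weights = f1(X) * f2(X)
Z = weights.sum()
pi = weights / Z
print(pi.round(3))
```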
The auxiliary variables sampler approximately samples from the uniform distribution on $S$ by using a Gibbs sampler. The Gibbs sampler alternates between sampling $u$ from $p(u \mid x)$ and then sampling $x$ from $p(x \mid u)$. Since the joint distribution of $(x, u)$ is uniform on $S$, the conditional distributions $p(u \mid x)$ and $p(x \mid u)$ are also uniform. The auxiliary variables sampler thus transitions from $(x, u)$ to $(x', u')$ according to the two-step process (a code sketch of both steps follows the list):

- Independently sample each $u_j'$ uniformly from $[0, f_j(x)]$.
- Sample $x'$ uniformly from the set $\{y \in \mathcal{X} : f_j(y) \ge u_j' \text{ for all } j = 1, \dots, d\}$.
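Here is a sketch of this two-step update on the toy example above. On a small finite space, step 2 can be done by enumerating the set; in general that step needs problem-specific structure, as discussed below. The chain length and factors are again illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same toy as before: X = {0, ..., 9} and two factor functions.
X = np.arange(10)
fs = [lambda x: np.exp(-0.5 * (x - 3.0) ** 2),  # Gaussian-shaped bump at 3
      lambda x: 1.0 + (x % 2)]                  # favors odd states

def aux_gibbs_step(x):
    # Step 1: independently sample u_j ~ Uniform[0, f_j(x)].
    u = [rng.uniform(0.0, f(x)) for f in fs]
    # Step 2: sample x' uniformly from {y : f_j(y) >= u_j for all j}.
    # The set is never empty because it contains the current x.
    ok = np.ones(len(X), dtype=bool)
    for f, uj in zip(fs, u):
        ok &= f(X) >= uj
    return rng.choice(X[ok])

# Run the chain and compare empirical frequencies with the exact target.
x = 0
samples = [x := aux_gibbs_step(x) for _ in range(100_000)]

weights = fs[0](X) * fs[1](X)
pi = weights / weights.sum()
emp = np.bincount(samples, minlength=len(X)) / len(samples)
print(np.round(pi, 3))
print(np.round(emp, 3))
```

The two printed rows should agree to within Monte Carlo error, which is the convergence claim of the next paragraph in action.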
Since the auxiliary variables sampler is based on a Gibbs sampler, we know that the joint distribution of $(x, u)$ will converge to the uniform distribution on $S$. So when we discard $u$, the distribution of $x$ will converge to the target distribution $\pi$!
Auxiliary variables in practice
To perform step 2 of the auxiliary variables sampler we have to be able to sample uniformly from the sets

$$S_u = \{x \in \mathcal{X} : f_j(x) \ge u_j \text{ for all } j = 1, \dots, d\}.$$

Depending on the nature of the set $\mathcal{X}$ and the functions $f_1, \dots, f_d$, it might be difficult to do this. Fortunately, there are some notable examples where this step has been worked out. The very first example of auxiliary variables is the Swendsen-Wang algorithm for sampling from the Ising model [2]. In this model it is possible to sample uniformly from $S_u$. Another setting where we can sample exactly is when $\mathcal{X}$ is the real numbers $\mathbb{R}$ and each $f_j$ is an increasing function of $x$. This is explored in [3], where the authors apply auxiliary variables to sampling from Bayesian posteriors.
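To make step 2 on $\mathbb{R}$ concrete, here is a minimal sketch in the same spirit, though with Gaussian-shaped rather than increasing factors (an illustrative choice, not the exact setting of [3]): each set $\{x : f_j(x) \ge u_j\}$ has closed-form endpoints, here an interval, so $S_u$ is an intersection of intervals that is easy to sample from uniformly.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two Gaussian-shaped factors f_j(x) = exp(-(x - m_j)^2 / (2 s_j^2)).
# Each set {x : f_j(x) >= u_j} is then an interval with closed-form
# endpoints, so step 2 is exact. (Illustrative choice, not from [3].)
ms = np.array([0.0, 2.0])
ss = np.array([1.0, 0.5])

def aux_gibbs_step(x):
    # Step 1: u_j ~ Uniform[0, f_j(x)]. Using 1 - random() keeps u_j
    # strictly positive so that log(u_j) below is finite.
    fx = np.exp(-0.5 * ((x - ms) / ss) ** 2)
    u = fx * (1.0 - rng.random(2))
    # Step 2: f_j(x') >= u_j  <=>  |x' - m_j| <= s_j * sqrt(-2 log u_j),
    # so S_u is an intersection of intervals. It always contains the
    # current x, which guarantees lo <= hi.
    r = ss * np.sqrt(-2.0 * np.log(u))
    lo, hi = np.max(ms - r), np.min(ms + r)
    return rng.uniform(lo, hi)

x = 0.0
samples = [x := aux_gibbs_step(x) for _ in range(50_000)]

# The product of the two factors is itself Gaussian, so the sampler
# can be checked against the exact mean and standard deviation.
prec = 1.0 / ss ** 2
print((prec * ms).sum() / prec.sum(), prec.sum() ** -0.5)
print(np.mean(samples), np.std(samples))
```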
There is an alternative to sampling exactly from the uniform distribution on $S_u$. Instead of sampling $x'$ uniformly from $S_u$, we can run a Markov chain from the old value $x$ that has the uniform distribution on $S_u$ as a stationary distribution. This approach leads to another special case of auxiliary variables which is called "slice sampling" [4].
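As a sketch of this idea in the simplest case of a single factor $f$, here is a one-dimensional slice sampler using the stepping-out and shrinkage procedures from [4], both of which leave the uniform distribution on the slice stationary. The window width $w$, the bimodal test density, and the assumption that every slice is a bounded set are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def slice_sample_step(x, logf, w=1.0):
    # One slice-sampling update for a 1-D target pi(x) ∝ exp(logf(x)),
    # assuming every slice {y : logf(y) >= log u} is a bounded set.
    # Drawing log u = logf(x) - Exponential(1) is equivalent to
    # u ~ Uniform[0, f(x)] but is numerically safer.
    log_u = logf(x) - rng.exponential()
    # Stepping out: grow an interval of width w around x until both
    # endpoints fall outside the slice.
    lo = x - w * rng.random()
    hi = lo + w
    while logf(lo) >= log_u:
        lo -= w
    while logf(hi) >= log_u:
        hi += w
    # Shrinkage: propose uniformly in [lo, hi]; a rejected proposal
    # shrinks the interval toward x, so the loop terminates.
    while True:
        y = rng.uniform(lo, hi)
        if logf(y) >= log_u:
            return y
        if y < x:
            lo = y
        else:
            hi = y

# Example: a bimodal, unnormalized log-density (illustrative choice).
logf = lambda x: np.logaddexp(-0.5 * (x + 2.0) ** 2, -0.5 * (x - 2.0) ** 2)
x = 0.0
samples = [x := slice_sample_step(x, logf) for _ in range(20_000)]
print(np.mean(samples), np.std(samples))  # mean near 0 by symmetry
```

Note that the sampler only ever evaluates $\log f$, so the normalizing constant never needs to be known, which is the whole point of the method.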
References
[1] Andersen HC, Diaconis P. Hit and run as a unifying device. Journal de la société française de statistique & revue de statistique appliquée. 2007;148(4):5-28. http://www.numdam.org/item/JSFS_2007__148_4_5_0/
[2] Swendsen RH, Wang JS. Nonuniversal critical dynamics in Monte Carlo simulations. Physical Review Letters. 1987 Jan 12;58(2):86. https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.58.86
[3] Damien P, Wakefield J, Walker S. Gibbs sampling for Bayesian non-conjugate and hierarchical models by using auxiliary variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 1999 Apr;61(2):331-44. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00179
[4] Neal RM. Slice sampling. The Annals of Statistics. 2003 Jun;31(3):705-67. https://projecteuclid.org/journals/annals-of-statistics/volume-31/issue-3/Slice-sampling/10.1214/aos/1056562461.full