My last post was about using importance sampling to estimate the volume of a high-dimensional ball. The two figures below compare plain Monte Carlo to importance sampling with a Gaussian proposal. Both plots use $N$ samples to estimate $V_n$, the volume of the $n$-dimensional unit ball

$$B_n = \{x \in \mathbb{R}^n : \|x\| \le 1\}.$$
A friend of mine pointed out that the relative error does not seem to increase with the dimension $n$. He thought it was too good to be true. It turns out he was right: the relative error does increase with the dimension, but it increases very slowly. To estimate $V_n$, the number of samples needs to grow on the order of $\sqrt{n}$.
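To make the comparison concrete, here is a minimal sketch of the two kinds of estimator being compared. The proposal scale $\sigma^2 = 1/(n+2)$, which matches the second moment of the uniform distribution on the ball, is my assumption, not necessarily the scale used in the figures.

```python
import numpy as np
from scipy.special import gammaln

def true_volume(n):
    # V_n = pi^(n/2) / Gamma(n/2 + 1)
    return np.exp(0.5 * n * np.log(np.pi) - gammaln(0.5 * n + 1))

def plain_mc(n, N, rng):
    # Hit-or-miss Monte Carlo: sample uniformly from the cube [-1, 1]^n
    # and rescale the fraction of points that land inside the ball.
    x = rng.uniform(-1.0, 1.0, size=(N, n))
    inside = np.sum(x**2, axis=1) <= 1.0
    return 2.0**n * inside.mean()

def importance_sampling(n, N, rng, sigma2=None):
    # Importance sampling with a Gaussian proposal p = N(0, sigma2 * I_n):
    # V_n = E_p[ 1{X in B_n} / p(X) ], estimated by an average over samples.
    if sigma2 is None:
        sigma2 = 1.0 / (n + 2)  # assumed scale; any sigma2 close to 1/n behaves similarly
    x = rng.normal(scale=np.sqrt(sigma2), size=(N, n))
    s = np.sum(x**2, axis=1)
    log_p = -0.5 * n * np.log(2 * np.pi * sigma2) - s / (2 * sigma2)
    weights = np.where(s <= 1.0, np.exp(-log_p), 0.0)
    return weights.mean()

rng = np.random.default_rng(0)
n, N = 20, 100_000
print(true_volume(n), plain_mc(n, N, rng), importance_sampling(n, N, rng))
```

For $n = 20$ the probability that a uniform point from the cube lands in the ball is about $2.5 \times 10^{-8}$, so with $10^5$ samples the plain Monte Carlo estimate is almost always exactly zero, while the importance sampling average is typically close to $V_{20} \approx 0.026$.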
To prove this, we will use the paper The sample size required in importance sampling by Chatterjee and Diaconis [1]. This paper shows that the sample size needed for importance sampling is governed by a Kullback-Leibler divergence. The relevant result from their paper is Theorem 1.3, which bounds the relative error of an importance sampling estimate of a probability.
In our setting the proposal distribution is $p = \mathcal{N}(0, \sigma^2 I_n)$. That is, under $p$ the sample $X$ is an $n$-dimensional Gaussian vector with mean $0$ and covariance $\sigma^2 I_n$. The conditional target distribution is $q$, the uniform distribution on the $n$-dimensional ball $B_n$. Theorem 1.3 in [1] tells us how many samples are needed to estimate $V_n$. Informally, the required sample size is $e^{D(q \Vert p)}$. Here $D(q \Vert p)$ is the Kullback-Leibler divergence between $q$ and $p$.
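Spelled out, the quantity being estimated and the informal form of the criterion look like this (a paraphrase rather than the exact statement of the theorem, and it assumes the estimator from the last post is the standard importance sampling estimator for $V_n$):

$$V_n = \mathbb{E}_{X \sim p}\!\left[\frac{\mathbf{1}\{X \in B_n\}}{p(X)}\right], \qquad \hat{V}_N = \frac{1}{N}\sum_{i=1}^{N}\frac{\mathbf{1}\{X_i \in B_n\}}{p(X_i)}, \quad X_1,\dots,X_N \overset{\text{iid}}{\sim} p.$$

Chatterjee and Diaconis show that, roughly, $\hat{V}_N$ is accurate with high probability once $N$ is somewhat larger than $e^{D(q \Vert p)}$, and cannot be accurate when $N$ is much smaller than $e^{D(q \Vert p)}$.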
To use this theorem we need to compute $D(q \Vert p)$. The Kullback-Leibler divergence is defined as an integral. Specifically,

$$D(q \Vert p) = \int_{\mathbb{R}^n} q(x) \log\!\left(\frac{q(x)}{p(x)}\right) dx.$$
Computing the high-dimensional integral above looks challenging. Fortunately, it can be reduced to a one-dimensional integral. This is because both of the distributions $q$ and $p$ are rotationally symmetric. To use this, define $q_R$ and $p_R$ to be the distributions of the norm squared under $q$ and $p$. That is, if $X \sim q$, then $\|X\|^2 \sim q_R$, and likewise for $p_R$. By the rotational symmetry of $q$ and $p$ we have

$$D(q \Vert p) = D(q_R \Vert p_R).$$
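To see why the reduction holds, note that both $q$ and $p$ have densities that depend on $x$ only through $\|x\|^2$, and passing to the distribution of $\|X\|^2$ multiplies both densities by the same factor, so the likelihood ratio is unchanged: $q(x)/p(x) = q_R(\|x\|^2)/p_R(\|x\|^2)$. The reduction is then just a change of variables inside the expectation:

$$D(q \Vert p) = \mathbb{E}_{X\sim q}\!\left[\log\frac{q(X)}{p(X)}\right] = \mathbb{E}_{X\sim q}\!\left[\log\frac{q_R(\|X\|^2)}{p_R(\|X\|^2)}\right] = \mathbb{E}_{S\sim q_R}\!\left[\log\frac{q_R(S)}{p_R(S)}\right] = D(q_R \Vert p_R).$$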
We can work out both $q_R$ and $p_R$. The distribution $q$ is the uniform distribution on the $n$-dimensional ball. And so for $S \sim q_R$ and any $s \in [0,1]$,

$$\mathbb{P}(S \le s) = \mathbb{P}(\|X\| \le \sqrt{s}) = \frac{\mathrm{vol}(\sqrt{s}\,B_n)}{\mathrm{vol}(B_n)} = s^{n/2}.$$

This implies that $q_R$ has density $q_R(s) = \frac{n}{2}s^{n/2-1}$ on $[0,1]$. This means that $q_R$ is a Beta distribution with parameters $\alpha = n/2$ and $\beta = 1$.
The distribution $p$ is a multivariate Gaussian distribution with mean $0$ and covariance $\sigma^2 I_n$. This means that if $X \sim p$, then $\|X\|^2$ is a scaled chi-squared variable. More precisely, $p_R$ is a Gamma distribution with shape parameter $n/2$ and scale parameter $2\sigma^2$. The density of $p_R$ is therefore

$$p_R(s) = \frac{1}{\Gamma(n/2)\,(2\sigma^2)^{n/2}}\, s^{n/2-1} e^{-s/(2\sigma^2)}, \qquad s > 0.$$
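As a quick sanity check on the two radial distributions, the sketch below compares samples of $\|X\|^2$ under $q$ and $p$ to the claimed Beta and Gamma laws. The scale $\sigma^2 = 1/(n+2)$ is again an assumed example value, and uniform points in the ball are generated by the standard direction-times-radius construction.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, N = 10, 200_000
sigma2 = 1.0 / (n + 2)  # assumed example scale

# ||X||^2 under q: X uniform on the ball via (uniform)^(1/n) times a random direction.
z = rng.normal(size=(N, n))
directions = z / np.linalg.norm(z, axis=1, keepdims=True)
radii = rng.uniform(size=N) ** (1.0 / n)
s_q = np.sum((radii[:, None] * directions) ** 2, axis=1)

# ||X||^2 under p: X Gaussian with covariance sigma2 * I_n.
x = rng.normal(scale=np.sqrt(sigma2), size=(N, n))
s_p = np.sum(x**2, axis=1)

# Kolmogorov-Smirnov distances to Beta(n/2, 1) and Gamma(n/2, scale = 2*sigma2);
# both should be small.
print(stats.kstest(s_q, stats.beta(n / 2, 1).cdf).statistic)
print(stats.kstest(s_p, stats.gamma(n / 2, scale=2 * sigma2).cdf).statistic)
```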
The Kullback-Leibler divergence between $q$ and $p$ is therefore

$$D(q \Vert p) = D(q_R \Vert p_R) = \int_0^1 \frac{n}{2} s^{n/2-1} \log\!\left(\frac{\frac{n}{2} s^{n/2-1}}{\frac{1}{\Gamma(n/2)(2\sigma^2)^{n/2}}\, s^{n/2-1} e^{-s/(2\sigma^2)}}\right) ds.$$

Getting Mathematica to do the above integral gives

$$D(q \Vert p) = \log\Gamma\!\left(\frac{n}{2}+1\right) + \frac{n}{2}\log(2\sigma^2) + \frac{n}{2\sigma^2(n+2)}.$$
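A quick numerical check of this closed form by one-dimensional quadrature (the value $\sigma^2 = 1/(n+2)$ used in the loop is just an example; the identity holds for any $\sigma^2 > 0$):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gammaln

def kl_by_quadrature(n, sigma2):
    # D(q_R || p_R) = integral over [0, 1] of q_R(s) * log(q_R(s) / p_R(s))
    def integrand(s):
        q = 0.5 * n * s ** (0.5 * n - 1)
        log_ratio = (np.log(0.5 * n) + gammaln(0.5 * n)
                     + 0.5 * n * np.log(2 * sigma2) + s / (2 * sigma2))
        return q * log_ratio
    return quad(integrand, 0.0, 1.0)[0]

def kl_closed_form(n, sigma2):
    # log Gamma(n/2 + 1) + (n/2) log(2 sigma^2) + n / (2 sigma^2 (n + 2))
    return gammaln(0.5 * n + 1) + 0.5 * n * np.log(2 * sigma2) + n / (2 * sigma2 * (n + 2))

for n in (2, 10, 50):
    sigma2 = 1.0 / (n + 2)  # example scale
    print(n, kl_by_quadrature(n, sigma2), kl_closed_form(n, sigma2))
```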
Using the approximation $\log\Gamma\!\left(\frac{n}{2}+1\right) \approx \frac{n}{2}\log\frac{n}{2} - \frac{n}{2} + \frac{1}{2}\log(\pi n)$ and taking $\sigma^2 = \frac{1}{n+2}$ (the scale that minimizes $D(q \Vert p)$ and matches the second moment of $q$), we get that for large $n$,

$$D(q \Vert p) \approx \frac{1}{2}\log(\pi n) - 1.$$

And so the required number of samples is

$$e^{D(q \Vert p)} \approx \frac{\sqrt{\pi n}}{e},$$

which grows on the order of $\sqrt{n}$.
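The sketch below compares the exact $e^{D(q \Vert p)}$, with the assumed scale $\sigma^2 = 1/(n+2)$, to the approximation $\sqrt{\pi n}/e$ for a few values of $n$:

```python
import numpy as np
from scipy.special import gammaln

def kl(n, sigma2):
    # Exact D(q || p) = log Gamma(n/2 + 1) + (n/2) log(2 sigma^2) + n / (2 sigma^2 (n + 2))
    return gammaln(0.5 * n + 1) + 0.5 * n * np.log(2 * sigma2) + n / (2 * sigma2 * (n + 2))

for n in (10, 100, 1_000, 10_000):
    required = np.exp(kl(n, 1.0 / (n + 2)))  # required sample size, roughly e^D
    print(n, required, np.sqrt(np.pi * n) / np.e)
```

The two columns agree closely once $n$ is large, and both grow like $\sqrt{n}$, which is why the relative error in the figures barely moves with the dimension.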
[1] Chatterjee, Sourav, and Persi Diaconis. "The Sample Size Required in Importance Sampling." The Annals of Applied Probability 28, no. 2 (2018): 1099–1135. https://www.jstor.org/stable/26542331. (Public preprint: https://arxiv.org/abs/1511.01437)