## Sampling from the non-central chi-squared distribution

The non-central chi-squared distribution is a generalisation of the regular chi-squared distribution. The chi-squared distribution turns up in many statistical tests as the (approximate) distribution of a test statistic under the null hypothesis. Under alternative hypotheses, those same statistics often have approximate non-central chi-squared distributions.

This means that the non-central chi-squared distribution is often used to study the power of said statistical tests. In this post I give the definition of the non-central chi-squared distribution, discuss an important invariance property and show how to efficiently sample from this distribution.

### Definition

Let $Z$ be a normally distributed random vector with mean $0$ and covariance $I_n$. Given a vector $\mu \in \mathbb{R}^n$, the non-central chi-squared distribution with $n$ degrees of freedom and non-centrality parameter $\Vert \mu\Vert_2^2$ is the distribution of the quantity $\Vert Z+\mu \Vert_2^2 = \sum\limits_{i=1}^n (Z_i+\mu_i)^2.$

This distribution is denoted by $\chi^2_n(\Vert \mu \Vert_2^2)$. As this notation suggests, the distribution of $\Vert Z+\mu \Vert_2^2$ depends on $\mu$ only through its squared norm $\Vert \mu \Vert_2^2$. The first few times I heard this fact, I had no idea why it would be true (and even found it a little spooky). But, as we will see below, the result is actually a simple consequence of the fact that the standard normal distribution is invariant under rotations.
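As a quick sanity check of this claim (a check of the mean only, not a proof), expanding each square and using $\mathbb{E}[Z_i] = 0$ and $\mathbb{E}[Z_i^2] = 1$ gives

$$\mathbb{E}\,\Vert Z+\mu \Vert_2^2 = \sum_{i=1}^n \left(\mathbb{E}[Z_i^2] + 2\mu_i\,\mathbb{E}[Z_i] + \mu_i^2\right) = n + \Vert \mu \Vert_2^2,$$

so at least the mean depends on $\mu$ only through $\Vert \mu \Vert_2^2$.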

### Rotational invariance

Suppose that we have two vectors $\mu, \nu \in \mathbb{R}^n$ such that $\Vert \mu\Vert_2^2 = \Vert \nu \Vert_2^2$. We wish to show that if $Z \sim \mathcal{N}(0,I_n)$, then $\Vert Z+\mu \Vert_2^2$ has the same distribution as $\Vert Z + \nu \Vert_2^2$.

Since $\mu$ and $\nu$ have the same norm, there exists an orthogonal matrix $U \in \mathbb{R}^{n \times n}$ such that $U\mu = \nu$. Since $U$ is orthogonal and $Z \sim \mathcal{N}(0,I_n)$, we have $Z'=UZ \sim \mathcal{N}(U0,UU^T) = \mathcal{N}(0,I_n)$. Furthermore, since $U$ is orthogonal, $U$ preserves the squared norm $\Vert \cdot \Vert_2^2$. This is because, for all $x \in \mathbb{R}^n$, $\Vert Ux\Vert_2^2 = (Ux)^TUx = x^TU^TUx=x^Tx=\Vert x\Vert_2^2.$

Putting all these pieces together we have $\Vert Z+\mu \Vert_2^2 = \Vert U(Z+\mu)\Vert_2^2 = \Vert UZ + U\mu \Vert_2^2 = \Vert Z'+\nu \Vert_2^2$.

Since $Z$ and $Z'$ have the same distribution, we can conclude that $\Vert Z'+\nu \Vert_2^2$ has the same distribution as $\Vert Z + \nu \Vert_2^2$. Since $\Vert Z + \mu \Vert_2^2 = \Vert Z'+\nu \Vert_2^2$, we are done.
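The argument can be checked numerically on a small example. The sketch below is illustrative only: it picks one particular orthogonal $U$, a Householder reflection that maps $\mu$ to $\nu$ (the proof only needs some orthogonal $U$ to exist), and the helper names `reflect` and `sq_norm` as well as the vectors are my own choices.

```python
def reflect(mu, nu, x):
    """Apply to x the Householder reflection U that maps mu to nu.

    Assumes mu and nu have the same Euclidean norm.
    """
    w = [m - v for m, v in zip(mu, nu)]
    ww = sum(wi * wi for wi in w)
    if ww == 0.0:
        return list(x)  # mu == nu, so the identity matrix works
    coeff = 2.0 * sum(wi * xi for wi, xi in zip(w, x)) / ww
    return [xi - coeff * wi for xi, wi in zip(x, w)]


def sq_norm(x):
    """Squared Euclidean norm of x."""
    return sum(xi * xi for xi in x)


mu = [3.0, 4.0, 0.0]
nu = [0.0, 0.0, 5.0]  # same norm as mu
z = [0.3, -1.2, 0.7]  # stands in for one draw of Z

z_prime = reflect(mu, nu, z)  # Z' = UZ
lhs = sq_norm([zi + mi for zi, mi in zip(z, mu)])        # ||Z + mu||^2
rhs = sq_norm([zi + vi for zi, vi in zip(z_prime, nu)])  # ||Z' + nu||^2
print(lhs, rhs)  # the two values agree up to floating point error
```

Here `reflect(mu, nu, mu)` returns `nu` up to rounding, which is the property $U\mu = \nu$ used in the proof.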

### Sampling

Above we showed that the non-central chi-squared distribution $\chi^2_n(\Vert \mu\Vert_2^2)$ depends on $\mu$ only through its norm. We will now use this to provide an algorithm that can efficiently generate samples from $\chi^2_n(\Vert \mu \Vert_2^2)$.

A naive way to sample from $\chi^2_n(\Vert \mu \Vert_2^2)$ would be to sample $n$ independent standard normal random variables $Z_i$ and then return $\sum_{i=1}^n (Z_i+\mu_i)^2$. But for large values of $n$ this would be very slow as we have to simulate $n$ auxiliary random variables $Z_i$ for each sample from $\chi^2_n(\Vert \mu \Vert_2^2)$. This approach would not scale well if we needed many samples.
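For reference, the naive approach might look like the following minimal sketch. It uses only Python's standard `random` module, and the function name `sample_ncx2_naive` and the example vector are illustrative assumptions.

```python
import random


def sample_ncx2_naive(mu, rng):
    """Draw one sample of ||Z + mu||^2 by simulating every coordinate of Z."""
    return sum((rng.gauss(0.0, 1.0) + m) ** 2 for m in mu)


rng = random.Random(0)  # fixed seed, only for reproducibility
mu = [1.0, 2.0, 0.0, 0.0, 0.0]  # n = 5, ||mu||^2 = 5
samples = [sample_ncx2_naive(mu, rng) for _ in range(20_000)]
mean = sum(samples) / len(samples)
print(mean)  # should be close to n + ||mu||^2 = 10
```

Each call draws `len(mu)` normal variables, so the cost per sample grows linearly with $n$.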

An alternative approach uses the rotation invariance described above. The distribution $\chi^2_n(\Vert \mu \Vert_2^2)$ depends only on $\Vert \mu \Vert_2^2$ and not directly on $\mu$. Thus, given $\mu$, we could instead work with $\nu = \Vert \mu \Vert_2 e_1$ where $e_1$ is the vector with a $1$ in the first coordinate and $0$s in all other coordinates. If we use $\nu$ instead of $\mu$, we have $\sum\limits_{i=1}^n (Z_i+\nu_i)^2 = (Z_1+\Vert \mu \Vert_2)^2 + \sum\limits_{i=2}^{n}Z_i^2.$

The sum $\sum_{i=2}^n Z_i^2$ follows the regular chi-squared distribution with $n-1$ degrees of freedom and is independent of $Z_1$. The regular chi-squared distribution is a special case of the gamma distribution and can be efficiently sampled with rejection sampling when the shape parameter is large (see here).

The shape parameter for $\sum_{i=2}^n Z_i^2$ is $\frac{n-1}{2}$, so for large values of $n$ we can efficiently sample a value $Y$ with the same $\chi^2_{n-1}$ distribution as $\sum_{i=2}^n Z_i^2$. Finally, to get a sample from $\chi^2_n(\Vert \mu \Vert_2^2)$ we independently sample $Z_1$, and then return the sum $(Z_1+\Vert \mu\Vert_2)^2 +Y$.
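The procedure above can be sketched in a few lines. This is a minimal illustration rather than a tuned implementation: it leans on `random.gammavariate` from Python's standard library for the gamma draw, using the fact that $\chi^2_{n-1}$ is a gamma distribution with shape $\frac{n-1}{2}$ and scale $2$. The function name `sample_ncx2` and its parameter names are my own.

```python
import math
import random


def sample_ncx2(n, noncentrality, rng):
    """Draw one sample from chi^2_n(noncentrality) with one normal and one gamma draw.

    `noncentrality` plays the role of ||mu||_2^2.
    """
    if n < 1 or noncentrality < 0:
        raise ValueError("need n >= 1 and noncentrality >= 0")
    z1 = rng.gauss(0.0, 1.0)
    # Y ~ chi^2_{n-1}, i.e. gamma with shape (n - 1) / 2 and scale 2.
    # For n == 1 there are no remaining coordinates, so Y = 0.
    y = rng.gammavariate((n - 1) / 2.0, 2.0) if n > 1 else 0.0
    return (z1 + math.sqrt(noncentrality)) ** 2 + y


rng = random.Random(1)  # fixed seed, only for reproducibility
draws = [sample_ncx2(1000, 9.0, rng) for _ in range(20_000)]
mean = sum(draws) / len(draws)
print(mean)  # should be close to n + ||mu||^2 = 1009
```

Unlike the naive approach, each sample here needs two random draws regardless of $n$.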

### Conclusion

In this post, we saw that the rotational invariance of the standard normal distribution gives a similar invariance for the non-central chi-squared distribution.

This invariance allowed us to efficiently sample from the non-central chi-squared distribution. The sampling procedure worked by reducing the problem to sampling from the regular chi-squared distribution.

The same invariance property is also used to calculate the cumulative distribution function and density of the non-central chi-squared distribution, although the resulting formulas are not for the faint of heart.