## Concavity of the squared sum of square roots

The $p$-norm of a vector $x\in \mathbb{R}^n$ is defined to be:

$\displaystyle{\Vert x \Vert_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}}.$

If $p \ge 1$, then the $p$-norm is convex. When $0, this function is not convex and actually concave when all the entries of $x$ are non-negative. On a recent exam for the course Convex Optimization, we were asked to prove this when $p = 1/2$. In this special case, the mathematics simplifies nicely.

When $p = 1/2$, the $p$-norm can be described as the squared sum of square roots. Specifically,

$\displaystyle{\Vert x \Vert_{1/2} = \left(\sum_{i=1}^n \sqrt{|x_i|} \right)^2}.$

Note that we can expand the square and rewrite the function as follows

$\displaystyle{\Vert x \Vert_{1/2} = \sum_{i=1}^n\sum_{k=1}^n\sqrt{\left|x_i\right|} \sqrt{|x_k|} =\sum_{i=1}^n\sum_{k=1}^n \sqrt{|x_ix_k|}}.$

If we restrict to $x \in \mathbb{R}^n$ with $x_i \ge 0$ for all $i$, then this function simplifies to

$\displaystyle{\Vert x \Vert_{1/2} =\sum_{i=1}^n\sum_{k=1}^n \sqrt{x_ix_k}},$

which is a sum of geometric means. The geometric mean function $(u,v) \mapsto \sqrt{uv}$ is concave when $u,v \ge 0$. This can be proved by calculating the Hessian of this function and verifying that it is negative semi-definite.

From this, we can conclude that each function $x \mapsto \sqrt{x_ix_k}$ is also concave. This is because $x \mapsto \sqrt{x_ix_k}$ is a linear function followed by a concave function. Finally, any sum of concave functions is also concave and thus $\Vert x \Vert_{1/2}$ is concave.

### Hellinger distance

A similar argument can be used to show that Hellinger distance is a convex function. Hellinger distance, $d_H(\cdot,\cdot)$ is defined on pairs of probability distributions $p$ and $q$ on a common set $\{1,\ldots,n\}$. For such a pair,

$\displaystyle{d_H(p,q) = \sum_{i=1}^n \left(\sqrt{p_i}-\sqrt{q_i}\right)^2},$

which certainly doesn’t look convex. However, we can expand the square and use the fact that $p$ and $q$ are probability distributions. This shows us that Helligener distance can be written as

$\displaystyle{d_H(p,q) = \sum_{i=1}^n p_i - 2\sum_{i=1}^n \sqrt{p_iq_i}+\sum_{i=1}^n q_i = 2 - 2\sum_{i=1}^n\sqrt{p_iq_i}}.$

Again, each function $(p,q) \mapsto \sqrt{p_iq_i}$ is concave and so the negative sum of such functions is convex. Thus, $d_H(p,q)$ is convex.

### The course

As a final comment, I’d just like to say how much I am enjoying the class. Prof. Stephen Boyd is a great lecturer and we’ve seen a wide variety of applications in the class. I’ve recently been reading a bit of John D Cook’s blog and agree with all he says about the course here.