Sums of Squares – Part 1: Optimization

About a year ago I had the pleasure of having a number of conversations with Abhishek Bhardwaj about the area of mathematics his PhD is on. This pair of posts is based on the fond memories I have of these conversations. Part two can be found here.

Optimization and Summing Squares

Optimization is a huge area of both pure and applied mathematics. Many questions and problems can be reduced to finding the minimum or maximum value of some function. That is we have a function $f : \mathbb{R}^n \to \mathbb{R}$ and we wish to find the number $\lambda$ such that

$\lambda = \min \{ f(x_1,\ldots, x_n) \mid (x_1,\ldots, x_n) \in \mathbb{R}^n \}$,

or

$\lambda = \max \{ f(x_1,\ldots, x_n) \mid (x_1,\ldots, x_n) \in \mathbb{R}^n \}$.

By considering the function $-f$ we can reduce finding the maximum of $f$ to finding the minimum of another function. Minimizing a function isn’t always easy. Even when the function $f$ is a relative simple function such as a polynomial, it can be a very difficult problem.

A different way of thinking about the problem of minimizing the function $f$, is to instead think about the function $f - \lambda$ were $\lambda$ is a real number. The minimum of $f$ is the largest value of $\lambda$ such that the function $f - \lambda$ is non-negative for all values $x \in \mathbb{R}^n$. Thus we have a reduced our optimization problem into the problem of working out for which values of $\lambda$ is the function $f-\lambda$ non-negative.

An example

Suppose our function $f$ was the following function that takes in two variables

$f(x,y) = 2x^2 + 4y^2-4xy-4x+4y+7$.

To minimize this function we can look at the function $g = f - \lambda$ which is equal to

$g(x,y) = 2x^2 + 4y^2-4xy-4x+4y+7 - \lambda$.

We wish to work for which values of $\lambda$ is this function $g$ non-negative. By doing the following algebraic manipulations we can rewrite $g$ like so

$g(x,y) = (x^2-2x+1)+(x^2+4y^2+1-4xy-2x+4y)+5-\lambda$,

which is in turn equal to

$g(x,y) = (x-1)^2+(-x+2y+1)^2 +(5-\lambda)$

Since $(x-1)^2$ and $(-x+2y+1)^2$ are both squares of real numbers, they are both non-negative. Furthermore at the point $(x,y) = (1,0)$, we have that $(x-1)^2+(-x+2y+1)^2=0$. Thus $g$ is non-negative if and only if $(5-\lambda) \ge 0$, that is if and only if $\lambda \le 5$. Thus we can conclude that the minimum of $f$ is $5$.

Sums of Squares

Note that the number $5-\lambda$ can be written as $(\sqrt{5-\lambda})^2$ if and only if $\lambda \ge 5$. Thus, in the above example we have that $g$ is non-negative if and only if $g$ can be written as a sum of squares. That is $g$ is non-negative if and only if $g$ can be written as $\sum_{i=1}^n h_i^2$ for some polynomials $h_1, h_2, \ldots, h_n$.

In general, if a polynomial can be written as a sum of squares of other polynomials, then the polynomial must be non-negative. A natural question to ask is if every non-negative polynomial can be written as a sum of squares. This would make our optimization problem a lot easier. To minimize $f$, we can check for which values of $\lambda$ can $f-\lambda$ be written as a sum of squares.

This question of whether every non-negative polynomial can be written as a sum of squares was answered in the negative by David Hilbert in 1888. However, Hilbert’s proof was non-constructive, it didn’t give an explicit example of such a polynomial. The proof only showed that such polynomials exist out there somewhere. It took nearly nearly 80 years for the first explicit example to be given by the mathematician Theodore Motzkin in 1967. We will take a look at Motzkin’s polynomial in the next blog post.