The discrete Fourier transform (DFT) is a mapping from $\mathbb{C}^n$ to $\mathbb{C}^n$. Given a vector $x \in \mathbb{C}^n$, the discrete Fourier transform of $x$ is another vector $\hat{x} \in \mathbb{C}^n$ given by,

$$\hat{x}_k = \sum_{j=0}^{n-1} e^{-2\pi i jk/n} x_j, \quad k = 0, 1, \ldots, n-1.$$
If we let $\omega = e^{-2\pi i/n}$, then $\hat{x}$ can be written more compactly as

$$\hat{x}_k = \sum_{j=0}^{n-1} \omega^{jk} x_j.$$
The map from $x$ to $\hat{x}$ is linear and can therefore be represented as multiplication by an $n \times n$ matrix $F_n$. Looking at the equation above, we can see that if we set $(F_n)_{k,j} = \omega^{jk}$ and use $0$-indexing, then,

$$\hat{x} = F_n x.$$
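As a quick sanity check on this definition, here is a short NumPy sketch (the helper name `dft_matrix` is my own) that builds $F_n$ entry by entry and compares multiplication by it against NumPy's built-in FFT, which uses the same sign convention:

```python
import numpy as np

def dft_matrix(n):
    """Build the n-by-n DFT matrix with 0-indexed (k, j) entry omega**(j*k)."""
    omega = np.exp(-2j * np.pi / n)
    k, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return omega ** (j * k)

# Multiplying by the matrix agrees with NumPy's FFT.
x = np.random.default_rng(0).standard_normal(8)
print(np.allclose(dft_matrix(8) @ x, np.fft.fft(x)))  # True
```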
This suggests that calculating the vector $\hat{x}$ would require roughly $n^2$ operations. However, the famous fast Fourier transform shows that $\hat{x}$ can be computed in roughly $n\log(n)$ operations. There are many derivations of the fast Fourier transform, but I wanted to focus on one which is based on factorizing the matrix $F_n$ into sparse matrices.
Matrix factorizations
A factorization of a matrix $A$ is a way of writing $A$ as a product of matrices $A = B_1 B_2 \cdots B_m$ where typically $m$ is $2$ or $3$. Matrix factorizations can tell us information about the matrix $A$. For example, the singular value decomposition tells us the spectral norm and singular vectors of $A$ (and lots more). Other matrix factorizations can help us compute things related to the matrix $A$. The QR decomposition helps us solve least-squares problems, and the LU factorization lets us invert $A$ in (almost) the same amount of time it takes to solve the equation $Ax = b$.
Each of the factorizations above applies to arbitrary matrices. Every possible matrix has a singular value decomposition and a QR factorization. Likewise, all square matrices have an LU factorization (possibly after permuting rows). The DFT matrix has a factorization which most other matrices do not. This factorization lets us write

$$F_n = A_1 A_2 \cdots A_m,$$

where each matrix $A_i$ is sparse.
Sparse matrices
A matrix $A$ is sparse if many of the entries of $A$ are zero. There isn’t a precise definition of “many”. A good heuristic is that if $A$ has size $n$ by $n$, then $A$ is sparse if the number of non-zero entries is of the order $n$ or $n\log(n)$. In contrast, a dense matrix is one where the number of non-zero entries is of the same order as $n^2$.
The structure of sparse matrices makes them especially nice to work with. They take up a lot less memory and they are quicker to compute with. If $A$ has only $s$ non-zero elements, then for any vector $x$, the matrix-vector product $Ax$ can be computed in order $s$ operations. If $s$ is much less than $n^2$, then this can be a huge speed up.
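To make that operation count concrete, here is a minimal sketch (the triplet storage format and function name are my own choices, not any particular sparse-matrix library) of a matrix-vector product that only touches the stored non-zero entries:

```python
import numpy as np

def sparse_matvec(entries, x, m):
    """Multiply an m-by-len(x) sparse matrix, stored as a list of
    (row, col, value) triplets, by the vector x. The loop does one
    multiply-add per non-zero, so the cost is of order s = len(entries)
    rather than m * len(x)."""
    y = np.zeros(m, dtype=complex)
    for i, j, v in entries:
        y[i] += v * x[j]
    return y

# A 3-by-3 matrix with only two non-zero entries.
entries = [(0, 1, 2.0), (2, 2, -1.0)]
y = sparse_matvec(entries, np.array([1.0, 10.0, 100.0]), 3)
# y is [20, 0, -100]: only the two stored entries contributed.
```

Real libraries use tighter formats (such as compressed sparse rows), but the cost accounting is the same.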
This shows why a sparse factorization of the DFT matrix will give a fast way of calculating $\hat{x}$. Specifically, suppose that we have $F_n = A_1 A_2 \cdots A_m$ and each matrix $A_i$ has just $s$ non-zero entries. Then, for any vector $x$, we can compute $F_n x$ in stages. First we multiply $x$ by $A_m$ to produce $x^{(1)} = A_m x$. The output $x^{(1)}$ is then multiplied by $A_{m-1}$ to produce $x^{(2)} = A_{m-1} x^{(1)}$. We continue in this way until we finally multiply $x^{(m-1)}$ by $A_1$ to produce $x^{(m)} = A_1 x^{(m-1)} = F_n x$. The total computation cost is of the order $ms$, which can be drastically less than $n^2$.
The DFT factorization
We’ll now show that $F_n$ does indeed have a sparse matrix factorization. To avoid some of the technicalities, we’ll assume that $n$ is a power of $2$. That is, $n = 2^N$ for some $N \ge 1$. Remember that the discrete Fourier transform of $x$ is the vector $\hat{x}$ given by

$$\hat{x}_k = \sum_{j=0}^{n-1} \omega^{jk} x_j,$$

where $\omega = e^{-2\pi i/n}$. The idea behind the fast Fourier transform is to split the above sum into two terms. One where we add up the even values of $j$ and one where we add up the odd values of $j$. Specifically,

$$\hat{x}_k = \sum_{j=0}^{n/2-1} \omega^{2jk} x_{2j} + \sum_{j=0}^{n/2-1} \omega^{(2j+1)k} x_{2j+1}.$$
We’ll now look at each of these sums separately. In the first sum, we have terms of the form $\omega^{2jk}$. These can be rewritten as follows,

$$\omega^{2jk} = e^{-2\pi i \cdot 2jk/n} = e^{-2\pi i jk/(n/2)} = \omega_{n/2}^{jk},$$

where $\omega_{n/2} = e^{-2\pi i/(n/2)}$. We can also use this result to rewrite the terms $\omega^{(2j+1)k}$,

$$\omega^{(2j+1)k} = \omega^k \omega^{2jk} = \omega^k \omega_{n/2}^{jk}.$$

If we plug both of these back into the expression for $\hat{x}_k$, we get

$$\hat{x}_k = \sum_{j=0}^{n/2-1} \omega_{n/2}^{jk} x_{2j} + \omega^k \sum_{j=0}^{n/2-1} \omega_{n/2}^{jk} x_{2j+1}.$$
This shows that the Fourier transform of the $n$-dimensional vector $x$ can be done by computing and combining the Fourier transforms of two $n/2$-dimensional vectors. One of these vectors contains the even indexed elements of $x$ and the other contains the odd indexed elements of $x$.
There is one more observation that we need. If $k$ is between $n/2$ and $n-1$, then we can simplify the above expression for $\hat{x}_k$. Specifically, if we set $k = k' + n/2$ with $0 \le k' < n/2$, then

$$\omega_{n/2}^{jk} = \omega_{n/2}^{jk'} \quad \text{and} \quad \omega^k = -\omega^{k'},$$

since $\omega_{n/2}^{n/2} = e^{-2\pi i} = 1$ and $\omega^{n/2} = e^{-\pi i} = -1$. If we plug this into the above formula for $\hat{x}_k$, then we get

$$\hat{x}_{k'+n/2} = \sum_{j=0}^{n/2-1} \omega_{n/2}^{jk'} x_{2j} - \omega^{k'} \sum_{j=0}^{n/2-1} \omega_{n/2}^{jk'} x_{2j+1},$$

for $0 \le k' < n/2$.
This reduces the computation since there are some computations we only have to do once but can use twice. The algebra we did above shows that we can compute the discrete Fourier transform of $x$ in the following steps.

- Form the vectors $x^{\text{even}} = (x_0, x_2, \ldots, x_{n-2})$ and $x^{\text{odd}} = (x_1, x_3, \ldots, x_{n-1})$.
- Calculate the discrete Fourier transforms of $x^{\text{even}}$ and $x^{\text{odd}}$. This gives two vectors $\hat{x}^{\text{even}}, \hat{x}^{\text{odd}} \in \mathbb{C}^{n/2}$.
- Compute $\hat{x}$ by,

$$\hat{x}_k = \hat{x}^{\text{even}}_k + \omega^k \hat{x}^{\text{odd}}_k \quad \text{and} \quad \hat{x}_{k+n/2} = \hat{x}^{\text{even}}_k - \omega^k \hat{x}^{\text{odd}}_k, \quad \text{for } 0 \le k < n/2.$$
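The steps above translate directly into a short recursive implementation. Here is a sketch (the function name is my own), assuming as above that the input length is a power of two:

```python
import numpy as np

def fft_recursive(x):
    """DFT of x via the even/odd split; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return x.astype(complex)
    # Step 1: split into even- and odd-indexed parts.
    # Step 2: transform each half recursively.
    even = fft_recursive(x[0::2])
    odd = fft_recursive(x[1::2])
    # Step 3: combine using omega^k = exp(-2*pi*i*k/n) for 0 <= k < n/2.
    w = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    return np.concatenate([even + w * odd, even - w * odd])

x = np.random.default_rng(1).standard_normal(16)
print(np.allclose(fft_recursive(x), np.fft.fft(x)))  # True
```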
The steps above break down computing $\hat{x} = F_n x$ into three simpler steps. Each of these simpler steps is also a linear map and can therefore be written as multiplication by a matrix. If we call these matrices $P_n$, $D_n$ and $B_n$, then,

- The matrix $P_n$ maps $x$ to $(x^{\text{even}}, x^{\text{odd}})$. That is

$$P_n x = \begin{bmatrix} x^{\text{even}} \\ x^{\text{odd}} \end{bmatrix}.$$

- The matrix $D_n$ is block-diagonal and each block is the $n/2$-dimensional discrete Fourier transformation matrix. That is

$$D_n = \begin{bmatrix} F_{n/2} & 0 \\ 0 & F_{n/2} \end{bmatrix},$$

where $F_{n/2}$ is given by $(F_{n/2})_{k,j} = \omega_{n/2}^{jk}$.

- The matrix $B_n$ combines two $n/2$-vectors according to the formula in step 3 above,

$$B_n = \begin{bmatrix} I_{n/2} & \Omega_{n/2} \\ I_{n/2} & -\Omega_{n/2} \end{bmatrix}, \quad \text{where } \Omega_{n/2} = \operatorname{diag}(1, \omega, \omega^2, \ldots, \omega^{n/2-1}).$$
The three-step procedure above for calculating $\hat{x}$ is equivalent to iteratively multiplying $x$ by $P_n$, then $D_n$ and then finally by $B_n$. Another way of saying this is that

$$F_n = B_n D_n P_n$$

is a matrix factorization!
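We can verify this factorization numerically. The sketch below builds the three matrices for one level of the recursion (the names stand for the even/odd permutation, the block-diagonal pair of half-size DFTs, and the combining "butterfly"; the helper `dft_matrix` is my own) and checks that their product really is $F_n$:

```python
import numpy as np

def dft_matrix(n):
    """The n-by-n DFT matrix with 0-indexed (k, j) entry omega**(j*k)."""
    omega = np.exp(-2j * np.pi / n)
    k, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return omega ** (j * k)

def fft_factors(n):
    """Return (B, D, P) with F_n = B @ D @ P, for n a power of two."""
    h = n // 2
    # P: permutation putting even-indexed entries first, odd-indexed last.
    P = np.eye(n)[np.r_[0:n:2, 1:n:2]]
    # D: block diagonal with two copies of the half-size DFT matrix.
    Z = np.zeros((h, h))
    D = np.block([[dft_matrix(h), Z], [Z, dft_matrix(h)]])
    # B: the butterfly [[I, W], [I, -W]] with W = diag(1, w, ..., w^(h-1)).
    W = np.diag(np.exp(-2j * np.pi * np.arange(h) / n))
    I = np.eye(h)
    B = np.block([[I, W], [I, -W]])
    return B, D, P

B, D, P = fft_factors(8)
print(np.allclose(B @ D @ P, dft_matrix(8)))  # True
```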
From the definitions, we can see that each of the matrices $P_n$, $D_n$ and $B_n$ has a lot of zeros. Specifically, $P_n$ has just $n$ non-zero entries (one in each row), $D_n$ has $n^2/2$ non-zero entries (two blocks of $(n/2)^2$ non-zeros) and $B_n$ has $2n$ non-zero entries (two in each row). This means that the product $B_n D_n P_n x$ can be computed in roughly $n^2/2$ operations. This is far away from the $n\log(n)$ operations promised at the start of this blog post. However, the matrix $D_n$ contains two copies of the $n/2 \times n/2$ discrete Fourier transform matrix. This means we could apply the exact same argument to the $n/2 \times n/2$ matrix $F_{n/2}$. This lets us replace $F_{n/2}$ with $B_{n/2} D_{n/2} P_{n/2}$. This brings the number of operations down to approximately

$$\frac{n^2}{4} + 6n.$$
We can again keep going! If we recall that $n = 2^N$, then we get to repeat the process $N = \log_2(n)$ times in total giving approximately,

$$3n\log_2(n)$$

operations.
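One way to sanity-check this count: each level of the recursion costs about $3n$ operations in total ($n$ for the permutations and $2n$ for the combining steps), so the overall cost satisfies the recurrence $T(1) = 0$, $T(n) = 2T(n/2) + 3n$, whose exact solution is $3n\log_2(n)$. A tiny sketch (with my own rough per-level constant) confirms this:

```python
def ops(n):
    """Operation count for the recursive factorization, assuming the
    permutation costs n and the butterfly costs 2n at each level."""
    return 0 if n == 1 else 2 * ops(n // 2) + 3 * n

print(ops(1024) == 3 * 1024 * 10)  # True: exactly 3 n log2(n)
```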
So by using lots of sparse matrices, we can compute $\hat{x} = F_n x$ in roughly $n\log(n)$ operations instead of $n^2$ operations.