This post is inspired by an assignment question I had to answer for STATS 310A – a probability course at Stanford for first year students in the statistics PhD program. In the question we had to derive a few results about couplings. I found myself thinking and talking about the question long after submitting the assignment and decided to put my thoughts on paper. I would like to thank our lecturer Prof. Diaconis for answering my questions and pointing me in the right direction.
What are couplings?
Given two distribution functions $F$ and $G$ on $\mathbb{R}$, a coupling of $F$ and $G$ is a distribution function $H$ on $\mathbb{R}^2$ such that the marginals of $H$ are $F$ and $G$. Couplings can be used to give probabilistic proofs of analytic statements about $F$ and $G$ (see here). Couplings are also studied in their own right in the theory of optimal transport.
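We can think of $F$ and $G$ as being the cumulative distribution functions of some random variables $X$ and $Y$. A coupling $H$ of $F$ and $G$ thus corresponds to a random vector $(\tilde{X}, \tilde{Y})$ where $\tilde{X}$ has the same distribution as $X$, $\tilde{Y}$ has the same distribution as $Y$ and $(\tilde{X}, \tilde{Y}) \sim H$.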
The independent coupling
For two given distribution functions $F$ and $G$ there exist many possible couplings. For example, we could take $H = H_I$ where $H_I(x,y) = F(x)G(y)$. This coupling corresponds to a random vector $(X_I, Y_I)$ where $X_I$ and $Y_I$ are independent and (as is required for all couplings) $X_I \sim F$, $Y_I \sim G$.
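As a quick sanity check, the independent coupling can be simulated and its empirical joint distribution function compared with the product $F(x)G(y)$. The sketch below is only an illustration: the choices of an Exponential(1) distribution for $F$ and a Uniform(0, 2) distribution for $G$, and the helper names `sample_independent` and `empirical_joint_cdf`, are assumptions made for the example.

```python
import math
import random

def F(x):
    """CDF of Exponential(1) (an illustrative choice for F)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def G(y):
    """CDF of Uniform(0, 2) (an illustrative choice for G)."""
    return min(max(y / 2.0, 0.0), 1.0)

def sample_independent(n, seed=0):
    """Draw n pairs with independent coordinates, first ~ F and second ~ G."""
    rng = random.Random(seed)
    return [(rng.expovariate(1.0), rng.uniform(0.0, 2.0)) for _ in range(n)]

def empirical_joint_cdf(pairs, x, y):
    """Fraction of sampled pairs with first coordinate <= x and second <= y."""
    return sum(1 for a, b in pairs if a <= x and b <= y) / len(pairs)

pairs = sample_independent(100_000)
x, y = 1.0, 0.5
estimate = empirical_joint_cdf(pairs, x, y)
exact = F(x) * G(y)
print(f"empirical joint CDF at ({x}, {y}): {estimate:.3f}; F(x)G(y): {exact:.3f}")
```

With 100,000 samples the two numbers should agree to roughly two decimal places; the agreement is statistical, not exact.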
In some sense the coupling $H_I$ is in the “middle” of all couplings. This is because $X_I$ and $Y_I$ are independent and so $X_I$ doesn’t carry any information about $Y_I$. As the title of the post suggests, there are couplings where this isn’t the case and $\tilde{X}$ carries “as much information as possible” about $\tilde{Y}$.
The two extremal couplings
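Define two functions $H_U, H_L : \mathbb{R}^2 \to [0,1]$ by
$$H_U(x,y) = \min\{F(x), G(y)\}$$
and
$$H_L(x,y) = \max\{F(x) + G(y) - 1, 0\}.$$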
With some work, one can show that $H_U$ and $H_L$ are distribution functions on $\mathbb{R}^2$ and that they have the correct marginals. In this post I would like to talk about how to construct random vectors $(X_U, Y_U) \sim H_U$ and $(X_L, Y_L) \sim H_L$.
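Let $F^{-1}$ and $G^{-1}$ be the quantile functions of $F$ and $G$. That is, for $c \in (0,1)$,
$$F^{-1}(c) = \inf\{x \in \mathbb{R} : F(x) \ge c\}$$
and
$$G^{-1}(c) = \inf\{y \in \mathbb{R} : G(y) \ge c\}.$$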
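Quantile functions of this generalised-inverse kind are easiest to get a feel for on a distribution with jumps. The sketch below uses a hypothetical three-point distribution (mass 0.25 at 0, 0.5 at 1 and 0.25 at 3, chosen only for illustration) and checks, on a small grid, the standard fact that the quantile at level $c$ is at most $x$ exactly when $c$ is at most the distribution function at $x$.

```python
from bisect import bisect_left, bisect_right

values = [0, 1, 3]               # support points, in increasing order
cum_probs = [0.25, 0.75, 1.0]    # distribution function at each support point

def F(x):
    """Right-continuous step CDF of the toy distribution."""
    k = bisect_right(values, x)  # number of support points <= x
    return cum_probs[k - 1] if k > 0 else 0.0

def F_inv(c):
    """Generalised inverse inf{x : F(x) >= c}, for 0 < c <= 1."""
    return values[bisect_left(cum_probs, c)]

levels = [0.1, 0.25, 0.3, 0.75, 0.9, 1.0]
points = [-1, 0, 0.5, 1, 2, 3, 4]

# F_inv(c) <= x should hold exactly when c <= F(x).
equivalence_holds = all(
    (F_inv(c) <= x) == (c <= F(x)) for c in levels for x in points
)
print(F_inv(0.25), F_inv(0.3), equivalence_holds)
```

Note how the level 0.25 is already reached at the point 0, while any level slightly above 0.25 jumps to the next support point 1.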
Now let $V$ be a random variable that is uniformly distributed on $[0,1]$ and define
$$X_U = F^{-1}(V)$$
and
$$Y_U = G^{-1}(V).$$
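Since $F^{-1}(c) \le x$ if and only if $c \le F(x)$, we have $\mathbb{P}(X_U \le x) = \mathbb{P}(V \le F(x)) = F(x)$, so $X_U \sim F$, and likewise $Y_U \sim G$. Furthermore, $\{X_U \le x, Y_U \le y\}$ occurs if and only if $V \le F(x)$ and $V \le G(y)$, which is equivalent to $V \le \min\{F(x), G(y)\}$. Thus
$$\mathbb{P}(X_U \le x, Y_U \le y) = \mathbb{P}(V \le \min\{F(x), G(y)\}) = \min\{F(x), G(y)\}.$$
Hence $(X_U, Y_U)$ is distributed according to $H_U$. We see that under the coupling $H_U$, $X_U$ and $Y_U$ are closely related as they are both increasing functions of a common random variable $V$.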
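The construction with a single shared uniform variable is easy to simulate. The following is a minimal sketch, assuming an Exponential(1) distribution and a Uniform(0, 2) distribution as the two marginals (both chosen only because their quantile functions have closed forms); the helper names are hypothetical. It compares the empirical joint distribution function of the simulated pairs with the minimum of the two marginal distribution functions.

```python
import math
import random

def F(x):
    """CDF of Exponential(1) (an illustrative choice)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def G(y):
    """CDF of Uniform(0, 2) (an illustrative choice)."""
    return min(max(y / 2.0, 0.0), 1.0)

def F_inv(c):
    """Quantile function of Exponential(1), for c in [0, 1)."""
    return -math.log(1.0 - c)

def G_inv(c):
    """Quantile function of Uniform(0, 2)."""
    return 2.0 * c

def sample_upper(n, seed=0):
    """Draw n pairs (F_inv(V), G_inv(V)) driven by one shared uniform V."""
    rng = random.Random(seed)
    uniforms = [rng.random() for _ in range(n)]  # values in [0, 1)
    return [(F_inv(v), G_inv(v)) for v in uniforms]

def empirical_joint_cdf(pairs, x, y):
    """Fraction of sampled pairs with first coordinate <= x and second <= y."""
    return sum(1 for a, b in pairs if a <= x and b <= y) / len(pairs)

pairs = sample_upper(100_000)
max_gap = 0.0
for x in (0.2, 1.0, 2.5):
    for y in (0.5, 1.0, 1.8):
        gap = abs(empirical_joint_cdf(pairs, x, y) - min(F(x), G(y)))
        max_gap = max(max_gap, gap)
print(f"largest gap from min(F(x), G(y)) on the grid: {max_gap:.4f}")
```

Because both coordinates are increasing functions of the same uniform draw, sorting the pairs by the first coordinate also sorts them by the second.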
We can follow a similar construction for $H_L$. Define
$$X_L = F^{-1}(V)$$
and
$$Y_L = G^{-1}(1 - V).$$
Thus $X_L$ and $Y_L$ are again functions of a common random variable $V$, but $X_L$ is an increasing function of $V$ and $Y_L$ is a decreasing function of $V$. Note that $1 - V$ is also uniformly distributed on $[0,1]$. Thus $X_L \sim F$ and $Y_L \sim G$.
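Now $\{X_L \le x, Y_L \le y\}$ occurs if and only if $V \le F(x)$ and $1 - V \le G(y)$, which occurs if and only if $1 - G(y) \le V \le F(x)$. If $1 - G(y) \le F(x)$, then $F(x) + G(y) - 1 \ge 0$ and $\mathbb{P}(1 - G(y) \le V \le F(x)) = F(x) + G(y) - 1$. On the other hand, if $1 - G(y) > F(x)$, then $F(x) + G(y) - 1 < 0$ and $\mathbb{P}(1 - G(y) \le V \le F(x)) = 0$. Thus
$$\mathbb{P}(X_L \le x, Y_L \le y) = \max\{F(x) + G(y) - 1, 0\},$$
and so $(X_L, Y_L)$ is distributed according to $H_L$.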
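The same kind of simulation works when the second coordinate is driven by one minus the shared uniform variable. The sketch below again assumes Exponential(1) and Uniform(0, 2) marginals purely for illustration, with hypothetical helper names, and compares the empirical joint distribution function with $\max\{F(x) + G(y) - 1, 0\}$.

```python
import math
import random

def F(x):
    """CDF of Exponential(1) (an illustrative choice)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def G(y):
    """CDF of Uniform(0, 2) (an illustrative choice)."""
    return min(max(y / 2.0, 0.0), 1.0)

def F_inv(c):
    """Quantile function of Exponential(1), for c in [0, 1)."""
    return -math.log(1.0 - c)

def G_inv(c):
    """Quantile function of Uniform(0, 2)."""
    return 2.0 * c

def sample_lower(n, seed=0):
    """Draw n pairs (F_inv(V), G_inv(1 - V)) driven by one shared uniform V."""
    rng = random.Random(seed)
    uniforms = [rng.random() for _ in range(n)]  # values in [0, 1)
    return [(F_inv(v), G_inv(1.0 - v)) for v in uniforms]

def empirical_joint_cdf(pairs, x, y):
    """Fraction of sampled pairs with first coordinate <= x and second <= y."""
    return sum(1 for a, b in pairs if a <= x and b <= y) / len(pairs)

pairs = sample_lower(100_000)
max_gap = 0.0
for x in (0.2, 1.0, 2.5):
    for y in (0.5, 1.0, 1.8):
        target = max(F(x) + G(y) - 1.0, 0.0)
        gap = abs(empirical_joint_cdf(pairs, x, y) - target)
        max_gap = max(max_gap, gap)
print(f"largest gap from max(F(x) + G(y) - 1, 0) on the grid: {max_gap:.4f}")
```

At a point such as $(0.2, 0.5)$, where $F(x) + G(y) < 1$, no simulated pair lands in the lower-left quadrant at all, which matches the value $0$ of the formula.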
What makes $H_U$ and $H_L$ extreme?
Now that we know that $H_U$ and $H_L$ are indeed couplings, it is natural to ask what makes them “extreme”. What we would like to say is that $Y_U$ is an increasing function of $X_U$ and $Y_L$ is a decreasing function of $X_L$. Unfortunately this isn’t always the case, as can be seen by taking $X$ to be constant and $Y$ to be continuous: the second coordinate then cannot be a function of the first at all.
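However, the intuition that $Y_U$ is increasing in $X_U$ and $Y_L$ is decreasing in $X_L$ is close to correct. Given a coupling $(\tilde{X}, \tilde{Y}) \sim H$, we can look at the quantity
$$K_H(x,y) = \mathbb{P}(\tilde{Y} \le y \mid \tilde{X} \le x) - \mathbb{P}(\tilde{Y} \le y) = \frac{H(x,y)}{F(x)} - G(y),$$
defined whenever $F(x) > 0$. This quantity tells us something about how $\tilde{Y}$ changes with $\tilde{X}$. For instance, if $\tilde{X}$ and $\tilde{Y}$ were positively correlated, then $K_H$ would be positive, and if $\tilde{X}$ and $\tilde{Y}$ were negatively correlated, then $K_H$ would be negative.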
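For the independent coupling $H_I$, the quantity $K_{H_I}$ is constantly $0$. It turns out that, for every $x$ and $y$, $K_H(x,y)$ is maximised by the coupling $H_U$ and minimised by $H_L$, and it is in this sense that they are extremal. Since $F(x)$ and $G(y)$ do not depend on the coupling, this is the same as saying that $H_L(x,y) \le H(x,y) \le H_U(x,y)$ for every coupling $H$. This final claim is the two dimensional version of the Fréchet-Hoeffding Theorem and checking it is a good exercise.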
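Before attempting the exercise, the ordering of joint distribution functions behind the theorem can be spot-checked numerically. The sketch below is not a proof: it assumes Exponential(1) and Uniform(0, 2) marginals as an illustration and evaluates, on a grid, the lower bound $\max\{F(x) + G(y) - 1, 0\}$, the upper bound $\min\{F(x), G(y)\}$, the independent coupling, and one further coupling obtained as an equal-weight mixture of the two extremes (a convex combination of couplings is again a coupling). The function names are hypothetical.

```python
import math

def F(x):
    """CDF of Exponential(1) (an illustrative choice)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def G(y):
    """CDF of Uniform(0, 2) (an illustrative choice)."""
    return min(max(y / 2.0, 0.0), 1.0)

def H_upper(x, y):
    """Upper extremal coupling: min of the marginal CDFs."""
    return min(F(x), G(y))

def H_lower(x, y):
    """Lower extremal coupling: max(F + G - 1, 0)."""
    return max(F(x) + G(y) - 1.0, 0.0)

def H_indep(x, y):
    """Independent coupling: product of the marginal CDFs."""
    return F(x) * G(y)

def H_mix(x, y):
    """Equal-weight mixture of the two extremal couplings."""
    return 0.5 * H_upper(x, y) + 0.5 * H_lower(x, y)

grid_x = [0.1 * i for i in range(1, 51)]
grid_y = [0.05 * j for j in range(1, 41)]
tol = 1e-12  # slack for floating-point rounding

bounds_hold = all(
    H_lower(x, y) - tol <= H(x, y) <= H_upper(x, y) + tol
    for H in (H_indep, H_mix)
    for x in grid_x
    for y in grid_y
)
print(bounds_hold)
```

Dividing each joint distribution function by $F(x)$ and subtracting $G(y)$ preserves the ordering, so the same check applies to the conditional-probability form of the comparison.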