I am again tutoring the course MATH3029. The content is largely the same but the name has changed from “Probability Modelling and Applications” to “Probability Theory and Applications” to better reflect the material taught. There was a good question on the first assignment that leads to some interesting mathematics.

The Question

The assignment question is as follows. Let $\Omega$ be a set and let $\mathcal{F} \subseteq \mathcal{P}(\Omega)$ be a $\sigma$-algebra on $\Omega$. Let $\mathbb{P} : \mathcal{F} \to [0,\infty)$ be a function with the following properties

1. $\mathbb{P}(\Omega) = 1$.
2. For any finite sequence of pairwise disjoint sets $(A_k)_{k=1}^n$ in $\mathcal{F}$, we have $\mathbb{P}\left(\bigcup_{k=1}^n A_k \right) = \sum_{k=1}^n \mathbb{P}(A_k)$.
3. If $(B_n)_{n=1}^\infty$ is a sequence of sets in $\mathcal{F}$ such that $B_{n+1} \subseteq B_n$ for all $n \in \mathbb{N}$ and $\bigcap_{n=1}^\infty A_n = \emptyset$, then, as $n$ tends to infinity, $\mathbb{P}(A_n) \to 0$.

Students were then asked to show that the function $\mathbb{P}$ is a probability measure on $(\Omega, \mathcal{F})$. This amounts to showing that $\mathbb{P}$ is countably additive. That is if $(A_k)_{k=1}^\infty$ is a sequence of pairwise disjoint sets, then $\mathbb{P}\left(\cup_{k=1}^\infty A_k\right) = \sum_{k=1}^\infty \mathbb{P}(A_k)$. One way to do this is define $B_n = \bigcup_{k=n+1}^\infty A_k$. Since the sets $(A_k)_{k=1}^\infty$ are pairwise disjoint, the sets $(B_n)_{n=1}^\infty$ satisfy the assumptions of the third property of $\mathbb{P}$. Thus we can conclude that $\mathbb{P}(B_n) \to 0$ as $n \to \infty$.

We also have that for every $n \in \mathbb{N}$ we have $\bigcup_{k=1}^\infty A_k = \left(\bigcup_{k=1}^n A_k\right) \cup B_n$. Thus by applying the second property of $\mathbb{P}$ twice we get

$\mathbb{P}\left(\bigcup_{k=1}^\infty A_k \right) = \mathbb{P}\left( \bigcup_{k=1}^n A_k\right) + \mathbb{P}(B_n) = \sum_{k=1}^n \mathbb{P}(A_k) + \mathbb{P}(B_n)$.

If we let $n$ tend to infinity, then we get the desired result.

A natural follow up question is whether all three of the assumptions in the question are necessary. It is particularly interesting to ask if there is an example of a function $\mathbb{P}$ that satisfies the first two properties but is not a probability measure. It turns out the answer is yes but coming up with an example involves some serious mathematics.

Let $\Omega$ be the set of natural numbers $\mathbb{N} = \{1,2,3,\ldots\}$ and let $\mathcal{F}$ be the power set of $\mathbb{N}$.

One way in which people talk about the size of a subset of natural numbers $A \subseteq \mathbb{N}$ is to look at the proportion of elements in $A$ and take a limit. That is we could define

$\mathbb{P}(A) = \lim_{n \to \infty}\frac{|\{k \in A \mid k \le n \}|}{n}.$

This function $\mathbb{P}$ has some nice properties for instance if $A$ is the set of all even numbers than $\mathbb{P}(A) = 1/2$. More generally if $A$ is the set of all numbers divisible by $k$, then $\mathbb{P}(A) = 1/k$. The function $\mathbb{P}$ gets used a lot. When people say that almost all natural numbers satisfy a property, they normally mean that if $A$ is the subset of all numbers satisfying the property, then $\mathbb{P}(A)=1$.

However the function $\mathbb{P}$ is not a probability measure. The function $\mathbb{P}$ is finitely additive. To see this, let $(A_i)_{i=1}^m$ be a finite collection of disjoint subsets of $\mathbb{N}$ and let $A = \bigcup_{i=1}^m A_i$. Then for every natural number $n$,

$\{k \in A \mid k \le n \} = \bigcup_{i=1}^m \{k \in A_i \mid k \le n\}$.

Since the sets $(A_i)_{i=1}^m$ are disjoint, the union on the right is a disjoint union. Thus we have

$\frac{|\{k \in A \mid k \le n \}|}{n} = \sum_{i=1}^m \frac{|\{k \in A_i \mid k \le n \}|}{n}$.

Taking limits on both sides gives $\mathbb{P}(A)=\sum_{i=1}^m \mathbb{P}(A_i)$, as required. Furthermore, the function $\mathbb{P}$ is not countably additive. For instance if we let $A_i = \{i\}$ for each $i \in \mathbb{N}$. Then $\bigcup_{i=1}^\infty A_i = \mathbb{N}$ and $\mathbb{P}(\mathbb{N})=1$. On the other hand $\mathbb{P}(A_i)=0$ for every $i \in \mathbb{N}$ and hence $\sum_{i=1}^\infty \mathbb{P}(A_i)=0\neq \mathbb{P}(\mathbb{N})$.

Thus it would appear that we have an example of a finitely additive measure that is not countably additive. However there is a big problem with the above definition of $\mathbb{P}$. Namely the limit of $\frac{|\{k \in A \mid k \le n \}|}{n}$ does not always exist. Consider the set $A = \{3,4,9,10,11,12,13,14,15,16,33,\ldots\}$, ie a number $k \in \mathbb{N}$ is in $A$ if and only if $2^{m} < k \le 2^{m+1}$ for some odd number $m\ge 1$. The idea with the set $A$ is that it looks a little bit like this:

There are chunks of numbers that alternate between being in $A$ and not being in $A$ and as we move further along, these chunks double in size. Let $a_n$ represent the sequence of numbers $\frac{|\{k \in A \mid k \le n \}|}{n}$. We can see that $a_n$ increases while $n$ is in a chunk that belongs to $A$ and decreases when $n$ is in a chunk not in $A$. More specifically if $2^{2m-1} < n \le 2^{2m}$, then $a_n$ is increasing but if $2^{2m} < n \le 2^{2m+1}$, then $a_n$ is decreasing.

At the turning points $n = 2^{2m}$ or $n = 2^{2m+1}$ we can calculate exactly what $a_n$ is equal to. Note that

$\{k \in A \mid k \le 2^{2m} \} = 2+8+32+\ldots+2\cdot 4^{m-1} = 2\cdot \frac{4^m-1}{4-1} = \frac{2}{3}(4^m-1)$.

Furthermore since there are no elements of $A$ between $2^{2m}$ and $2^{2m+1}$ we have

$\{k \in A \mid k \le 2^{2m+1}\} = \frac{2}{3}(4^m-1)$.

Thus we have

$a_{2^m} = \frac{\frac{2}{3}(4^m-1)}{2^{2m}}=\frac{2}{3}\frac{4^m-1}{4^m}$ and $a_{2m+1} = \frac{\frac{2}{3}(4^m-1)}{2^{2m+1}}=\frac{1}{3}\frac{4^m-1}{4^m}.$

Hence the values $a_n$ fluctuate between approaching $\frac{1}{3}$ and $\frac{2}{3}$. Thus the limit of $a_n$ does not exist and hence $\mathbb{P}$ is not well-defined.

There is a work around using something called a Banach limit. Banach limits are a way of extending the notion of a limit from the space of convergent sequences to the space of bounded sequences. Banach limits aren’t uniquely defined and don’t have a formula describing them. Indeed to prove the existence of Banach limits one has to rely on non-constructive mathematics such as the Hanh-Banach extension theorem. So if we take for granted the existence of Banach limits we can define

$\mathbb{P}(A) = L\left( \left(\frac{|\{k \in A \mid k \le n \}|}{n}\right)_{n=1}^\infty \right)$,

where $L$ is now a Banach limit. This new definition of $\mathbb{P}$ is defined on all subsets of $\mathbb{N}$ and is an example of measure that is finitely additive but not countably additive. However the definition of $\mathbb{P}$ is very non-constructive. Indeed there are models of ZF set theory where the Hanh-Banach theorem does not hold and we cannot prove the existence of Banach limits.

This begs the question of whether or not there exist constructible examples of measures that are finitely additive but not countably additive. A bit of Googling reveals that non-principal ultrafilters provide another way of defining non-countably additive measures. However the existence of a non-principal ultrafilter on $\mathbb{N}$ is again equivalent to a weak form of the axiom of choice. Thus it seems that the existence of a non-countably additive measure may be inherently non-constructive. This discussion on Math Overflow goes into more detail.