Chapter 4 Filtrations, Conditional Expectation, and Martingales

In this chapter, we extend the probability theory behind our pricing of contingent claims in the binomial model. This will allow us to put the important financial concepts on a proper mathematical foundation, and to prepare for Epiphany term. In continuous time, our intuition is less helpful, and we will need to approach the material using a rigorous mathematical theory.

4.1 The probability space

Remember that in the multi-period binomial model, we are working in a probability space \((\Omega, \mathcal{F}, \mathbb{P})\), in which:

\(\Omega = \Omega_T = \{0,1\}^T\) is the set of all strings of length \(T\) in which every character is a 0 or a 1;
\(\mathcal{F} = \mathcal{F}_T\) is the Borel \(\sigma\)-algebra on \(\Omega\), i.e., the power set \(\mathcal{F}_T = 2^{\Omega}\);
\(\mathbb{P}\) is a probability measure about which we have needed to know only that \(\mathbb{P}(\omega) > 0\) for every \(\omega \in \Omega\).

We can view this probability space as equivalent to a coin-toss space, in which we toss a sequence of coins and record a 1 each time a coin lands heads, and a 0 each time a coin lands tails. In case you’ve forgotten how equivalent measures work, the next subsection might be helpful:

4.1.1 Side note: Equivalent measures

Recall that two probability measures \(\mathbb{P}\) and \(\mathbb{Q}\) on the same space \((\Omega, \mathcal{F})\) are equivalent to each other if, for every event \(A \in \mathcal{F}\), \[ \mathbb{P}(A) > 0 \text{ if and only if } \mathbb{Q}(A) > 0. \] The measures do not need to assign the same probabilities to each event, but they should agree that either \(A\) is a possible event (both \(\mathbb{P}(A)\) and \(\mathbb{Q}(A)\) are strictly positive), or that \(A\) is an impossible event (both are zero).

In the context of the multi-period binomial model, we have two equivalent measures: \(\mathbb{P}\), the “objective measure”, representing our estimate of how likely a head is to occur in reality, and \(\mathbb{Q}\), the martingale measure that we use for pricing.

For any \(0 < p < 1\), we can define a probability measure \(\mathbb{P}\) on \(\Omega_T\) as follows.

Let \(\#\text{heads}(\omega)\) be the number of times that 1 appears in \(\omega = \omega_1 \dots \omega_T\), and let \(\#\text{tails}(\omega) = T - \#\text{heads}(\omega)\) be the number of 0s. Then
\[ \mathbb{P}(\omega) = p^{\#\text{heads}(\omega)} (1-p)^{\#\text{tails}(\omega)}, \] and for \(A \in \mathcal{F}\), \[ \mathbb{P}(A) = \sum_{\omega \in A} \mathbb{P}(\omega).\]

For a different \(0 < q < 1\), we can define a different measure \(\mathbb{Q}\) on \(\Omega_T\) with \[ \mathbb{Q}(\omega) = q^{\#\text{heads}(\omega)} (1-q)^{\#\text{tails}(\omega)}, \qquad \mathbb{Q}(A) = \sum_{\omega \in A} \mathbb{Q}(\omega).\]

Here \(\mathbb{P}\) and \(\mathbb{Q}\) are equivalent to each other as long as both \(0 < p < 1\) and \(0 < q < 1\).
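The definition of equivalence can be checked by brute force on a small coin-toss space. The following is a minimal sketch, assuming \(T = 3\) and the illustrative values \(p = \frac12\), \(q = \frac13\); the helper names (`coin_measure`, `prob`, `equivalent`) are hypothetical and not part of the notes. It also shows what goes wrong when the heads probability is allowed to equal 1.

```python
from fractions import Fraction
from itertools import product, combinations

T = 3  # number of coin tosses (illustrative choice)
OMEGA = ["".join(bits) for bits in product("01", repeat=T)]

def coin_measure(p):
    """Return {omega: probability} for independent tosses with P(heads) = p."""
    return {w: p ** w.count("1") * (1 - p) ** w.count("0") for w in OMEGA}

def prob(measure, event):
    """Probability of an event, i.e. of a collection of outcomes."""
    return sum(measure[w] for w in event)

def all_events():
    """Yield every subset of OMEGA (the power set has 2**(2**T) members)."""
    for k in range(len(OMEGA) + 1):
        yield from combinations(OMEGA, k)

def equivalent(m1, m2):
    """True if m1 and m2 agree on which events have positive probability."""
    return all((prob(m1, A) > 0) == (prob(m2, A) > 0) for A in all_events())

P = coin_measure(Fraction(1, 2))
Q = coin_measure(Fraction(1, 3))
R = coin_measure(Fraction(1))  # degenerate case: heads with certainty

print(equivalent(P, Q))  # True
print(equivalent(P, R))  # False
```

The degenerate measure `R` assigns probability zero to every outcome containing a tail, so it disagrees with `P` about which events are possible; this is why both \(p\) and \(q\) must lie strictly between 0 and 1.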

Remark. In the recombinant multi-period binomial model, the martingale measure \(\mathbb{Q}\) is defined exactly as in Example 4.2, and we set \(q = q_u\). On the other hand if \(u\) and \(d\) are not fixed throughout the model, then for each \(\omega \in \Omega_T\), \(\mathbb{Q}(\omega)\) becomes a product of \(q_u\)s and \(q_d\)s, which can be found by “multiplying along the branches of the tree”.

4.1.2 Back to probability spaces

Remember that in Chapter 3, we defined the share prices in terms of the random variables \(Z_t\): \[ S_t(\omega) = Z_t(\omega) S_{t-1}(\omega),\] where \(Z_1, Z_2, \dots, Z_T\) are random variables defined on \(\Omega\), with \[ Z_t(\omega) = \begin{cases} u & \text{if } \omega_t = 1, \\ d & \text{if } \omega_t = 0. \end{cases} \]

In the recombinant market with \(s=4\), \(T=3\), \(u=2\), and \(d=\frac12\), we have \[\begin{align*} S_0(\omega_1 \omega_2 \omega_3) & = 4, \quad \text{ for all } \omega \in \Omega_T; \\ S_1(\omega_1 \omega_2 \omega_3) & = \begin{cases} 8 & \text{if } \omega_1 = 1, \\ 2 & \text{if } \omega_1 = 0; \end{cases} \\ S_2(\omega_1 \omega_2 \omega_3) & = \begin{cases} 16 & \text{if }\omega_1=\omega_2=1,\\ 4& \text{if } \omega_1 \neq \omega_2,\\ 1& \text{if }\omega_1=\omega_2=0; \end{cases} \end{align*}\] and \[\begin{align*} S_3(\omega_1 \omega_2 \omega_3) & = \begin{cases} 32 & \text{if } \omega_1 = \omega_2 = \omega_3 = 1, \\ 8 & \text{if there are two 1s and one 0}, \\ 2 & \text{if there are two 0s and one 1}, \\ \frac12 & \text{if } \omega_1 = \omega_2 = \omega_3 = 0. \end{cases} \end{align*}\]
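The table of share prices in this market can be reproduced mechanically from the recursion \(S_t = Z_t S_{t-1}\). The following is a small sketch using exact fractions; the function name `share_price` is a hypothetical helper, not notation from the notes.

```python
from fractions import Fraction
from itertools import product

s, T = 4, 3
u, d = Fraction(2), Fraction(1, 2)

def share_price(omega, t):
    """S_t(omega): start at s and multiply by Z_k for k = 1, ..., t."""
    price = Fraction(s)
    for toss in omega[:t]:
        price *= u if toss == "1" else d  # Z_k = u on a head, d on a tail
    return price

# One row per outcome omega, listing S_0, S_1, S_2, S_3.
for bits in product("01", repeat=T):
    omega = "".join(bits)
    print(omega, [str(share_price(omega, t)) for t in range(T + 1)])
```

Only the first \(t\) characters of `omega` are read when computing \(S_t\), which is the observation behind the partial-information discussion that follows.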

In the spirit of simplifying the notation in this chapter, we are going to assume that \(u\) and \(d\) are fixed everywhere. All of the theory works in exactly the same way when they’re not, but we have to worry about which \(u\), or which \(q_u\), we’re using all the time.

Under this nice assumption, the \(Z\)s are independent and identically distributed random variables. They encapsulate all the randomness in the probability space \((\Omega_T, \mathcal{F}, \mathbb{P})\): the \(2^T\) elements of \(\Omega_T\) are in one-to-one correspondence with the \(2^T\) possible real-valued sequences \((z_1, z_2, \dots, z_T)\) describing the possible values of \((Z_1, Z_2, \dots, Z_T)\). In other words, if we know the values of all of \(Z_1, Z_2, \dots, Z_T\), then we know which state \(\omega \in \Omega_T\) we are in, and vice versa.

We say that \(\mathcal{F} = 2^{\Omega_T}\) is equal to the \(\sigma\)-algebra generated by the random variables \(Z_1, Z_2, \dots, Z_T\), and we write \(\mathcal{F} = \sigma(Z_1, Z_2, \dots, Z_T)\).

4.2 Partial information

If we only know about the first \(t\) coin tosses (that is, we know what \(Z_1, Z_2, \dots, Z_t\) are, but we do not know \(Z_{t+1}, \dots, Z_T\)), we can still say quite a lot about how the share prices evolve: we can already write down the values of \(S_0, S_1, \dots, S_t\). Intuitively, this seems perfectly reasonable, but before we can say anything probabilistically, we’d better sort out which space we’re working in.

We define a sequence of \(\sigma\)-algebras as follows:

Take \(\mathcal{F}_0=\{\emptyset, \Omega\}\).
Let \(A_1 = \{ \omega \in \Omega : \omega_1 = 1\}\), and let \(A_0 = \{ \omega \in \Omega: \omega_1 = 0\}\). Define \[ \mathcal{F}_1 = \{ \emptyset, A_1, A_0, \Omega\}.\]
Let \(A_{11} = \{ \omega \in \Omega: \omega_1 = \omega_2 = 1\}\), \(A_{10} = \{ \omega \in \Omega : \omega_1 = 1, \omega_2 = 0\}\), \(A_{01} = \{\omega \in \Omega : \omega_1 = 0, \omega_2 = 1\}\), and \(A_{00} = \{ \omega \in \Omega: \omega_1 = \omega_2 = 0\}\). Define \[ \mathcal{F}_2 = \sigma( A_{11}, A_{10}, A_{01}, A_{00}) \] as the smallest \(\sigma\)-algebra containing these four sets.
Continue in this way, defining \(2^t\) subsets which encapsulate the first \(t\) coin tosses, and building \(\mathcal{F}_t\) as the smallest \(\sigma\)-algebra containing these subsets, until we reach \(\mathcal{F}_T = 2^{\Omega}\).

Exercise 4.1

Construct \(\mathcal{F}_2\); what does \(\mathcal{F}_t\) look like?

Constructing these \(\sigma\)-algebras helps us to define what we mean by partial information. We say that we have observed a \(\sigma\)-algebra \(\mathcal{G}\) if, for every event \(A \in \mathcal{G}\), we know whether \(A\) has occurred, that is, if we know the value of the indicator function \(\mathbf{1}(A)\). Now at time \(t\), we know what the share price has been for every \(k \leq t\); that is, we know all of the values \(S_0, S_1, \dots, S_t\). So our \(\sigma\)-algebra \(\mathcal{F}_t\) encodes all of the information about the share prices up to time \(t\).

Remark. In this chapter, we move back and forth between thinking about having observed a \(\sigma\)-algebra, and having observed some random variables. Because \(\mathcal{F}_t\) encodes all of the randomness in the first \(t\) share prices, and nothing else, we can write \(\mathcal{F}_t = \sigma(S_0, S_1, \dots, S_t)\), or even \(\mathcal{F}_t = \sigma(Z_1, \dots, Z_t)\).

Observing the \(\sigma\)-algebra \(\mathcal{F}_t\) means knowing exactly which of the events in it have occurred, so when we say that we have observed \(\mathcal{F}_t\) we can deduce the values of \(S_1, \dots, S_t\). Similarly, we can go in the other direction: if we know all of the values \(S_1, \dots, S_t\), then for every event \(A \in \mathcal{F}_t\) we know the value of \(\mathbf{1}(A)\).
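On a finite sample space, a \(\sigma\)-algebra is determined by its atoms (the blocks of the partition that generates it), so two \(\sigma\)-algebras coincide exactly when their atoms do. The sketch below uses this to compare the partition given by the first \(t\) tosses with the partition given by the price history \((S_0, \dots, S_t)\), in the market with \(s=4\), \(u=2\), \(d=\frac12\), \(T=3\). The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

s, T = 4, 3
u, d = Fraction(2), Fraction(1, 2)
OMEGA = ["".join(bits) for bits in product("01", repeat=T)]

def share_price(omega, t):
    price = Fraction(s)
    for toss in omega[:t]:
        price *= u if toss == "1" else d
    return price

def partition_by(label):
    """Group outcomes that share a label; the groups are the atoms."""
    groups = {}
    for omega in OMEGA:
        groups.setdefault(label(omega), set()).add(omega)
    return {frozenset(group) for group in groups.values()}

def atoms_from_tosses(t):
    return partition_by(lambda omega: omega[:t])

def atoms_from_prices(t):
    return partition_by(
        lambda omega: tuple(share_price(omega, k) for k in range(t + 1))
    )

for t in range(T + 1):
    print(t, len(atoms_from_tosses(t)), atoms_from_tosses(t) == atoms_from_prices(t))

# Knowing S_2 alone is strictly less information than knowing (S_0, S_1, S_2):
print(len(partition_by(lambda omega: share_price(omega, 2))))  # 3
```

The last line illustrates why the whole history is needed: in a recombinant tree the single value \(S_2 = 4\) does not reveal whether the path was up-down or down-up.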

Definition 4.1 A filtration is a non-descending sequence of \(\sigma\)-algebras \[ \mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \dots \subseteq \mathcal{F}_n. \]

Filtrations help us to model partial information which is accumulating over time. We assume that we have a perfect memory and never forget anything that’s happened; we just keep increasing the amount of information we have.

Exercise 4.2

Prove (convince yourself) that the sequence of \(\sigma\)-algebras we have just constructed is a filtration: for each \(t\), \(\mathcal{F}_{t} \subset \mathcal{F}_{t+1}\).

Definition 4.2 A random variable \(X : \Omega \to \mathbb{R}\) is measurable with respect to a \(\sigma\)-algebra \(\mathcal{G}\) if for every Borel set \(B \in \mathcal{B}(\mathbb{R})\), \[ \{ \omega \in \Omega: X(\omega) \in B \} \in \mathcal{G}. \]

In other words, the random variable \(X\) is measurable with respect to \(\mathcal{G}\) if observing \(\mathcal{G}\) allows us to determine the value of \(X\).

We have already met some random variables which are measurable with respect to filtrations:

The share price \(S_t\) is always measurable with respect to \(\mathcal{F}_t\), because it only depends on the first \(t\) coin tosses.
The holdings in a portfolio at time \(t-1\), \(x_t\) and \(y_t\), are allowed to be random as long as they are measurable with respect to \(\mathcal{F}_{t-1}\).
A random variable \(T\) is a stopping time with respect to the filtration \((\mathcal{F}_t)\) if, for every \(t\), \(\{ T \leq t \} \in \mathcal{F}_{t}\).

The maximal value that the share price will take over the period \([0,T]\) is not measurable with respect to \(\mathcal{F}_t\) for any \(t<T\), because we will only know what the maximum is when we reach time \(T\).
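In the finite coin-toss setting, measurability with respect to \(\mathcal{F}_t\) amounts to being constant on each set of outcomes sharing the same first \(t\) tosses. The sketch below, which assumes the market with \(s=4\), \(u=2\), \(d=\frac12\), \(T=3\) and uses hypothetical helper names, applies this test to \(S_2\) and to the maximal share price.

```python
from fractions import Fraction
from itertools import product

s, T = 4, 3
u, d = Fraction(2), Fraction(1, 2)
OMEGA = ["".join(bits) for bits in product("01", repeat=T)]

def share_price(omega, t):
    price = Fraction(s)
    for toss in omega[:t]:
        price *= u if toss == "1" else d
    return price

def is_measurable(X, t):
    """X is F_t-measurable iff X takes a single value on each atom of F_t."""
    values_on_atom = {}
    for omega in OMEGA:
        values_on_atom.setdefault(omega[:t], set()).add(X(omega))
    return all(len(values) == 1 for values in values_on_atom.values())

def running_max(omega):
    """Maximum share price over times 0, 1, ..., T."""
    return max(share_price(omega, k) for k in range(T + 1))

# Entry t of each list answers: is the variable F_t-measurable?
print([is_measurable(lambda w: share_price(w, 2), t) for t in range(T + 1)])
# [False, False, True, True]
print([is_measurable(running_max, t) for t in range(T + 1)])
# [False, False, False, True]
```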

Since the share price \(S_t\) is measurable with respect to \(\mathcal{F}_t\), we sometimes write \(S_t(\omega_1 \omega_2 \dots \omega_t)\), or even \(S_t (\cdots\omega_{t})\), rather than \(S_t(\omega_1 \omega_2 \dots \omega_T)\).

In Example 4.3, we can write \(S_1(1) = 8\), or \(S_2(01) = 4\).

4.3 Conditional expectation

If you have forgotten how conditional expectation with respect to a random variable works, go and check the Prerequisite notes then come back here.

Definition 4.3 Given a sub-\(\sigma\)-algebra \(\mathcal{G} \subseteq \mathcal{F}\) and a random variable \(X\) on \(\Omega_T\) such that \(\mathbb{E}_{\mathbb{Q}}[\vert X \vert] < \infty\), the conditional expectation of \(X\) with respect to \(\mathcal{G}\) is the unique random variable \(\xi\) which satisfies:

\(\mathbb{E}_{\mathbb{Q}}[\vert \xi \vert] < \infty\).
\(\xi\) is measurable with respect to \(\mathcal{G}\).
For every event \(A \in \mathcal{G}\), \(\mathbb{E}_{\mathbb{Q}} \big[ \xi \mathbf{1}(A) \big] = \mathbb{E}_{\mathbb{Q}}[X \mathbf{1}(A)]\).

We denote \(\xi = \mathbb{E}_{\mathbb{Q}}[X \vert \mathcal{G}]\).

We can think of the conditional expectation with respect to a sub-\(\sigma\)-algebra as “the expectation of \(X\), given the partial information contained in \(\mathcal{G}\)”. If we use the sub-\(\sigma\)-algebra generated by a random variable \(Y\), \(\mathcal{G} = \sigma(Y)\), we will rediscover our original Probability-1 definition of conditional expectation (\(\mathbb{E}[X\vert Y] = g(Y)\), and so on).

If \(X_T = \Phi(S_1, \dots, S_T)\) then the conditional expectation \(\mathbb{E}_{\mathbb{Q}}[X_T | \mathcal{F}_t]\) can be thought of as the expected value of \(X_T\) if we fix the outcomes of \(S_1, \dots, S_t\) and average over the remaining randomness that determines \(S_{t+1}, \dots, S_T\), i.e., over \(Z_{t+1}, \dots, Z_T\).

This means that, for \(\omega \in \Omega_T\), we have \[\begin{align} \mathbb{E}_{\mathbb{Q}} [ X_T \vert \mathcal{F}_{t} ] (\cdots \omega_t) = q_u \mathbb{E}_{\mathbb{Q}} [X_T \vert \mathcal{F}_{t+1} ] (\cdots \omega_t 1) + q_d \mathbb{E}_{\mathbb{Q}} [X_T \vert \mathcal{F}_{t+1} ] (\cdots \omega_t 0). \end{align}\]
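The one-step recursion above translates directly into a recursive function on prefixes \(\omega_1 \cdots \omega_t\). The following is a minimal sketch, assuming fixed \(q_u = \frac13\) and \(q_d = \frac23\) and \(T = 3\) purely for illustration; `cond_exp` and `heads` are hypothetical names.

```python
from fractions import Fraction

T = 3
q_u, q_d = Fraction(1, 3), Fraction(2, 3)  # illustrative values

def cond_exp(X, prefix):
    """E_Q[X | F_t] on the atom whose first t = len(prefix) tosses are `prefix`."""
    if len(prefix) == T:
        return Fraction(X(prefix))  # F_T reveals omega, so E[X | F_T] = X
    # Average over the next toss: a head with weight q_u, a tail with weight q_d.
    return q_u * cond_exp(X, prefix + "1") + q_d * cond_exp(X, prefix + "0")

def heads(omega):
    """Number of heads in the full outcome omega."""
    return omega.count("1")

print(cond_exp(heads, ""))   # 1, the unconditional mean 3 * q_u
print(cond_exp(heads, "1"))  # 5/3, one head seen plus 2 * q_u expected
```

Calling the function with the empty prefix corresponds to conditioning on \(\mathcal{F}_0\), which gives the ordinary expectation.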

In the recombinant market with \(s=4\), \(T=3\), \(u=2\), and \(d=\frac12\) (the same market as in Example 4.3), and with interest rate \(r=0\), we have \[ q_u = \frac{1-\frac{1}{2}}{2-\frac{1}{2}} = \frac{1}{3}, \quad q_d = \frac{2 - 1}{2 - \frac{1}{2}} = \frac{2}{3}. \]

Let’s calculate the values of \(\mathbb{E}_{\mathbb{Q}}[S_2 \vert \mathcal{F}_1]\) and \(\mathbb{E}_{\mathbb{Q}} [ S_3 \vert \mathcal{F}_1]\). The two atoms of \(\mathcal{F}_1\) are the events \(\{\omega_1 = 1\}\) and \(\{\omega_1 = 0\}\), so we need to do two sets of calculations. \[\begin{align} \mathbb{E}_{\mathbb{Q}} [ S_2 \vert \mathcal{F}_1](1) &= 16 \times \frac{1}{3} + 4 \times \frac{2}{3} = 8, \\ \mathbb{E}_{\mathbb{Q}} [ S_2 \vert \mathcal{F}_1](0) &= 4 \times \frac{1}{3} + 1 \times \frac{2}{3} = 2. \end{align}\] Here \(\mathbb{E}_{\mathbb{Q}} [ S_2 \vert \mathcal{F}_1]\) is a random variable defined on \(\Omega_3\) which takes value \(8\) whenever \(\omega_1 = 1\), and \(2\) whenever \(\omega_1 =0\). We see that, in both cases, \(\mathbb{E}_{\mathbb{Q}} [ S_2 \vert \mathcal{F}_1] = S_1\).

Next, \[\begin{align} \mathbb{E}_{\mathbb{Q}} [ S_3 \vert \mathcal{F}_1](1) &= 32 \times \left(\frac{1}{3}\right)^2 + 8 \times 2 \times \frac{1}{3} \times \frac{2}{3} + 2 \times \left(\frac{2}{3} \right)^2 = 8, \\ \mathbb{E}_{\mathbb{Q}} [ S_3 \vert \mathcal{F}_1](0) &= 8 \times \left(\frac{1}{3}\right)^2 + 2 \times 2 \times \frac{1}{3} \times \frac{2}{3} + \frac{1}{2} \times \left(\frac{2}{3} \right)^2 = 2. \end{align}\] Once again, we have \(\mathbb{E}_{\mathbb{Q}} [S_3 \vert \mathcal{F}_1] = S_1\).
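These hand calculations can be cross-checked by averaging over each atom of \(\mathcal{F}_1\) with weights given by \(\mathbb{Q}\). The sketch below assumes the same market (\(s=4\), \(u=2\), \(d=\frac12\), \(q_u=\frac13\), \(q_d=\frac23\)); the names `share_price` and `cond_exp` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

s, T = 4, 3
u, d = Fraction(2), Fraction(1, 2)
q_u, q_d = Fraction(1, 3), Fraction(2, 3)
OMEGA = ["".join(bits) for bits in product("01", repeat=T)]
Q = {w: q_u ** w.count("1") * q_d ** w.count("0") for w in OMEGA}

def share_price(omega, t):
    price = Fraction(s)
    for toss in omega[:t]:
        price *= u if toss == "1" else d
    return price

def cond_exp(X, t):
    """Return E_Q[X | F_t] as a dict omega -> value (constant on atoms of F_t)."""
    result = {}
    for omega in OMEGA:
        atom = [w for w in OMEGA if w[:t] == omega[:t]]
        mass = sum(Q[w] for w in atom)
        result[omega] = sum(X(w) * Q[w] for w in atom) / mass  # Q-weighted average
    return result

for n in (2, 3):
    xi = cond_exp(lambda w: share_price(w, n), 1)
    print(n, xi["100"], xi["011"])
# 2 8 2
# 3 8 2
```

In both cases the values on the atoms \(\{\omega_1 = 1\}\) and \(\{\omega_1 = 0\}\) are 8 and 2, matching \(S_1\).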

Important Properties of Conditional Expectations

Linearity. For constants \(a_1,a_2\), we have \[ \mathbb{E}[a_1X+a_2Y\vert\mathcal{F}_t]=a_1\mathbb{E}[X\vert\mathcal{F}_t]+a_2\mathbb{E}[Y\vert\mathcal{F}_t]. \]
Taking out what is known. If \(X\) depends only on the first \(t\) coin flips, then \[ \mathbb{E}[XY\vert\mathcal{F}_t]=X\cdot\mathbb{E}[Y\vert\mathcal{F}_t]. \]
Iterated conditioning. If \(s \leq t\) then \[ \mathbb{E}[\mathbb{E}[X\vert\mathcal{F}_t]\vert\mathcal{F}_s]=\mathbb{E}[X\vert\mathcal{F}_s]. \]
Independence. If \(X\) depends only on coin tosses \(t+1\) to \(T\), then \[ \mathbb{E}[X\vert\mathcal{F}_t]=\mathbb{E}[X]. \]
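Each of the four properties can be verified exactly on a small example. The sketch below assumes \(T = 3\) and the illustrative measure with \(q_u = \frac13\); the random variables `first`, `last` and `heads` and the helpers `cond_exp` and `same` are hypothetical choices made for the check, and a numerical check on one example is of course not a proof.

```python
from fractions import Fraction
from itertools import product

T = 3
q_u, q_d = Fraction(1, 3), Fraction(2, 3)  # illustrative values
OMEGA = ["".join(bits) for bits in product("01", repeat=T)]
Q = {w: q_u ** w.count("1") * q_d ** w.count("0") for w in OMEGA}

def cond_exp(X, t):
    """Return the random variable E_Q[X | F_t] as a function of omega."""
    def xi(omega):
        atom = [w for w in OMEGA if w[:t] == omega[:t]]
        return sum(X(w) * Q[w] for w in atom) / sum(Q[w] for w in atom)
    return xi

def same(X, Y):
    """True if the random variables X and Y agree at every outcome."""
    return all(X(w) == Y(w) for w in OMEGA)

def first(w):   # depends only on toss 1
    return int(w[0])

def last(w):    # depends only on toss T
    return int(w[-1])

def heads(w):   # depends on all tosses
    return w.count("1")

linearity = same(
    cond_exp(lambda w: 2 * first(w) + 3 * heads(w), 1),
    lambda w: 2 * cond_exp(first, 1)(w) + 3 * cond_exp(heads, 1)(w),
)
taking_out = same(
    cond_exp(lambda w: first(w) * heads(w), 1),
    lambda w: first(w) * cond_exp(heads, 1)(w),
)
tower = same(cond_exp(cond_exp(heads, 2), 1), cond_exp(heads, 1))
independence = same(cond_exp(last, 2), cond_exp(last, 0))

print(linearity, taking_out, tower, independence)  # True True True True
```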

4.4 Martingales

Definition 4.4 A sequence of integrable random variables \(Y_0, Y_1, \dots, Y_T\) is called a martingale under the measure \(\mathbb{Q}\) if for each \(t\), the value of \(Y_t\) depends on the outcome of the first \(t\) coin flips (we say the sequence is adapted to the filtration) and \[ \mathbb{E}_{\mathbb{Q}}[Y_{t+1}\vert\mathcal{F}_t]=Y_t, \quad t=0, 1, \dots, T-1. \]

Martingales arose out of the study of gambling models. If \(Y_0, Y_1, \dots, Y_t\) represent the fortune of a player, then \(\mathbb{E}_{\mathbb{Q}}[Y_{t+1} \vert \mathcal{F}_t]\) represents the amount of money the player can expect to have after the next game, given their position and knowledge at the end of game \(t\). The player will consider the game to be fair if \(\mathbb{E}_{\mathbb{Q}}[Y_{t+1}\vert\mathcal{F}_t]=Y_t\); that is, if the sequence is a martingale.

Here are some simple properties of martingales; in the first two, \(X_0, X_1, \dots, X_T\) is a martingale under \(\mathbb{Q}\):

For any \(t\), we have \(\mathbb{E}_{\mathbb{Q}}[X_t] = \mathbb{E}_{\mathbb{Q}}[X_0]\), because \[ \mathbb{E}_{\mathbb{Q}}[X_t] = \mathbb{E}_{\mathbb{Q}} \bigg[ \mathbb{E}_{\mathbb{Q}}[X_t \vert \mathcal{F}_{t-1}] \bigg] = \mathbb{E}_{\mathbb{Q}}[X_{t-1}] \text{ for all } t \geq 1.\]
Whenever \(m > n\), \[\mathbb{E}_{\mathbb{Q}}[X_m \vert \mathcal{F}_n] = X_n.\]
Given a filtration \((\mathcal{F}_t)_{t \leq T}\) and a random variable \(Z\) which is measurable with respect to \(\mathcal{F}_T\), the sequence \[X_t = \mathbb{E}_{\mathbb{Q}}[Z \vert \mathcal{F}_t] \] is always a martingale with respect to \((\mathcal{F}_t)_{t \leq T}\).
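The martingale condition is easy to test exhaustively on a small coin-toss space. The sketch below assumes \(T = 3\) and \(q_u = \frac13\) for illustration, and checks three adapted sequences: the sequence \(\mathbb{E}_{\mathbb{Q}}[Z \vert \mathcal{F}_t]\) built from a terminal variable (here, hypothetically, the squared number of heads), the running head count, and the head count with its drift \(q_u t\) subtracted. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

T = 3
q_u, q_d = Fraction(1, 3), Fraction(2, 3)  # illustrative values
OMEGA = ["".join(bits) for bits in product("01", repeat=T)]
Q = {w: q_u ** w.count("1") * q_d ** w.count("0") for w in OMEGA}

def cond_exp(X, t):
    """Return the random variable E_Q[X | F_t] as a function of omega."""
    def xi(omega):
        atom = [w for w in OMEGA if w[:t] == omega[:t]]
        return sum(X(w) * Q[w] for w in atom) / sum(Q[w] for w in atom)
    return xi

def is_martingale(Y):
    """Check E_Q[Y_{t+1} | F_t] = Y_t for t = 0, ..., T-1; Y(t, omega) is adapted."""
    for t in range(T):
        next_given_now = cond_exp(lambda w: Y(t + 1, w), t)
        if any(next_given_now(w) != Y(t, w) for w in OMEGA):
            return False
    return True

def doob(t, omega):
    """X_t = E_Q[Z | F_t] for the terminal variable Z = (number of heads)^2."""
    return cond_exp(lambda w: w.count("1") ** 2, t)(omega)

def head_count(t, omega):
    """Number of heads among the first t tosses."""
    return omega[:t].count("1")

def compensated(t, omega):
    """Head count minus its expected growth q_u per toss."""
    return head_count(t, omega) - q_u * t

print(is_martingale(doob), is_martingale(head_count), is_martingale(compensated))
# True False True
```

The head count alone drifts upwards by \(q_u\) per toss, so it fails the fairness condition; removing that drift restores it.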

4.4.1 Martingales in the multi-period binomial model

Theorem 4.1 The sequence of discounted stock prices \[ \frac{S_t}{(1+r)^t}, \quad t=0, 1, 2, \dots, T, \] is a martingale under the risk-neutral measure \(\mathbb{Q}\).

Remark. The converse of this statement is also true in the multi-period binomial model. That is to say, the martingale measure \(\mathbb{Q}\) in an arbitrage-free and complete multi-period binomial model is determined by the property that \(\frac{S_t}{(1+r)^t}\) forms a martingale sequence under \(\mathbb{Q}\).

Proof. The discounted stock prices are adapted to the filtration, because the sequence \(S_0, S_1, \dots, S_T\) is adapted to the filtration and each discount factor \((1+r)^{-t}\) is a constant.

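For the conditional expectation, we have \[ \begin{split} \mathbb{E}_{\mathbb{Q}}\bigg[\frac{S_{t+1}}{(1+r)^{t+1}}\vert\mathcal{F}_t\bigg](\omega_1\cdots \omega_t)&= q_{u }\frac{S_{t+1}(\omega_1\cdots \omega_t 1)}{(1+r)^{t+1}}+q_d\frac{S_{t+1}(\omega_1\cdots \omega_t 0)}{(1+r)^{t+1}} \\ &=\frac{S_{t}(\omega_1\cdots \omega_t)}{(1+r)^{t+1}}[q_{u }u +q_dd] \\ &=\frac{S_{t}(\omega_1\cdots \omega_t)}{(1+r)^t}. \end{split} \] The second equality uses \(S_{t+1} = Z_{t+1} S_t\), and the third uses \(q_u u + q_d d = 1+r\), which follows directly from the definitions \(q_u = \frac{1+r-d}{u-d}\) and \(q_d = \frac{u-(1+r)}{u-d}\).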
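The one-step identity in this proof can be checked at every node of a small tree. The sketch below assumes \(s=4\), \(u=2\), \(d=\frac12\), \(T=3\), and an illustrative interest rate \(r=\frac14\) (chosen only so that discounting is visible); the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

s, T = 4, 3
u, d, r = Fraction(2), Fraction(1, 2), Fraction(1, 4)  # r is an illustrative choice
q_u = (1 + r - d) / (u - d)
q_d = (u - (1 + r)) / (u - d)

def share_price(prefix):
    """S_t on the node reached by the tosses in `prefix`, where t = len(prefix)."""
    price = Fraction(s)
    for toss in prefix:
        price *= u if toss == "1" else d
    return price

def discounted(prefix):
    """S_t / (1 + r)^t on the node `prefix`."""
    return share_price(prefix) / (1 + r) ** len(prefix)

def one_step_ok(prefix):
    """Check E_Q[S_{t+1}/(1+r)^{t+1} | F_t] = S_t/(1+r)^t at the node `prefix`."""
    expected = q_u * discounted(prefix + "1") + q_d * discounted(prefix + "0")
    return expected == discounted(prefix)

# All nodes at times t = 0, 1, ..., T-1 (the empty string is the root).
nodes = ["".join(bits) for t in range(T) for bits in product("01", repeat=t)]

print(q_u, q_d, q_u * u + q_d * d == 1 + r)  # 1/2 1/2 True
print(all(one_step_ok(node) for node in nodes))  # True
```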

Theorem 4.2 For any self-financing portfolio in which \((x_{t+1}, y_{t+1})\) is always measurable with respect to \(\mathcal{F}_t\), the discounted value process \[ \frac{V_t}{(1+r)^t},\quad t=0, 1, \dots, T, \] is a martingale under the risk-neutral measure.

Remark. For this proof, we need to rewrite the self-financing condition in a new form: the wealth equation. The definition of the value, \(V_t = x_{t+1}B_t + y_{t+1}S_t\), and the self-financing condition \(V_{t+1} = x_{t+1} B_{t+1} + y_{t+1}S_{t+1}\), together imply that \[ V_{t+1}=y_{t+1}S_{t+1}+(1+r)(V_t-y_{t+1}S_t). \]
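The step from these two identities to the wealth equation can be spelled out as follows, assuming (as in the binomial model) that the bond grows deterministically, \(B_{t+1} = (1+r)B_t\) with \(B_t > 0\). Solving the definition of \(V_t\) for the bond holding gives \[ x_{t+1} = \frac{V_t - y_{t+1}S_t}{B_t}, \] and substituting this into the self-financing condition yields \[ V_{t+1} = \frac{V_t - y_{t+1}S_t}{B_t}\,(1+r)B_t + y_{t+1}S_{t+1} = y_{t+1}S_{t+1} + (1+r)(V_t - y_{t+1}S_t). \]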

Proof. First, we check that \(\frac{V_t}{(1+r)^t}\) is adapted to the filtration; since \(V_t = x_{t+1}B_t + y_{t+1}S_t\) and all parts of this are measurable with respect to \(\mathcal{F}_t\), we can continue.

For the conditional expectation, we have \[ \begin{split} \mathbb{E}_{\mathbb{Q}}\bigg[\frac{V_{t+1}}{(1+r)^{t+1}}\vert\mathcal{F}_t\bigg] &=\mathbb{E}_{\mathbb{Q}} \bigg[\frac{y_{t+1}S_{t+1}+(1+r)(V_t-y_{t+1}S_t)}{(1+r)^{t+1}}\vert\mathcal{F}_t\bigg] \\ \text{(linearity)}\to\quad &=\mathbb{E}_{\mathbb{Q}} \bigg[\frac{y_{t+1}S_{t+1}}{(1+r)^{t+1}}\vert\mathcal{F}_t\bigg]+\mathbb{E}_{\mathbb{Q}} \bigg[\frac{(1+r)(V_t-y_{t+1}S_t)}{(1+r)^{t+1}}\vert\mathcal{F}_t\bigg] \\ \text{(taking out what is known)}\to\quad &=y_{t+1}\mathbb{E}_{\mathbb{Q}} \bigg[\frac{S_{t+1}}{(1+r)^{t+1}}\vert\mathcal{F}_t\bigg]+\frac{V_t-y_{t+1}S_t}{(1+r)^t} \\ &=y_{t+1}\frac{S_{t}}{(1+r)^{t}}+\frac{V_t-y_{t+1}S_t}{(1+r)^t}\\ &=\frac{V_t}{(1+r)^t}, \end{split} \] showing that \(V_t/(1+r)^t\) is a martingale.

This proves the correctness of the risk-neutral valuation formula for pricing contingent claims: \[ V_t=\frac{1}{(1+r)^{T-t}}\mathbb{E}_{\mathbb{Q}}[V_T\vert\mathcal{F}_t]. \]
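As an illustration of the valuation formula, the sketch below prices a claim by backward induction and compares the result with the direct discounted expectation. The market parameters (\(s=4\), \(u=2\), \(d=\frac12\), \(T=3\), \(r=\frac14\)) and the claim itself (a European call with strike \(K=5\)) are hypothetical choices made for this example only.

```python
from fractions import Fraction
from itertools import product

s, T = 4, 3
u, d, r = Fraction(2), Fraction(1, 2), Fraction(1, 4)  # illustrative parameters
q_u = (1 + r - d) / (u - d)
q_d = (u - (1 + r)) / (u - d)
K = 5  # hypothetical strike

def share_price(prefix):
    price = Fraction(s)
    for toss in prefix:
        price *= u if toss == "1" else d
    return price

def payoff(omega):
    """Hypothetical claim: European call, V_T = max(S_T - K, 0)."""
    return max(share_price(omega) - K, Fraction(0))

def value(prefix):
    """V_t on the node `prefix` (t = len(prefix)), by one-step backward induction."""
    if len(prefix) == T:
        return payoff(prefix)
    return (q_u * value(prefix + "1") + q_d * value(prefix + "0")) / (1 + r)

# Direct use of the formula at t = 0: discount the Q-expectation of V_T.
direct = sum(
    q_u ** w.count("1") * q_d ** w.count("0") * payoff(w)
    for w in ("".join(bits) for bits in product("01", repeat=T))
) / (1 + r) ** T

print(value(""), direct)  # 288/125 288/125
```

The two numbers agree because iterating the one-step martingale property is the same as conditioning once on \(\mathcal{F}_0\).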

We finish this section with a version of the First Fundamental Theorem for the multi-period binomial model. We require a definition of arbitrage on the multi-period binomial model.

Definition 4.5 A portfolio \(h \equiv \big( h_t = (x_t,y_t), t=0,1,\dots,T+1 \big)\) on the multi-period binomial model \(\mathcal{M} = (B_t, S_t)\) is an arbitrage portfolio if it is self-financing and its value process \(V^h_t = x_{t+1} B_t + y_{t+1} S_t\) satisfies: \[ V_0^h=0, \quad \mathbb{P}(V_T^h\geq 0)=1, \quad \mathbb{P}(V_T^h>0)>0. \]

Theorem 4.3 The following conditions are equivalent for a multi-period binomial model \(\mathcal{M} = (B_t, S_t)\), \(t=0,1,\ldots, T\), with interest rate \(r\).

The model is arbitrage-free according to Definition 4.5.
The condition \(d < 1+r < u\) holds, where \(d < u\) are the two possible values of \(Z_t = S_t/S_{t-1}\) at each time \(t\). (\(Z_t\) equals \(u\) with probability \(p\) and \(d\) with probability \(1-p\) for some \(0<p<1\)).
There is a measure \(\mathbb{Q}\) defined by \[ q_u = \frac{1+r - d}{u - d}, \quad q_d = \frac{u - (1+r)}{u-d} \] at each node of the tree, such that \(\frac{S_t}{(1+r)^t}\) is a martingale under \(\mathbb{Q}\).

Proof. We have done most of the work needed to prove this theorem. Let us show the implications \((1) \Rightarrow (2) \Rightarrow (3) \Rightarrow (1)\).

\((1) \Rightarrow (2)\): Consider the number of periods \(T\) in the model. If \(T=1\) then the implication holds by Theorem 2.1. For \(T > 1\), the 1-period model obtained by observing the market from \(t=0\) to \(t=1\) has no arbitrage, and so \(d < 1+r < u\) by Theorem 2.1.
\((2) \Rightarrow (3)\): This implication is Theorem 4.1 above.
\((3) \Rightarrow (1)\): Suppose \(h_t = (x_t,y_t)\) is a self-financing portfolio that satisfies the conditions \(\mathbb{P}(V^h_0 = 0) =1\) and \(\mathbb{P}(V^h_T \geq 0) = 1\). Since the measure \(\mathbb{Q}\) is equivalent to \(\mathbb{P}\), it follows that \(\mathbb{Q}(V^h_0 = 0) =1\) and \(\mathbb{Q}(V^h_T \geq 0) = 1\). Now, by Theorem 4.2, \(\frac{V^h_t}{(1+r)^t}\) is a martingale under \(\mathbb{Q}\), and so in particular \[\mathbb{E}_{\mathbb{Q}} \left [\frac{V^h_T}{(1+r)^T}\right ] = \mathbb{E}_{\mathbb{Q}} [V^h_0] = 0.\] This shows that \(\mathbb{E}_{\mathbb{Q}}[V^h_T] = 0\), and thus \(V^h_T\) is a non-negative random variable with mean 0 under \(\mathbb{Q}\), which implies that \(V^h_T\) is zero \(\mathbb{Q}\)-almost surely: \(\mathbb{Q}(V^h_T > 0) = 0\). Consequently, by equivalence again, \(\mathbb{P}(V^h_T > 0) = 0\) as well, so \(h\) is not an arbitrage portfolio.
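The link between condition (2) and the measure in condition (3) can be seen numerically: the formulas for \(q_u\) and \(q_d\) always sum to 1, but they give probabilities strictly between 0 and 1 (so that \(\mathbb{Q}\) is equivalent to \(\mathbb{P}\)) exactly when \(d < 1+r < u\). The sketch below uses \(u=2\), \(d=\frac12\) and a few illustrative interest rates; the function names are hypothetical.

```python
from fractions import Fraction

def risk_neutral_probabilities(u, d, r):
    """The candidate one-step probabilities from condition (3)."""
    q_u = (1 + r - d) / (u - d)
    q_d = (u - (1 + r)) / (u - d)
    return q_u, q_d

def satisfies_condition_2(u, d, r):
    """Condition (2): d < 1 + r < u."""
    return d < 1 + r < u

def is_strict_probability(q_u, q_d):
    """True if both weights lie strictly between 0 and 1."""
    return 0 < q_u < 1 and 0 < q_d < 1

u, d = Fraction(2), Fraction(1, 2)
rates = [Fraction(0), Fraction(1, 4), Fraction(1), Fraction(3, 2)]  # illustrative

for r in rates:
    q_u, q_d = risk_neutral_probabilities(u, d, r)
    print(r, q_u, q_d, satisfies_condition_2(u, d, r), is_strict_probability(q_u, q_d))
```

For \(r = 1\) the formula gives \(q_u = 1\) and \(q_d = 0\), and for \(r = \frac32\) it gives a negative \(q_d\); in both cases the last two columns are `False` together.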